Tug-of-war Model for Competitive Multi-armed Bandit Problem: Amoeba-inspired Algorithm for Cognitive Medium Access

Song-Ju Kim; Masashi Aono; Etsushi Nameda; Masahiko Hara

Summary

Proceedings of the 2012 International Symposium on Nonlinear Theory and its Applications

2012

Session Number:C1L-C

Session:

Number:590

Tug-of-war Model for Competitive Multi-armed Bandit Problem: Amoeba-inspired Algorithm for Cognitive Medium Access

Song-Ju Kim, Masashi Aono, Etsushi Nameda, Masahiko Hara,

pp.590-593

Publication Date:

Online ISSN:2188-5079

DOI:10.15248/proc.1.590

PDF download (762.4KB)

Summary:

The “tug-of-war (TOW) model” is a unique parallel search algorithm for solving the multi-armed bandit problem (BP), which was inspired by the photoavoidance behavior of a single-celled amoeboid organism, the true slime mold Physarum polycephalum [1, 2, 3, 4, 5, 6]. “The cognitive medium access”, which refers to multiuser channel allocations in cognitive radio, can be interpreted as “competitive multi-armed bandit problem (CBP) [14].” Unlike the normal BP, the reward (free channel) probability of a channel selected by more than one user is evenly split between selecting users. In this study, we propose the “solid TOW (STOW) model” for the CBP toward developing cognitive medium access protocols in uncertain environments. The aim of this study is to explore how can the users achieve the “social maximum”, which is the most desirable state to obtain the maximum total score, in a decentralized manner. We show that the performance of the STOW model is higher than that of the well-known UCB1-tuned algorithm in many cases.

References:

[1] S. -J. Kim, M. Aono, M. Hara, UC2009, LNCS 5715, Springer, p.289, 2009.

[2] S. -J. Kim, M. Aono, M. Hara, UC2010, LNCS 6079, Springer, pp.69-80, 2010.

[3] S. -J. Kim, M. Aono, M. Hara, BioSystems vol.101, pp.29-36, 2010.

[4] S. -J. Kim, M. Aono, M. Hara, Proc. of NOLTA2010, pp.520-523, 2010.

[5] S. -J. Kim, E. Nameda, M. Aono, M. Hara, Proc. of NOLTA2011, pp.176-179, 2011.

[6] S. -J. Kim, M. Aono, E. Nameda, M. Hara, Technical Report of IEICE (CCS-2011-025), pp.36-41, [in Japanese], 2011.

[7] M. Dorigo, L. M. Gambardella, Artificial Life vol.5 no. 2, pp.137-172, 1999.

[8] D. Karaboga, Technical Report-TR06, Erciyes University, 2005.

[9] M. Aono, Y. Hirata, M. Hara, K. Aihara, New Generation Computing vol.27, pp.129-157, 2009.

[10] M. Aono, Y. Hirata, M. Hara, K. Aihara, UC2009, LNCS 5715, Springer, pp.56-69, 2009.

[11] P. Auer, N. Cesa-Bianchi, P. Fischer, Machine Learning vol.47, pp.235-256, 2002.

[12] L. Kocsis, C. Szepesvári, ECML2006, LNAI 4212, Springer, pp.282-293, 2006.

[13] S. Gelly, Y. Wang, R. Munos, O. Teytaud, RR-6062-INRIA, pp.1-19, 2006.

[14] L. Lai, H. Jiang, H. V. Poor, Proc. of IEEE 42nd Asilomar Conference on Signals, System and Computers, pp.98-102, 2008.

[15] L. Lai, H. E. Gamal, H. Jiang, H. V. Poor, IEEE Trans. on Mobile Computing, vol.10 no.2, pp.239-253, 2011.

[16] T. Roughgarden, “Selfish routing and the price of anarchy”, The MIT Press, Cambridge, (2005).