AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents

Publication, Journal Article

Conitzer, V; Sandholm, T

Published in: Machine Learning

May 1, 2007

Two minimal requirements for a satisfactory multiagent learning algorithm are that it (1) learns to play optimally against stationary opponents and (2) converges to a Nash equilibrium in self-play. The previous algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action (repeated) games, assuming that the opponent's mixed strategy is observable. Another algorithm, ReDVaLeR (which was introduced after the algorithm described in this paper), achieves the two properties in games with arbitrary numbers of actions and players, but still requires that the opponents' mixed strategies are observable. In this paper we present AWESOME, the first algorithm that is guaranteed to have the two properties in games with arbitrary numbers of actions and players. It is still the only algorithm that does so while relying only on observing the other players' actual actions (not their mixed strategies). It also learns to play optimally against opponents that eventually become stationary. The basic idea behind AWESOME (Adapt When Everybody is Stationary, Otherwise Move to Equilibrium) is to try to adapt to the others' strategies when they appear stationary, but otherwise to retreat to a precomputed equilibrium strategy. We provide experimental results that suggest that AWESOME converges fast in practice. The techniques used to prove the properties of AWESOME are fundamentally different from those used for previous algorithms, and may help in analyzing future multiagent learning algorithms as well. © Springer Science + Business Media, LLC 2007.
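The "adapt when stationary, otherwise move to equilibrium" idea from the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the function names (`empirical`, `max_dist`, `best_response`, `next_mode`), the single fixed threshold `eps`, and the two-player setting are all assumptions made for illustration. The published algorithm uses its own schedule of epoch lengths and thresholds, which the abstract does not give.

```python
from typing import List, Optional, Sequence


def empirical(actions: Sequence[int], n_actions: int) -> List[float]:
    """Empirical frequency of each observed action over one epoch."""
    counts = [0] * n_actions
    for a in actions:
        counts[a] += 1
    return [c / len(actions) for c in counts]


def max_dist(p: Sequence[float], q: Sequence[float]) -> float:
    """Largest per-action difference between two distributions."""
    return max(abs(x - y) for x, y in zip(p, q))


def best_response(payoff: Sequence[Sequence[float]], opp: Sequence[float]) -> int:
    """Row index with the highest expected payoff against distribution `opp`."""
    values = [sum(u * p for u, p in zip(row, opp)) for row in payoff]
    return max(range(len(values)), key=values.__getitem__)


def next_mode(
    prev_epoch: Optional[Sequence[int]],
    this_epoch: Sequence[int],
    opp_equilibrium: Sequence[float],
    n_actions: int,
    eps: float,
) -> str:
    """Hypothetical end-of-epoch decision using only observed actions.

    "equilibrium": the opponent looks like it plays the precomputed equilibrium.
    "adapt":       the opponent deviates but looks stationary, so best-respond.
    "restart":     the opponent does not look stationary, so retreat to equilibrium.
    """
    cur = empirical(this_epoch, n_actions)
    if max_dist(cur, opp_equilibrium) <= eps:
        return "equilibrium"
    if prev_epoch is not None:
        prev = empirical(prev_epoch, n_actions)
        if max_dist(cur, prev) > eps:
            return "restart"
    return "adapt"


# Matching pennies from the row player's view; the equilibrium is (0.5, 0.5).
payoff = [[1.0, -1.0], [-1.0, 1.0]]
equilibrium = [0.5, 0.5]

# An opponent that always plays action 0 looks stationary, so the sketch adapts.
mode = next_mode([0] * 20, [0] * 20, equilibrium, 2, eps=0.1)
reply = best_response(payoff, empirical([0] * 20, 2))
```

In this toy run the opponent's observed play is far from the equilibrium but identical across the two epochs, so the sketch chooses to adapt and best-responds with action 0. Note that only observed actions are used, never the opponent's mixed strategy, which is the distinction the abstract draws against WoLF-IGA and ReDVaLeR.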

Duke Scholars

Author: Vincent Conitzer, Computer Science

Published In

Machine Learning

DOI

10.1007/s10994-006-0143-1

EISSN

1573-0565

ISSN

0885-6125

Publication Date

May 1, 2007

Volume

67

Issue

1-2

Start / End Page

23 / 43

Related Subject Headings

Artificial Intelligence & Image Processing
4611 Machine learning
1702 Cognitive Sciences
0806 Information Systems
0801 Artificial Intelligence and Image Processing

Citation

Conitzer, V., & Sandholm, T. (2007). AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. Machine Learning, 67(1–2), 23–43. https://doi.org/10.1007/s10994-006-0143-1

Conitzer, V., and T. Sandholm. “AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents.” Machine Learning 67, no. 1–2 (May 1, 2007): 23–43. https://doi.org/10.1007/s10994-006-0143-1.

Conitzer V, Sandholm T. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. Machine Learning. 2007 May 1;67(1–2):23–43.

Conitzer, V., and T. Sandholm. “AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents.” Machine Learning, vol. 67, no. 1–2, May 2007, pp. 23–43. Scopus, doi:10.1007/s10994-006-0143-1.
