
Learning in Zero-Sum Team Markov Games Using Factored Value Functions

Publication, Conference
Lagoudakis, MG; Parr, R
Published in: NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems
January 1, 2002

We present a new method for learning good strategies in zero-sum Markov games in which each side is composed of multiple agents collaborating against an opposing team of agents. Our method requires full observability and communication during learning, but the learned policies can be executed in a distributed manner. The value function is represented as a factored linear architecture and its structure determines the necessary computational resources and communication bandwidth. This approach permits a tradeoff between simple representations with little or no communication between agents and complex, computationally intensive representations with extensive coordination between agents. Thus, we provide a principled means of using approximation to combat the exponential blowup in the joint action space of the participants. The approach is demonstrated with an example that shows the efficiency gains over naive enumeration.
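The efficiency claim in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it ignores the opposing team and the learning procedure, and it assumes the simplest possible factorization, in which the team's value is a sum of one local term per agent with no coordination between agents. The names `local_q`, `naive_argmax`, and `factored_argmax` are hypothetical and chosen only for this illustration. Under that assumption, the best joint action can be found by letting each agent maximize its own term, instead of enumerating every joint action.

```python
import itertools
import random


def naive_argmax(local_q, n_actions):
    """Enumerate every joint action; work grows as n_actions ** n_agents."""
    n_agents = len(local_q)
    best, best_val = None, float("-inf")
    for joint in itertools.product(range(n_actions), repeat=n_agents):
        # Joint value is assumed to be the sum of per-agent local terms.
        val = sum(local_q[i][a] for i, a in enumerate(joint))
        if val > best_val:
            best, best_val = joint, val
    return best, best_val


def factored_argmax(local_q):
    """Each agent maximizes its own term; work grows as n_agents * n_actions."""
    joint = tuple(max(range(len(q)), key=q.__getitem__) for q in local_q)
    val = sum(q[a] for q, a in zip(local_q, joint))
    return joint, val


# Hypothetical local value tables: one row per agent, one entry per action.
random.seed(0)
n_agents, n_actions = 4, 5
local_q = [[random.random() for _ in range(n_actions)] for _ in range(n_agents)]

naive = naive_argmax(local_q, n_actions)    # visits 5 ** 4 = 625 joint actions
fact = factored_argmax(local_q)             # inspects 4 * 5 = 20 local entries
```

Both functions return the same joint action here because the assumed value function has no terms shared between agents. Richer factorizations, where terms depend on several agents' actions, would require coordination between those agents, which is the tradeoff the abstract describes.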


Published In

NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems

Publication Date

January 1, 2002

Start / End Page

1627 / 1634
 

Citation

APA: Lagoudakis, M. G., & Parr, R. (2002). Learning in Zero-Sum Team Markov Games Using Factored Value Functions. In NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems (pp. 1627–1634).
Chicago: Lagoudakis, M. G., and R. Parr. “Learning in Zero-Sum Team Markov Games Using Factored Value Functions.” In NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems, 1627–34, 2002.
ICMJE: Lagoudakis MG, Parr R. Learning in Zero-Sum Team Markov Games Using Factored Value Functions. In: NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems. 2002. p. 1627–34.
MLA: Lagoudakis, M. G., and R. Parr. “Learning in Zero-Sum Team Markov Games Using Factored Value Functions.” NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems, 2002, pp. 1627–34.
NLM: Lagoudakis MG, Parr R. Learning in Zero-Sum Team Markov Games Using Factored Value Functions. NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems. 2002. p. 1627–1634.
