
Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning

Publication, Conference
Hsu, HL; Wang, W; Pajic, M; Xu, P
Published in: Advances in Neural Information Processing Systems
January 1, 2024

We present the first study on provably efficient randomized exploration in cooperative multi-agent reinforcement learning (MARL). We propose a unified algorithmic framework for randomized exploration in parallel Markov Decision Processes (MDPs), together with two Thompson Sampling (TS)-type algorithms, CoopTS-PHE and CoopTS-LMC, which incorporate the perturbed-history exploration (PHE) strategy and the Langevin Monte Carlo exploration (LMC) strategy, respectively; both are flexible in design and easy to implement in practice. For a special class of parallel MDPs where the transition is (approximately) linear, we theoretically prove that both CoopTS-PHE and CoopTS-LMC achieve a Õ(d^{3/2} H^2 √(MK)) regret bound with communication complexity Õ(dHM^2), where d is the feature dimension, H is the horizon length, M is the number of agents, and K is the number of episodes. This is the first theoretical result for randomized exploration in cooperative MARL. We evaluate our proposed method on multiple parallel RL environments, including a deep exploration problem (i.e., N-chain), a video game, and a real-world problem in energy systems. Our experimental results demonstrate that our framework achieves better performance, even under misspecified transition models. Additionally, we establish a connection between our unified framework and the practical application of federated learning.
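The two exploration strategies named in the abstract can be illustrated in a single-agent, linear-model setting. The sketch below is an assumption-laden simplification, not the paper's CoopTS-PHE/CoopTS-LMC algorithms: PHE perturbs the observed rewards with i.i.d. Gaussian noise and acts greedily on the resulting least-squares estimate, while LMC draws an approximate posterior sample of the parameter by running noisy gradient descent (Langevin dynamics) on the regularized least-squares loss. All function names, hyperparameters, and the bandit-style setup are illustrative choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def phe_action(features, rewards, candidate_feats, noise_std=1.0, reg=1.0):
    """Perturbed-history exploration (PHE), simplified to a linear model.

    Rather than adding an explicit optimism bonus, PHE injects i.i.d.
    Gaussian noise into the observed rewards, refits a ridge-regression
    estimate, and acts greedily with respect to that randomized estimate.
    """
    d = features.shape[1]
    perturbed = rewards + noise_std * rng.normal(size=rewards.shape)
    gram = features.T @ features + reg * np.eye(d)
    theta = np.linalg.solve(gram, features.T @ perturbed)
    return int(np.argmax(candidate_feats @ theta))

def lmc_sample(features, rewards, n_steps=200, step=0.01, reg=1.0, temp=1.0):
    """Langevin Monte Carlo (LMC) exploration, simplified to a linear model.

    Runs noisy gradient descent on the regularized least-squares loss;
    the injected Gaussian noise makes the iterate an approximate sample
    from the posterior over the parameter, which can then be used greedily.
    """
    d = features.shape[1]
    theta = np.zeros(d)
    for _ in range(n_steps):
        grad = features.T @ (features @ theta - rewards) + reg * theta
        theta = theta - step * grad + np.sqrt(2.0 * step * temp) * rng.normal(size=d)
    return theta
```

With `noise_std=0`, `phe_action` reduces to the plain greedy choice; increasing the noise scale (or the LMC temperature) trades exploitation for exploration, which is the knob both randomized strategies turn instead of maintaining explicit confidence bonuses.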


Published In

Advances in Neural Information Processing Systems

ISSN

1049-5258

Publication Date

January 1, 2024

Volume

37

Related Subject Headings

  • 4611 Machine learning
  • 1702 Cognitive Sciences
  • 1701 Psychology
 

Citation

APA: Hsu, H. L., Wang, W., Pajic, M., & Xu, P. (2024). Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning. In Advances in Neural Information Processing Systems (Vol. 37).
Chicago: Hsu, H. L., W. Wang, M. Pajic, and P. Xu. “Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning.” In Advances in Neural Information Processing Systems, Vol. 37, 2024.
ICMJE: Hsu HL, Wang W, Pajic M, Xu P. Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning. In: Advances in Neural Information Processing Systems. 2024.
MLA: Hsu, H. L., et al. “Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning.” Advances in Neural Information Processing Systems, vol. 37, 2024.
NLM: Hsu HL, Wang W, Pajic M, Xu P. Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning. Advances in Neural Information Processing Systems. 2024.