Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning

Publication, Conference
Zhang, Y; Qu, G; Xu, P; Lin, Y; Chen, Z; Wierman, A
Published in: SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
June 19, 2023

We study a multi-agent reinforcement learning (MARL) problem where the agents interact over a given network. The goal of the agents is to cooperatively maximize the average of their entropy-regularized long-term rewards. To overcome the curse of dimensionality and to reduce communication, we propose a Localized Policy Iteration (LPI) algorithm that provably learns a near-globally-optimal policy using only local information. In particular, we show that, despite restricting each agent's attention to only its κ-hop neighborhood, the agents are able to learn a policy with an optimality gap that decays polynomially in κ. In addition, we show the finite-sample convergence of LPI to the globally optimal policy, which explicitly captures the trade-off between optimality and computational complexity in choosing κ. Numerical simulations demonstrate the effectiveness of LPI. This extended abstract is an abridged version of [12].
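
As an illustration of the κ-hop neighborhood mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the agent network is an undirected graph stored as an adjacency list, and the function name `k_hop_neighborhood` and the example line graph are hypothetical. It collects every agent within κ hops of a given agent using breadth-first search, which is the set of agents whose information a localized method of this kind would be allowed to use.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, kappa):
    """Return the set of agents within `kappa` hops of `agent`, including itself.

    `adjacency` maps each agent to an iterable of its direct neighbors
    (an assumed representation; the abstract does not specify one).
    """
    distance = {agent: 0}          # hop distance from `agent` to each visited node
    queue = deque([agent])
    while queue:
        node = queue.popleft()
        if distance[node] == kappa:
            continue               # do not expand beyond kappa hops
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


# Hypothetical example: six agents on a line graph 0-1-2-3-4-5.
line_graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

if __name__ == "__main__":
    for kappa in range(4):
        print(kappa, sorted(k_hop_neighborhood(line_graph, 2, kappa)))
```

In this sketch a larger κ gives each agent a larger neighborhood, which mirrors the trade-off stated in the abstract: more local information improves the optimality gap but raises the computational cost.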

Published In

SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

DOI

10.1145/3578338.3593545

Publication Date

June 19, 2023

Start / End Page

83 / 84
 

Citation

APA: Zhang, Y., Qu, G., Xu, P., Lin, Y., Chen, Z., & Wierman, A. (2023). Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning. In SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (pp. 83–84). https://doi.org/10.1145/3578338.3593545
Chicago: Zhang, Y., G. Qu, P. Xu, Y. Lin, Z. Chen, and A. Wierman. “Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning.” In SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 83–84, 2023. https://doi.org/10.1145/3578338.3593545.
ICMJE: Zhang Y, Qu G, Xu P, Lin Y, Chen Z, Wierman A. Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning. In: SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems. 2023. p. 83–4.
MLA: Zhang, Y., et al. “Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning.” SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2023, pp. 83–84. Scopus, doi:10.1145/3578338.3593545.
NLM: Zhang Y, Qu G, Xu P, Lin Y, Chen Z, Wierman A. Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning. SIGMETRICS 2023 - Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems. 2023. p. 83–84.
