Stick-breaking policy learning in Dec-POMDPs

Publication, Conference

Liu, M; Amato, C; Liao, X; Carin, L; How, JP

Published in: IJCAI International Joint Conference on Artificial Intelligence

January 1, 2015

Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from the optimal value. This paper represents the local policy of each agent using variable-sized FSCs that are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
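The abstract does not spell out how the stick-breaking prior is constructed, so the following is only a minimal sketch of a generic truncated stick-breaking draw, not the paper's Dec-SBPR algorithm. It illustrates why such a prior suits variable-sized controllers: most of the probability mass tends to land on the first few nodes, so the number of nodes that matter is not fixed in advance. The helper name `stick_breaking_weights`, the concentration parameter `alpha`, the truncation level, and the 0.01 threshold are all assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking weights (hypothetical helper).

    Each step breaks off a Beta(1, alpha) fraction of the stick that is
    still unassigned; the final entry receives whatever is left, so the
    weights always sum to 1.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off this step
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last node
    return weights


# Assumed settings: alpha=2.0 and at most 10 candidate controller nodes.
rng = random.Random(0)
weights = stick_breaking_weights(alpha=2.0, truncation=10, rng=rng)

# Count the nodes that carry non-negligible mass (threshold is arbitrary).
effective_nodes = sum(1 for w in weights if w > 0.01)
print(weights)
print(effective_nodes)
```

Smaller values of `alpha` concentrate mass on fewer nodes, while larger values spread it across more of them. In a learning setting, weights like these would act as a prior over which controller nodes are used, with the data determining how many end up mattering.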

Duke Scholars

Author: Lawrence Carin, Electrical and Computer Engineering

Published In

IJCAI International Joint Conference on Artificial Intelligence

ISSN

1045-0823

Publication Date

January 1, 2015

Volume

2015-January

Start / End Page

2011 / 2018

Citation

APA

Liu, M., Amato, C., Liao, X., Carin, L., & How, J. P. (2015). Stick-breaking policy learning in Dec-POMDPs. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2015-January, pp. 2011–2018).

Chicago

Liu, M., C. Amato, X. Liao, L. Carin, and J. P. How. “Stick-breaking policy learning in Dec-POMDPs.” In IJCAI International Joint Conference on Artificial Intelligence, 2015-January:2011–18, 2015.

ICMJE

Liu M, Amato C, Liao X, Carin L, How JP. Stick-breaking policy learning in Dec-POMDPs. In: IJCAI International Joint Conference on Artificial Intelligence. 2015. p. 2011–8.

MLA

Liu, M., et al. “Stick-breaking policy learning in Dec-POMDPs.” IJCAI International Joint Conference on Artificial Intelligence, vol. 2015-January, 2015, pp. 2011–18.

NLM

Liu M, Amato C, Liao X, Carin L, How JP. Stick-breaking policy learning in Dec-POMDPs. IJCAI International Joint Conference on Artificial Intelligence. 2015. p. 2011–2018.
