Scholars@Duke publication: The infinite regionalized policy representation

The infinite regionalized policy representation

Publication, Journal Article

Liu, M; Liao, X; Carin, L

Published in: Proceedings of the 28th International Conference on Machine Learning, ICML 2011

October 7, 2011

We introduce the infinite regionalized policy representation (iRPR) as a nonparametric policy for reinforcement learning in partially observable Markov decision processes (POMDPs). The iRPR assumes an unbounded set of decision states a priori and infers the number of states needed to represent the policy from the experiences. We propose algorithms for learning the number of decision states while maintaining a proper balance between exploration and exploitation. Convergence analysis is provided, along with performance evaluations on benchmark problems. Copyright 2011 by the author(s)/owner(s).

Duke Scholars

Lawrence Carin (Author), Electrical and Computer Engineering

Published In

Proceedings of the 28th International Conference on Machine Learning, ICML 2011

Publication Date

October 7, 2011

Start / End Page

769 / 776

Citation

APA: Liu, M., Liao, X., & Carin, L. (2011). The infinite regionalized policy representation. Proceedings of the 28th International Conference on Machine Learning, ICML 2011, 769–776.

Chicago: Liu, M., X. Liao, and L. Carin. “The infinite regionalized policy representation.” Proceedings of the 28th International Conference on Machine Learning, ICML 2011, October 7, 2011, 769–76.

ICMJE: Liu M, Liao X, Carin L. The infinite regionalized policy representation. Proceedings of the 28th International Conference on Machine Learning, ICML 2011. 2011 Oct 7;769–76.

MLA: Liu, M., et al. “The infinite regionalized policy representation.” Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Oct. 2011, pp. 769–76.

NLM: Liu M, Liao X, Carin L. The infinite regionalized policy representation. Proceedings of the 28th International Conference on Machine Learning, ICML 2011. 2011 Oct 7;769–776.
