Multi-task reinforcement learning in partially observable stochastic environments

Publication, Journal Article

Li, H; Liao, X; Carin, L

Published in: Journal of Machine Learning Research

January 1, 2009

We consider the problem of multi-task reinforcement learning (MTRL) in multiple partially observable stochastic environments. We introduce the regionalized policy representation (RPR) to characterize the agent's behavior in each environment. The RPR is a parametric model of the conditional distribution over current actions given the history of past actions and observations; the agent's choice of actions is directly based on this conditional distribution, without an intervening model to characterize the environment itself. We propose off-policy batch algorithms to learn the parameters of the RPRs, using episodic data collected when following a behavior policy, and show their linkage to policy iteration. We employ the Dirichlet process as a nonparametric prior over the RPRs across multiple environments. The intrinsic clustering property of the Dirichlet process imposes sharing of episodes among similar environments, which effectively reduces the number of episodes required for learning a good policy in each environment, when data sharing is appropriate. The number of distinct RPRs and the associated clusters (the sharing patterns) are automatically discovered by exploiting the episodic data as well as the nonparametric nature of the Dirichlet process. We demonstrate the effectiveness of the proposed RPR as well as the RPR-based MTRL framework on various problems, including grid-world navigation and multi-aspect target classification. The experimental results show that the RPR is a competitive reinforcement learning algorithm in partially observable domains, and the MTRL consistently achieves better performance than single task reinforcement learning. © 2009 Hui Li, Xuejun Liao and Lawrence Carin.
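The clustering property that the abstract attributes to the Dirichlet process can be illustrated with the Chinese restaurant process, a standard constructive view of the same prior. The sketch below is only an illustration of that clustering behavior, not the paper's learning algorithm: it samples a partition of tasks (environments) and does not model RPRs or episodes. The function name `crp_partition` and the parameters `num_tasks`, `alpha`, and `seed` are hypothetical names chosen for this example.

```python
import random
from collections import Counter


def crp_partition(num_tasks, alpha, seed=0):
    """Sample a partition of tasks from a Chinese restaurant process.

    Task i (0-based) joins existing cluster k with probability
    n_k / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha), where n_k counts earlier tasks in cluster k.
    All names here are illustrative assumptions, not from the paper.
    """
    if num_tasks < 0 or alpha <= 0:
        raise ValueError("num_tasks must be >= 0 and alpha must be > 0")
    rng = random.Random(seed)
    assignments = []  # assignments[i] = cluster index of task i
    sizes = []        # sizes[k] = number of tasks currently in cluster k
    for _ in range(num_tasks):
        # Unnormalized weights: one per existing cluster, plus alpha
        # for the option of opening a new cluster.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    labels = crp_partition(num_tasks=10, alpha=1.0, seed=0)
    print(labels)           # cluster index per task
    print(Counter(labels))  # cluster sizes
```

Tasks that land in the same cluster would share data under such a prior. A smaller `alpha` tends to produce fewer, larger clusters, and a larger `alpha` tends to produce more clusters.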

Duke Scholars

Lawrence Carin, Author, Electrical and Computer Engineering

Published In

Journal of Machine Learning Research

EISSN

1533-7928

ISSN

1532-4435

Publication Date

January 1, 2009

Volume

10

Start / End Page

1131 / 1186

Related Subject Headings

Artificial Intelligence & Image Processing
4905 Statistics
4611 Machine learning
17 Psychology and Cognitive Sciences
08 Information and Computing Sciences

Citation

APA

Li, H., Liao, X., & Carin, L. (2009). Multi-task reinforcement learning in partially observable stochastic environments. Journal of Machine Learning Research, 10, 1131–1186.

Chicago

Li, H., X. Liao, and L. Carin. “Multi-task reinforcement learning in partially observable stochastic environments.” Journal of Machine Learning Research 10 (January 1, 2009): 1131–86.

ICMJE

Li H, Liao X, Carin L. Multi-task reinforcement learning in partially observable stochastic environments. Journal of Machine Learning Research. 2009 Jan 1;10:1131–86.

MLA

Li, H., et al. “Multi-task reinforcement learning in partially observable stochastic environments.” Journal of Machine Learning Research, vol. 10, Jan. 2009, pp. 1131–86.

NLM

Li H, Liao X, Carin L. Multi-task reinforcement learning in partially observable stochastic environments. Journal of Machine Learning Research. 2009 Jan 1;10:1131–1186.
