Robust and efficient transfer learning with hidden parameter Markov decision processes
Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
An intriguing application of transfer learning emerges when tasks arise with similar, but not identical, dynamics. Hidden Parameter Markov Decision Processes (HiP-MDPs) embed these tasks into a low-dimensional space; given the embedding parameters, one can identify the MDP for a particular task. However, the original formulation of the HiP-MDP had a critical flaw: the embedding uncertainty was modeled independently of the agent's state uncertainty, requiring an arduous training procedure. In this work, we apply a Gaussian Process latent variable model to jointly model the dynamics and the embedding, leading to a more elegant formulation that allows for better uncertainty quantification and thus more robust transfer.
Killian, TW; Konidaris, G; Doshi-Velez, F
31st AAAI Conference on Artificial Intelligence, AAAI 2017