Scholars@Duke publication: A variance analysis for POMDP policy evaluation

A variance analysis for POMDP policy evaluation

Publication, Journal Article

Fard, MM; Pineau, J; Sun, P

Published in: Proceedings of the National Conference on Artificial Intelligence

December 24, 2008

Partially Observable Markov Decision Processes (POMDPs) have been studied widely as a model for decision making under uncertainty, and a number of methods have been developed to solve such processes. Such studies often involve calculating the value function of a specific policy, given a model of the transition and observation probabilities and of the reward. These models can be learned using labeled samples of on-policy trajectories. However, when empirical models are used, their imperfections introduce bias and variance terms into the value function. In this paper, we propose a method for estimating the bias and variance of the value function in terms of the statistics of the empirical transition and observation model. Such error terms can be used to meaningfully compare the value of different policies. This is an important result for sequential decision making, since it will allow us to provide more formal guarantees about the quality of the policies we implement. To evaluate the precision of the proposed method, we provide supporting experiments on problems from the fields of robotics and medical decision making. Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Duke Scholars

Author: Peng Sun, Fuqua School of Business

Published In

Proceedings of the National Conference on Artificial Intelligence

Publication Date

December 24, 2008

Volume

2

Start / End Page

1056 / 1061

Citation

APA

Fard, M. M., Pineau, J., & Sun, P. (2008). A variance analysis for POMDP policy evaluation. Proceedings of the National Conference on Artificial Intelligence, 2, 1056–1061.

Chicago

Fard, M. M., J. Pineau, and P. Sun. “A variance analysis for POMDP policy evaluation.” Proceedings of the National Conference on Artificial Intelligence 2 (December 24, 2008): 1056–61.

ICMJE

Fard MM, Pineau J, Sun P. A variance analysis for POMDP policy evaluation. Proceedings of the National Conference on Artificial Intelligence. 2008 Dec 24;2:1056–61.

MLA

Fard, M. M., et al. “A variance analysis for POMDP policy evaluation.” Proceedings of the National Conference on Artificial Intelligence, vol. 2, Dec. 2008, pp. 1056–61.

NLM

Fard MM, Pineau J, Sun P. A variance analysis for POMDP policy evaluation. Proceedings of the National Conference on Artificial Intelligence. 2008 Dec 24;2:1056–1061.
