Off-policy reinforcement learning with Gaussian processes
Publication, Journal Article
Chowdhary, G; Liu, M; Grande, R; Walsh, T; How, J; Carin, L
Published in: IEEE/CAA Journal of Automatica Sinica
July 1, 2014
An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed, in addition to its convergence guarantees and its ability to automatically choose its own basis locations.
Published In
IEEE/CAA Journal of Automatica Sinica
DOI
10.1109/JAS.2014.7004680
EISSN
2329-9274
ISSN
2329-9266
Publication Date
July 1, 2014
Volume
1
Issue
3
Start / End Page
227 / 238
Related Subject Headings
- 4007 Control engineering, mechatronics and robotics
Citation
APA
Chowdhary, G., Liu, M., Grande, R., Walsh, T., How, J., & Carin, L. (2014). Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica, 1(3), 227–238. https://doi.org/10.1109/JAS.2014.7004680
Chicago
Chowdhary, G., M. Liu, R. Grande, T. Walsh, J. How, and L. Carin. “Off-policy reinforcement learning with Gaussian processes.” IEEE/CAA Journal of Automatica Sinica 1, no. 3 (July 1, 2014): 227–38. https://doi.org/10.1109/JAS.2014.7004680.
ICMJE
Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014 Jul 1;1(3):227–38.
MLA
Chowdhary, G., et al. “Off-policy reinforcement learning with Gaussian processes.” IEEE/CAA Journal of Automatica Sinica, vol. 1, no. 3, July 2014, pp. 227–38. Scopus, doi:10.1109/JAS.2014.7004680.
NLM
Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014 Jul 1;1(3):227–238.