Off-policy reinforcement learning with Gaussian processes

Publication, Journal Article
Chowdhary, G; Liu, M; Grande, R; Walsh, T; How, J; Carin, L
Published in: IEEE/CAA Journal of Automatica Sinica
July 1, 2014

An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations.
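
To make the idea of a GP model of the Q function concrete, the following is a minimal sketch of batch fitted Q iteration in which the regression step is a GP posterior mean. It is not the GPQ algorithm from the paper: the function names, the one-hot action encoding, the squared-exponential kernel, and the fixed hyperparameters (length_scale, noise_var) are all assumptions made for illustration, and the sketch implements neither the paper's hyperparameter conditions nor its automatic basis selection.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between row sets A (n, d) and B (m, d)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior_mean(X_train, y_train, X_query, length_scale=1.0, noise_var=0.1):
    """Posterior mean of a zero-mean GP regression at the query inputs."""
    K = rbf_kernel(X_train, X_train, length_scale)
    alpha = np.linalg.solve(K + noise_var * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_query, X_train, length_scale) @ alpha


def batch_gp_q_iteration(transitions, n_actions, gamma=0.9, n_sweeps=30,
                         length_scale=1.0, noise_var=0.1):
    """Hypothetical helper: fitted Q iteration with a GP as the regressor.

    transitions is a list of (state, action, reward, next_state, done) tuples
    gathered by any behavior policy; states are 1-D float arrays and actions
    are integers in [0, n_actions).
    """
    states = np.array([t[0] for t in transitions], dtype=float)
    actions = np.array([t[1] for t in transitions], dtype=int)
    rewards = np.array([t[2] for t in transitions], dtype=float)
    next_states = np.array([t[3] for t in transitions], dtype=float)
    done = np.array([t[4] for t in transitions], dtype=bool)

    def features(S, a):
        # GP input: the state concatenated with a one-hot action (an assumption).
        onehot = np.zeros((len(S), n_actions))
        onehot[np.arange(len(S)), a] = 1.0
        return np.hstack([S, onehot])

    X = features(states, actions)
    targets = rewards.copy()
    for _ in range(n_sweeps):
        # Evaluate the current GP model of Q at every next state and action.
        next_q = np.column_stack([
            gp_posterior_mean(
                X, targets,
                features(next_states, np.full(len(next_states), a)),
                length_scale, noise_var,
            )
            for a in range(n_actions)
        ])
        # Off-policy (Q-learning style) target: bootstrap from the greedy action.
        targets = rewards + gamma * np.where(done, 0.0, next_q.max(axis=1))

    def q_function(state, action):
        query = features(np.atleast_2d(np.asarray(state, dtype=float)),
                         np.array([action]))
        return float(gp_posterior_mean(X, targets, query, length_scale, noise_var)[0])

    return q_function


def chain_transitions(n_states=5):
    """Hypothetical toy data: a deterministic chain with a rewarding right end."""
    goal = n_states - 1
    transitions = []
    for s in range(goal):
        for a in (0, 1):  # 0 moves left, 1 moves right
            s_next = max(s - 1, 0) if a == 0 else s + 1
            reached_goal = s_next == goal
            transitions.append((np.array([float(s)]), a,
                                1.0 if reached_goal else 0.0,
                                np.array([float(s_next)]), reached_goal))
    return transitions
```

Calling batch_gp_q_iteration(chain_transitions(), n_actions=2, length_scale=0.3, noise_var=1e-3) returns a function q(state, action). With such a short length scale the GP behaves almost like a lookup table on this toy chain, so the learned values should favor moving right in every state; longer length scales trade that exactness for generalization between nearby states.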

Published In

IEEE/CAA Journal of Automatica Sinica

DOI

10.1109/JAS.2014.7004680

EISSN

2329-9274

ISSN

2329-9266

Publication Date

July 1, 2014

Volume

1

Issue

3

Start / End Page

227 / 238

Related Subject Headings

  • 4007 Control engineering, mechatronics and robotics
 

Citation

APA: Chowdhary, G., Liu, M., Grande, R., Walsh, T., How, J., & Carin, L. (2014). Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica, 1(3), 227–238. https://doi.org/10.1109/JAS.2014.7004680

Chicago: Chowdhary, G., M. Liu, R. Grande, T. Walsh, J. How, and L. Carin. “Off-policy reinforcement learning with Gaussian processes.” IEEE/CAA Journal of Automatica Sinica 1, no. 3 (July 1, 2014): 227–38. https://doi.org/10.1109/JAS.2014.7004680.

ICMJE: Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014 Jul 1;1(3):227–38.

MLA: Chowdhary, G., et al. “Off-policy reinforcement learning with Gaussian processes.” IEEE/CAA Journal of Automatica Sinica, vol. 1, no. 3, July 2014, pp. 227–38. Scopus, doi:10.1109/JAS.2014.7004680.

NLM: Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014 Jul 1;1(3):227–238.
