Scholars@Duke publication: Off-policy reinforcement learning with Gaussian processes

Off-policy reinforcement learning with Gaussian processes

Publication, Journal Article

Chowdhary, G; Liu, M; Grande, R; Walsh, T; How, J; Carin, L

Published in: IEEE/CAA Journal of Automatica Sinica

July 1, 2014

An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations.
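The idea of modeling the Q function with a GP and learning it off-policy from a batch of transitions can be illustrated with a small sketch. This is not the GPQ algorithm from the paper: it is a generic fitted Q-iteration loop with a GP posterior mean as the regressor. The squared-exponential kernel, the `length_scale` and `noise_var` values, the five-state chain, and all function names are assumptions made for illustration, and the hyperparameters are picked by hand rather than derived from the paper's convergence conditions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.5):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_mean(X_train, y_train, X_query, noise_var=0.1):
    """GP posterior mean at X_query given noisy targets y_train at X_train."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(X_query, X_train) @ alpha

def batch_gp_q_iteration(transitions, actions, gamma=0.9, sweeps=50):
    """Fitted Q-iteration with a GP regressor over (state, action) inputs."""
    s, a, r, s_next = (np.asarray(col, dtype=float) for col in zip(*transitions))
    X = np.column_stack([s, a])
    # Every (next_state, action) pair, so the greedy max is taken per transition.
    X_next = np.array([[sn, b] for sn in s_next for b in actions], dtype=float)
    targets = r.copy()
    for _ in range(sweeps):
        q_next = gp_mean(X, targets, X_next).reshape(len(s_next), len(actions))
        # Off-policy target: bootstrap from the greedy action, regardless of
        # which action the data-collecting policy actually took next.
        targets = r + gamma * q_next.max(axis=1)
    return X, targets

def greedy_action(X, targets, state, actions):
    """Action with the highest GP-predicted Q value in the given state."""
    queries = np.array([[state, b] for b in actions], dtype=float)
    return actions[int(np.argmax(gp_mean(X, targets, queries)))]

# Hypothetical toy problem: a chain of states 0..4, actions step left or right,
# reward 1 whenever the next state is the right end of the chain.
ACTIONS = (-1.0, 1.0)
transitions = []
for state in range(5):
    for action in ACTIONS:
        next_state = min(max(state + action, 0), 4)
        reward = 1.0 if next_state == 4 else 0.0
        transitions.append((state, action, reward, next_state))

X, targets = batch_gp_q_iteration(transitions, ACTIONS)
policy = [greedy_action(X, targets, state, ACTIONS) for state in range(5)]
```

The batch covers every state-action pair once, which stands in for data gathered by an arbitrary behavior policy. The noise variance acts as a regularizer on the GP fit, so the learned values are shrunk relative to the true optimal values, and only the ordering of actions should be read from this sketch.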

Duke Scholars

Author: Lawrence Carin, Electrical and Computer Engineering

Published In

IEEE/CAA Journal of Automatica Sinica

DOI

10.1109/JAS.2014.7004680

EISSN

2329-9274

ISSN

2329-9266

Publication Date

July 1, 2014

Volume

1

Issue

3

Start / End Page

227 / 238

Related Subject Headings

4007 Control engineering, mechatronics and robotics

Citation

APA: Chowdhary, G., Liu, M., Grande, R., Walsh, T., How, J., & Carin, L. (2014). Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica, 1(3), 227–238. https://doi.org/10.1109/JAS.2014.7004680

Chicago: Chowdhary, G., M. Liu, R. Grande, T. Walsh, J. How, and L. Carin. “Off-policy reinforcement learning with Gaussian processes.” IEEE/CAA Journal of Automatica Sinica 1, no. 3 (July 1, 2014): 227–38. https://doi.org/10.1109/JAS.2014.7004680.

ICMJE: Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014 Jul 1;1(3):227–38.

MLA: Chowdhary, G., et al. “Off-policy reinforcement learning with Gaussian processes.” IEEE/CAA Journal of Automatica Sinica, vol. 1, no. 3, July 2014, pp. 227–38. Scopus, doi:10.1109/JAS.2014.7004680.

NLM: Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014 Jul 1;1(3):227–238.
