Stochastic kernel temporal difference for reinforcement learning

Conference Publication
Bae, J; Giraldo, LS; Chhatbar, P; Francis, J; Sanchez, J; Principe, J
Published in: IEEE International Workshop on Machine Learning for Signal Processing
December 5, 2011

This paper introduces a kernel adaptive filter that applies stochastic gradient updates to temporal differences, kernel TD(λ), to estimate the state-action value function Q in reinforcement learning. Kernel methods are powerful for solving nonlinear problems, but their growing computational complexity and memory requirements limit their applicability in practical scenarios. To overcome this, the quantization approach introduced in [1] is applied. To help understand the algorithm's behavior and illustrate the role of its parameters, we apply it to a two-dimensional spatial navigation task. Because eligibility traces are commonly used in TD learning to improve data efficiency, we examine how the eligibility trace parameter λ interacts with the step size and the filter size. Moreover, kernel TD(0) is applied to neural decoding of an 8-target center-out reaching task performed by a monkey. Results show the method can effectively learn the brain-state-to-action mapping for this task. © 2011 IEEE.
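To make the abstract's description concrete, the following is a minimal sketch of a kernel TD(λ) value estimator with a quantized dictionary, assuming a Gaussian kernel, a Euclidean quantization threshold, and γλ-decaying eligibility traces. All class, method, and parameter names here are illustrative; this is not the paper's implementation, only one plausible reading of the update it describes.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two state vectors."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))


class KernelTDLambda:
    """Hedged sketch of kernel TD(lambda) with quantization.

    The value estimate is V(x) = sum_i alpha_i * k(c_i, x) over a
    dictionary of centers c_i.  Each TD step either adds the current
    state as a new center or, when the state lies within `quant_eps`
    of an existing center (the quantization step from [1]), merges
    the update into that center instead of growing the dictionary.
    """

    def __init__(self, eta=0.1, gamma=0.9, lam=0.5, sigma=1.0, quant_eps=0.1):
        self.eta, self.gamma, self.lam = eta, gamma, lam
        self.sigma, self.quant_eps = sigma, quant_eps
        self.centers = []  # stored states (the kernel dictionary)
        self.alphas = []   # kernel expansion coefficients
        self.traces = []   # eligibility trace per center

    def value(self, x):
        return sum(a * gaussian_kernel(c, x, self.sigma)
                   for c, a in zip(self.centers, self.alphas))

    def update(self, x, reward, x_next, terminal=False):
        # TD error: delta = r + gamma * V(x') - V(x)
        target = reward if terminal else reward + self.gamma * self.value(x_next)
        delta = target - self.value(x)

        # Decay all existing eligibility traces by gamma * lambda.
        self.traces = [self.gamma * self.lam * e for e in self.traces]

        # Quantization: merge into the nearest existing center when it
        # lies within quant_eps; otherwise grow the dictionary.
        merged = False
        if self.centers:
            dists = [np.linalg.norm(c - np.asarray(x, dtype=float))
                     for c in self.centers]
            j = int(np.argmin(dists))
            if dists[j] <= self.quant_eps:
                self.traces[j] += 1.0
                merged = True
        if not merged:
            self.centers.append(np.asarray(x, dtype=float))
            self.alphas.append(0.0)
            self.traces.append(1.0)

        # Stochastic-gradient step along the eligibility traces.
        for i, e in enumerate(self.traces):
            self.alphas[i] += self.eta * delta * e
        return delta
```

In use, one would call `update(s, r, s_next)` once per observed transition, e.g. while an agent walks a gridworld; setting `lam=0` recovers the kernel TD(0) variant the abstract applies to the neural decoding task, and raising `quant_eps` trades accuracy for a smaller filter size.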

Published In

IEEE International Workshop on Machine Learning for Signal Processing

DOI

10.1109/MLSP.2011.6064634

ISBN

9781457716232

Publication Date

December 5, 2011
 

Citation

APA: Bae, J., Giraldo, L. S., Chhatbar, P., Francis, J., Sanchez, J., & Principe, J. (2011). Stochastic kernel temporal difference for reinforcement learning. In IEEE International Workshop on Machine Learning for Signal Processing. https://doi.org/10.1109/MLSP.2011.6064634

Chicago: Bae, J., L. S. Giraldo, P. Chhatbar, J. Francis, J. Sanchez, and J. Principe. “Stochastic kernel temporal difference for reinforcement learning.” In IEEE International Workshop on Machine Learning for Signal Processing, 2011. https://doi.org/10.1109/MLSP.2011.6064634.

ICMJE: Bae J, Giraldo LS, Chhatbar P, Francis J, Sanchez J, Principe J. Stochastic kernel temporal difference for reinforcement learning. In: IEEE International Workshop on Machine Learning for Signal Processing. 2011.

MLA: Bae, J., et al. “Stochastic kernel temporal difference for reinforcement learning.” IEEE International Workshop on Machine Learning for Signal Processing, 2011. Scopus, doi:10.1109/MLSP.2011.6064634.

NLM: Bae J, Giraldo LS, Chhatbar P, Francis J, Sanchez J, Principe J. Stochastic kernel temporal difference for reinforcement learning. IEEE International Workshop on Machine Learning for Signal Processing. 2011.
