Reinforcement learning via kernel temporal difference.
Conference Paper
This paper introduces a kernel adaptive filter, kernel Temporal Difference (TD)(λ), implemented with stochastic gradient updates on temporal differences to estimate the state-action value function in reinforcement learning; the case λ = 0 is studied here. Experimental results show the method's applicability to learning motor state decoding during a center-out reaching task performed by a monkey. The results are compared against a time-delay neural network (TDNN) trained with backpropagation of the temporal difference error. The experiments show that kernel TD(0) converges faster and reaches a better solution than the neural network.
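The abstract describes representing the state-action value function as a kernel expansion over visited states and updating it with a stochastic gradient step on the TD error. Below is a minimal sketch of a kernel TD(0) update in Python; the Gaussian kernel, the names (KernelTD0, eta, gamma, sigma), the toy usage loop, and the ever-growing dictionary without sparsification are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def gaussian_kernel(x, c, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(c)) ** 2) / (2.0 * sigma ** 2))

class KernelTD0:
    """Kernel TD(0): value function as a kernel expansion over stored centers,
    updated by a stochastic gradient step on the temporal difference error."""

    def __init__(self, eta=0.1, gamma=0.9, sigma=1.0):
        self.eta = eta        # learning rate
        self.gamma = gamma    # discount factor
        self.sigma = sigma    # kernel width
        self.centers = []     # stored state(-action) feature vectors
        self.alphas = []      # expansion coefficients

    def value(self, x):
        # Q(x) = sum_i alpha_i * k(x, c_i)
        return sum(a * gaussian_kernel(x, c, self.sigma)
                   for a, c in zip(self.alphas, self.centers))

    def update(self, x, reward, x_next, terminal=False):
        # TD(0) error: delta = r + gamma * Q(x') - Q(x)
        target = reward if terminal else reward + self.gamma * self.value(x_next)
        delta = target - self.value(x)
        # The stochastic gradient step in the kernel feature space adds x as a
        # new center with coefficient eta * delta.
        self.centers.append(np.asarray(x, dtype=float))
        self.alphas.append(self.eta * delta)
        return delta

# Toy usage on random transitions (illustrative only).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    agent = KernelTD0(eta=0.2, gamma=0.9, sigma=0.5)
    x = rng.normal(size=4)
    for t in range(50):
        x_next = rng.normal(size=4)
        reward = float(x_next[0] > 0)   # arbitrary reward signal for the demo
        agent.update(x, reward, x_next)
        x = x_next
    print("Q(x) estimate:", agent.value(x))
```

Note that each transition adds a new kernel center, so a practical implementation would typically bound the dictionary with a novelty or sparsification criterion; that bookkeeping is omitted here for brevity.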
Cited Authors
- Bae, J; Chhatbar, P; Francis, JT; Sanchez, JC; Principe, JC
Published Date
- 2011
Published In
- Annu Int Conf IEEE Eng Med Biol Soc
Volume / Issue
- 2011 /
Start / End Page
- 5662 - 5665
PubMed ID
- 22255624
Electronic International Standard Serial Number (EISSN)
- 2694-0604
Digital Object Identifier (DOI)
- 10.1109/IEMBS.2011.6091370
Conference Location
- United States