
A finite-time analysis of Q-Learning with neural network function approximation

Publication, Conference
Xu, P; Gu, Q
Published in: 37th International Conference on Machine Learning, ICML 2020
January 1, 2020

Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning remains virtually unknown. In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process, and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with an O(1/√T) convergence rate if the neural function approximator is sufficiently overparameterized, where T is the number of iterations. To the best of our knowledge, our result is the first finite-time analysis of neural Q-learning under a non-i.i.d. data assumption.
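For readers unfamiliar with the setting, the following is a minimal sketch of Q-learning with a ReLU network trained on a single Markovian trajectory, so consecutive samples are not i.i.d. It is not the algorithm analyzed in the paper. The two-layer network, one-hot state-action features, epsilon-greedy behaviour policy, 1/√(t+1) step-size decay, toy MDP, and all function names are assumptions made for illustration, and any algorithm-specific details from the paper are omitted.

```python
import numpy as np

def features(s, a, n_actions, dim):
    # One-hot encoding of the (state, action) pair (illustrative choice).
    x = np.zeros(dim)
    x[s * n_actions + a] = 1.0
    return x

def q_value(params, x):
    # Two-layer ReLU network: Q(x) = w2 . relu(W1 x).
    W1, w2 = params
    return w2 @ np.maximum(W1 @ x, 0.0)

def q_grad(params, x):
    # Gradient of Q(x) with respect to (W1, w2).
    W1, w2 = params
    pre = W1 @ x
    return np.outer(w2 * (pre > 0), x), np.maximum(pre, 0.0)

def neural_q_learning(P, R, gamma=0.9, T=2000, width=64, lr=0.02, eps=0.2, seed=0):
    # P[s, a] is the next-state distribution, R[s, a] the reward.
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    dim = n_states * n_actions
    W1 = rng.normal(0.0, np.sqrt(2.0 / dim), size=(width, dim))
    w2 = rng.normal(0.0, np.sqrt(1.0 / width), size=width)
    params = (W1, w2)
    s = 0
    for t in range(T):
        # Epsilon-greedy action from the current network.
        qs = [q_value(params, features(s, b, n_actions, dim)) for b in range(n_actions)]
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(qs))
        # The next state comes from the same trajectory (Markovian data).
        s_next = int(rng.choice(n_states, p=P[s, a]))
        x = features(s, a, n_actions, dim)
        q_next = max(q_value(params, features(s_next, b, n_actions, dim))
                     for b in range(n_actions))
        # Semi-gradient step: the bootstrapped target is treated as a constant.
        td_error = q_value(params, x) - (R[s, a] + gamma * q_next)
        gW1, gw2 = q_grad(params, x)
        step = lr / np.sqrt(t + 1)
        params = (params[0] - step * td_error * gW1,
                  params[1] - step * td_error * gw2)
        s = s_next
    return params

# Toy 2-state, 2-action MDP with made-up numbers, for illustration only.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.9, 0.1], [0.1, 0.9]]])
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
params = neural_q_learning(P, R)
```

The sketch only shows the shape of the update loop; it makes no claim about reproducing the convergence rate stated in the abstract.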


Published In

37th International Conference on Machine Learning, ICML 2020

Publication Date

January 1, 2020

Volume

PartF168147-14

Start / End Page

10486 / 10496
 

Citation

APA: Xu, P., & Gu, Q. (2020). A finite-time analysis of Q-Learning with neural network function approximation. In 37th International Conference on Machine Learning, ICML 2020 (Vol. PartF168147-14, pp. 10486–10496).
Chicago: Xu, P., and Q. Gu. “A finite-time analysis of Q-Learning with neural network function approximation.” In 37th International Conference on Machine Learning, ICML 2020, PartF168147-14:10486–96, 2020.
ICMJE: Xu P, Gu Q. A finite-time analysis of Q-Learning with neural network function approximation. In: 37th International Conference on Machine Learning, ICML 2020. 2020. p. 10486–96.
MLA: Xu, P., and Q. Gu. “A finite-time analysis of Q-Learning with neural network function approximation.” 37th International Conference on Machine Learning, ICML 2020, vol. PartF168147-14, 2020, pp. 10486–96.
NLM: Xu P, Gu Q. A finite-time analysis of Q-Learning with neural network function approximation. 37th International Conference on Machine Learning, ICML 2020. 2020. p. 10486–10496.
