Skip to main content

SAMPLE EFFICIENT POLICY GRADIENT METHODS WITH RECURSIVE VARIANCE REDUCTION

Publication ,  Conference
Xu, P; Gao, F; Gu, Q
Published in: 8th International Conference on Learning Representations, ICLR 2020
January 1, 2020

Improving the sample efficiency in reinforcement learning has been a longstanding research problem. In this work, we aim to reduce the sample complexity of existing policy gradient methods. We propose a novel policy gradient algorithm called SRVR-PG, which only requires O(1/∊3/2)1 episodes to find an ∊- approximate stationary point of the nonconcave performance function J(θ) (i.e., θ such that ∥∇J(θ)∥22 ≤ ∊). This sample complexity improves the existing result O(1/∊5/3) for stochastic variance reduced policy gradient algorithms by a factor of O(1/∊1/6). In addition, we also propose a variant of SRVR-PG with parameter exploration, which explores the initial policy parameter from a prior probability distribution. We conduct numerical experiments on classic control problems in reinforcement learning to validate the performance of our proposed algorithms.

Duke Scholars

Published In

8th International Conference on Learning Representations, ICLR 2020

Publication Date

January 1, 2020
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Xu, P., Gao, F., & Gu, Q. (2020). SAMPLE EFFICIENT POLICY GRADIENT METHODS WITH RECURSIVE VARIANCE REDUCTION. In 8th International Conference on Learning Representations, ICLR 2020.
Xu, P., F. Gao, and Q. Gu. “SAMPLE EFFICIENT POLICY GRADIENT METHODS WITH RECURSIVE VARIANCE REDUCTION.” In 8th International Conference on Learning Representations, ICLR 2020, 2020.
Xu P, Gao F, Gu Q. SAMPLE EFFICIENT POLICY GRADIENT METHODS WITH RECURSIVE VARIANCE REDUCTION. In: 8th International Conference on Learning Representations, ICLR 2020. 2020.
Xu, P., et al. “SAMPLE EFFICIENT POLICY GRADIENT METHODS WITH RECURSIVE VARIANCE REDUCTION.” 8th International Conference on Learning Representations, ICLR 2020, 2020.
Xu P, Gao F, Gu Q. SAMPLE EFFICIENT POLICY GRADIENT METHODS WITH RECURSIVE VARIANCE REDUCTION. 8th International Conference on Learning Representations, ICLR 2020. 2020.

Published In

8th International Conference on Learning Representations, ICLR 2020

Publication Date

January 1, 2020