Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator

Publication, Conference
Fazel, M; Ge, R; Kakade, SM; Mesbahi, M
Published in: 35th International Conference on Machine Learning, ICML 2018
January 1, 2018

Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model, 2) they are an "end-to-end" approach, directly optimizing the performance metric of interest, 3) they inherently allow for richly parameterized policies. A notable drawback is that even in the most basic continuous control problem (that of linear quadratic regulators), these methods must solve a non-convex optimization problem, where little is understood about their efficiency from both computational and statistical perspectives. In contrast, system identification and model-based planning in optimal control theory have a much more solid theoretical footing, where much is known with regards to their computational and statistical properties. This work bridges this gap, showing that (model-free) policy gradient methods globally converge to the optimal solution and are efficient (polynomially so in relevant problem-dependent quantities) with regards to their sample and computational complexities.
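As a brief sketch of the setting the abstract refers to (standard LQR formulation; the symbols A, B, Q, R, K, and the step size η are conventional notation assumed here, not quoted from the paper): the state evolves linearly, the cost is quadratic, the policy class is linear state feedback, and policy gradient performs gradient descent on the resulting non-convex cost C(K).

```latex
% Standard LQR setting consistent with the abstract (notation assumed, not quoted):
\begin{aligned}
&\text{dynamics:}        && x_{t+1} = A x_t + B u_t, \\
&\text{policy class:}    && u_t = -K x_t, \\
&\text{cost:}            && C(K) = \mathbb{E}_{x_0}\!\left[\sum_{t=0}^{\infty}
                              \left( x_t^\top Q x_t + u_t^\top R u_t \right)\right], \\
&\text{policy gradient:} && K_{n+1} = K_n - \eta\, \nabla C(K_n).
\end{aligned}
```

Although C(K) is non-convex in K, the paper's result is that this gradient iteration (including model-free variants that estimate ∇C(K) from rollouts) converges globally to the optimal gain, with polynomial sample and computational complexity.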

Published In

35th International Conference on Machine Learning, ICML 2018

ISBN

9781510867963

Publication Date

January 1, 2018

Volume

4

Start / End Page

2385 / 2413
 

Citation

APA: Fazel, M., Ge, R., Kakade, S. M., & Mesbahi, M. (2018). Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator. In 35th International Conference on Machine Learning, ICML 2018 (Vol. 4, pp. 2385–2413).
Chicago: Fazel, M., R. Ge, S. M. Kakade, and M. Mesbahi. “Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator.” In 35th International Conference on Machine Learning, ICML 2018, 4:2385–2413, 2018.
ICMJE: Fazel M, Ge R, Kakade SM, Mesbahi M. Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator. In: 35th International Conference on Machine Learning, ICML 2018. 2018. p. 2385–413.
MLA: Fazel, M., et al. “Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator.” 35th International Conference on Machine Learning, ICML 2018, vol. 4, 2018, pp. 2385–413.
NLM: Fazel M, Ge R, Kakade SM, Mesbahi M. Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator. 35th International Conference on Machine Learning, ICML 2018. 2018. p. 2385–2413.
