On incomplete learning and certainty-equivalence control

Published

Journal Article

© 2018, INFORMS. We consider a dynamic learning problem in which a decision maker sequentially selects a control and observes a response variable that depends on the chosen control and an unknown sensitivity parameter. After every observation, the decision maker updates his or her estimate of the unknown parameter and uses a certainty-equivalence decision rule to determine subsequent controls based on this estimate. We show that under this certainty-equivalence learning policy the parameter estimates converge with positive probability to an uninformative fixed point that can differ from the true value of the unknown parameter, a phenomenon we refer to as incomplete learning. In stark contrast, we show that this certainty-equivalence policy may avoid incomplete learning if the parameter value of interest "drifts away" from the uninformative fixed point at a critical rate. Finally, we prove that one can adaptively limit the learning memory to improve the accuracy of the certainty-equivalence policy in both static (estimation) and slowly varying (tracking) environments, without relying on forced exploration.
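
To make the notion of an uninformative fixed point concrete, the following is a minimal sketch of a certainty-equivalence loop. It is not the model or the policy analyzed in the paper. The linear response y = theta * x + noise, the control rule `ce_control` (which returns zero when the estimate equals a hypothetical point `theta0`), and the names `simulate`, `theta_init`, and `noise_sd` are all assumptions chosen for illustration. Under these assumptions a zero control yields a response that does not depend on theta, so an estimate sitting at `theta0` can never be corrected.

```python
import random


def ce_control(theta_hat, theta0=0.0):
    # Hypothetical certainty-equivalence rule (an assumption, not the paper's):
    # the control is zero exactly when the estimate equals theta0.
    return theta_hat - theta0


def simulate(theta_true, theta_init, steps, noise_sd=1.0, theta0=0.0, seed=0):
    """Run the loop and return the path of least-squares estimates.

    Assumed response model: y = theta_true * x + Gaussian noise.
    """
    rng = random.Random(seed)
    sxx = 0.0  # running sum of x * x
    sxy = 0.0  # running sum of x * y
    theta_hat = theta_init
    path = [theta_hat]
    for _ in range(steps):
        # Choose the control as if the current estimate were the truth.
        x = ce_control(theta_hat, theta0)
        y = theta_true * x + rng.gauss(0.0, noise_sd)
        sxx += x * x
        sxy += x * y
        # Least-squares update; undefined while no information has accrued.
        if sxx > 0.0:
            theta_hat = sxy / sxx
        path.append(theta_hat)
    return path


# Starting exactly at theta0, every control is zero and the estimate is stuck,
# even though the true parameter is different.
stuck = simulate(theta_true=2.0, theta_init=0.0, steps=10)

# Starting away from theta0 with no noise, the estimate moves to the truth.
learned = simulate(theta_true=2.0, theta_init=1.0, steps=5, noise_sd=0.0)
```

In this toy setting `stuck` never leaves `theta0`, while `learned` reaches the true value after one observation. The sketch only shows why such a point carries no information. It does not reproduce the positive-probability convergence result or the adaptive memory scheme described above.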

Cited Authors

  • Keskin, NB; Zeevi, A

Published Date

  • July 1, 2018

Published In

  • Operations Research

Volume / Issue

  • 66 / 4

Start / End Page

  • 1136 - 1167

Electronic International Standard Serial Number (EISSN)

  • 1526-5463

International Standard Serial Number (ISSN)

  • 0030-364X

Digital Object Identifier (DOI)

  • 10.1287/opre.2017.1713

Citation Source

  • Scopus