Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions.

Journal Article

A dynamic treatment regime is a list of sequential decision rules for assigning treatment based on a patient's history. Q- and A-learning are two main approaches for estimating the optimal regime, i.e., that yielding the most beneficial outcome in the patient population, using data from a clinical trial or observational study. Q-learning requires postulated regression models for the outcome, while A-learning involves models for that part of the outcome regression representing treatment contrasts and for treatment assignment. We propose an alternative to Q- and A-learning that maximizes a doubly robust augmented inverse probability weighted estimator for population mean outcome over a restricted class of regimes. Simulations demonstrate the method's performance and robustness to model misspecification, which is a key concern.
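To make the estimator described in the abstract concrete, the following is a minimal sketch for a single decision point only, not the sequential procedure of the article. It computes an augmented inverse probability weighted (AIPW) estimate of the mean outcome under a threshold regime "treat if x > eta" and picks eta by grid search. The simulated data, the known randomization probability of 0.5, the linear outcome model, the threshold class of regimes, and all function and variable names are assumptions made for illustration.

```python
import numpy as np


def aipw_value(y, a, d, propensity, mu1, mu0):
    """AIPW estimate of the mean outcome if everyone followed regime d.

    propensity is P(A = 1 | X); mu1 and mu0 are fitted outcome-model
    predictions under treatment 1 and 0 (all hypothetical inputs).
    """
    consistent = (a == d).astype(float)  # observed treatment agrees with d
    pi_d = np.where(d == 1, propensity, 1.0 - propensity)
    m_d = np.where(d == 1, mu1, mu0)
    # Inverse-weighted outcome plus an augmentation term from the outcome model.
    return np.mean(consistent * y / pi_d - (consistent - pi_d) / pi_d * m_d)


def fit_outcome_model(x, a, y):
    """Least-squares fit of y on (1, x, a, a*x); returns predictions for a=1 and a=0."""
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    mu0 = beta[0] + beta[1] * x
    mu1 = mu0 + beta[2] + beta[3] * x
    return mu1, mu0


# Simulated randomized trial (assumed data-generating model, for illustration).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)
y = 1.0 + 0.5 * x + a * (x - 0.3) + rng.normal(size=n)

propensity = np.full(n, 0.5)  # known by design in this simulated trial
mu1, mu0 = fit_outcome_model(x, a, y)

# Restricted class of regimes: d_eta(x) = 1 if x > eta, searched over a grid.
etas = np.linspace(-2.0, 2.0, 81)
values = np.array([
    aipw_value(y, a, (x > eta).astype(int), propensity, mu1, mu0)
    for eta in etas
])
best_eta = etas[np.argmax(values)]
best_value = values.max()
```

In this simulated setup the treatment effect is x - 0.3, so the best threshold regime treats when x exceeds 0.3, and the grid search should land in that neighborhood. The estimate is called doubly robust because it remains consistent if either the propensity model or the outcome model is correctly specified.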

Cited Authors

  • Zhang, B; Tsiatis, AA; Laber, EB; Davidian, M

Published Date

  • 2013

Published In

  • Biometrika

Volume / Issue

  • 100 / 3

PubMed ID

  • 24302771

PubMed Central ID

  • PMC3843953

International Standard Serial Number (ISSN)

  • 0006-3444

Digital Object Identifier (DOI)

  • 10.1093/biomet/ast014


Language

  • eng

Conference Location

  • England