
Adjusting for selection bias due to missing data in electronic health records-based research.

Publication, Journal Article
Peskoe, SB; Arterburn, D; Coleman, KJ; Herrinton, LJ; Daniels, MJ; Haneuse, S
Published in: Stat Methods Med Res
October 2021

While electronic health records data provide unique opportunities for research, numerous methodological issues must be considered. Among these, selection bias due to incomplete/missing data has received far less attention than other issues. Unfortunately, standard missing data approaches (e.g. inverse-probability weighting and multiple imputation) generally fail to acknowledge the complex interplay of heterogeneous decisions made by patients, providers, and health systems that govern whether specific data elements in the electronic health records are observed. This, in turn, renders the missing-at-random assumption difficult to believe in standard approaches. In the clinical literature, the collection of decisions that gives rise to the observed data is referred to as the data provenance. Building on a recently proposed framework for modularizing the data provenance, we develop a general and scalable framework for estimation and inference with respect to regression models based on inverse-probability weighting that allows for a hierarchy of missingness mechanisms to better align with the complex nature of electronic health records data. We show that the proposed estimator is consistent and asymptotically Normal, derive the form of the asymptotic variance, and propose two consistent estimators. Simulations show that naïve application of standard methods may yield biased point estimates, that the proposed estimators have good small-sample properties, and that researchers may have to contend with a bias-variance trade-off as they consider how to handle missing data. The proposed methods are motivated by an ongoing, electronic health records-based study of bariatric surgery.
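To make the idea of weighting under a hierarchy of missingness mechanisms concrete, the following is a minimal sketch on simulated data. It is not the authors' estimator or code. The two-stage mechanism (a follow-up visit occurs, then the outcome is recorded at that visit), the variable names, and all coefficients are assumptions chosen for illustration. Each stage gets its own logistic model, the stage probabilities are multiplied, and complete cases are weighted by the inverse of the product.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(D, r, n_iter=25):
    """Logistic regression by Newton-Raphson; D must include an intercept column."""
    g = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = expit(D @ g)
        grad = D.T @ (r - p)
        hess = D.T @ (D * (p * (1.0 - p))[:, None])
        g = g + np.linalg.solve(hess, grad)
    return g

def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)   # covariate in the regression model of interest
z = rng.normal(size=n)   # auxiliary variable, fully observed (assumption)
y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)  # hypothetical outcome

# Hypothetical two-stage provenance:
#   stage 1: the patient has a follow-up visit
#   stage 2: the outcome is recorded, possible only if stage 1 occurred
r1 = rng.binomial(1, expit(0.5 + 0.3 * x + 0.8 * z))
r2 = rng.binomial(1, expit(0.0 - 0.3 * x + 0.8 * z)) * r1

# One model per stage; stage 2 is fitted only among those who passed stage 1.
D = np.column_stack([np.ones(n), x, z])
g1 = fit_logistic(D, r1)
passed = r1 == 1
g2 = fit_logistic(D[passed], r2[passed])

# Probability of being a complete case = product of the stage probabilities.
pi = expit(D @ g1) * expit(D @ g2)
cc = r2 == 1
w = 1.0 / pi[cc]

# Target: regression of y on x alone (true intercept 1, true slope 2).
X = np.column_stack([np.ones(n), x])
beta_cc = wls(X[cc], y[cc], np.ones(cc.sum()))  # unweighted complete-case fit
beta_ipw = wls(X[cc], y[cc], w)                 # inverse-probability weighted fit

print("complete-case:", beta_cc)
print("IPW:          ", beta_ipw)
```

In this simulated setup, selection depends on z, which also drives the outcome, so the unweighted complete-case intercept is pulled away from its target while the weighted fit is not. The sketch assumes both stage models are correctly specified and that x and z are observed for everyone. It does not address variance estimation, which the abstract treats separately.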


Published In

Stat Methods Med Res

DOI

10.1177/09622802211027601

EISSN

1477-0334

Publication Date

October 2021

Volume

30

Issue

10

Start / End Page

2221 / 2238

Location

England

Related Subject Headings

  • Statistics & Probability
  • Selection Bias
  • Probability
  • Humans
  • Electronic Health Records
  • Bias
  • 4905 Statistics
  • 4202 Epidemiology
  • 1117 Public Health and Health Services
  • 0104 Statistics
 

Citation

APA

Peskoe, S. B., Arterburn, D., Coleman, K. J., Herrinton, L. J., Daniels, M. J., & Haneuse, S. (2021). Adjusting for selection bias due to missing data in electronic health records-based research. Stat Methods Med Res, 30(10), 2221–2238. https://doi.org/10.1177/09622802211027601

Chicago

Peskoe, Sarah B., David Arterburn, Karen J. Coleman, Lisa J. Herrinton, Michael J. Daniels, and Sebastien Haneuse. “Adjusting for selection bias due to missing data in electronic health records-based research.” Stat Methods Med Res 30, no. 10 (October 2021): 2221–38. https://doi.org/10.1177/09622802211027601.

ICMJE

Peskoe SB, Arterburn D, Coleman KJ, Herrinton LJ, Daniels MJ, Haneuse S. Adjusting for selection bias due to missing data in electronic health records-based research. Stat Methods Med Res. 2021 Oct;30(10):2221–38.

MLA

Peskoe, Sarah B., et al. “Adjusting for selection bias due to missing data in electronic health records-based research.” Stat Methods Med Res, vol. 30, no. 10, Oct. 2021, pp. 2221–38. Pubmed, doi:10.1177/09622802211027601.

NLM

Peskoe SB, Arterburn D, Coleman KJ, Herrinton LJ, Daniels MJ, Haneuse S. Adjusting for selection bias due to missing data in electronic health records-based research. Stat Methods Med Res. 2021 Oct;30(10):2221–2238.