A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records.

Publication, Journal Article
Weberpals, J; Raman, SR; Shaw, PA; Lee, H; Russo, M; Hammill, BG; Toh, S; Connolly, JG; Dandreo, KJ; Tian, F; Liu, W; Li, J; Glynn, RJ ...
Published in: Clin Epidemiol
2024

OBJECTIVE: Partially observed confounder data pose challenges to the statistical analysis of electronic health records (EHR), and systematic assessments of potentially underlying missingness mechanisms are lacking. We aimed to provide a principled approach to empirically characterize missing data processes and investigate performance of analytic methods.

METHODS: Three empirical sub-cohorts of diabetic SGLT2 or DPP4-inhibitor initiators with complete information on HbA1c, BMI and smoking as confounders of interest (COI) formed the basis of data simulation under a plasmode framework. A true null treatment effect, including the COI in the outcome generation model, and four missingness mechanisms for the COI were simulated: completely at random (MCAR), at random (MAR), and two not at random (MNAR) mechanisms, where missingness was dependent on an unmeasured confounder and on the value of the COI itself. We evaluated the ability of three groups of diagnostics to differentiate between mechanisms: 1) differences in characteristics between patients with or without the observed COI (using averaged standardized mean differences [ASMD]), 2) predictive ability of the missingness indicator based on observed covariates, and 3) association of the missingness indicator with the outcome. We then compared analytic methods including "complete case", inverse probability weighting, single and multiple imputation in their ability to recover true treatment effects.

RESULTS: The diagnostics successfully identified characteristic patterns of simulated missingness mechanisms. For MAR, but not MCAR, the patient characteristics showed substantial differences (median ASMD 0.20 vs 0.05) and consequently, discrimination of the prediction models for missingness was also higher (0.59 vs 0.50). For MNAR, but not MAR or MCAR, missingness was significantly associated with the outcome even in models adjusting for other observed covariates. Comparing analytic methods, multiple imputation using a random forest algorithm resulted in the lowest root-mean-squared-error.

CONCLUSION: Principled diagnostics provided reliable insights into missingness mechanisms. When assumptions allow, multiple imputation with nonparametric models could help reduce bias.
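The first diagnostic in the abstract (ASMD between patients with and without the observed COI) can be illustrated with a toy simulation. The sketch below is not the authors' code or data: the covariates `age` and `sbp`, their distributions, the missingness coefficients, and the function names `simulate`, `smd`, and `asmd` are all hypothetical and chosen only to show why ASMD tends to be larger under MAR than under MCAR.

```python
import math
import random
import statistics


def simulate(mechanism, n=5000, seed=2024):
    """Return (age, sbp, missing) rows. All distributions and coefficients are made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.gauss(60, 10)    # hypothetical observed covariate
        sbp = rng.gauss(130, 15)   # hypothetical observed covariate
        if mechanism == "MCAR":
            p_missing = 0.3        # missingness ignores every covariate
        elif mechanism == "MAR":
            # missingness depends only on observed covariates
            logit = -0.85 + 0.05 * (age - 60) + 0.03 * (sbp - 130)
            p_missing = 1 / (1 + math.exp(-logit))
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        rows.append((age, sbp, rng.random() < p_missing))
    return rows


def smd(group_a, group_b):
    """Absolute standardized mean difference using the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    )
    return abs(statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def asmd(mechanism):
    """Average the SMDs of both covariates between COI-observed and COI-missing patients."""
    rows = simulate(mechanism)
    smds = []
    for col in (0, 1):
        observed = [r[col] for r in rows if not r[2]]
        missing = [r[col] for r in rows if r[2]]
        smds.append(smd(observed, missing))
    return statistics.mean(smds)


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR"):
        print(mechanism, round(asmd(mechanism), 3))
```

Under these assumptions the MCAR value should sit near zero while the MAR value is clearly larger, which mirrors the direction (not the magnitude) of the contrast reported in the results. Distinguishing MNAR would additionally require the outcome-association diagnostic, which this sketch does not cover.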

Published In

Clin Epidemiol

DOI

10.2147/CLEP.S436131

ISSN

1179-1349

Publication Date

2024

Volume

16

Start / End Page

329 / 343

Location

New Zealand

Related Subject Headings

  • 4206 Public health
  • 4202 Epidemiology
  • 1117 Public Health and Health Services
  • 1103 Clinical Sciences
 

Citation

APA: Weberpals, J., Raman, S. R., Shaw, P. A., Lee, H., Russo, M., Hammill, B. G., … Desai, R. J. (2024). A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records. Clin Epidemiol, 16, 329–343. https://doi.org/10.2147/CLEP.S436131

Chicago: Weberpals, Janick, Sudha R. Raman, Pamela A. Shaw, Hana Lee, Massimiliano Russo, Bradley G. Hammill, Sengwee Toh, et al. “A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records.” Clin Epidemiol 16 (2024): 329–43. https://doi.org/10.2147/CLEP.S436131.

ICMJE: Weberpals J, Raman SR, Shaw PA, Lee H, Russo M, Hammill BG, et al. A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records. Clin Epidemiol. 2024;16:329–43.

MLA: Weberpals, Janick, et al. “A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records.” Clin Epidemiol, vol. 16, 2024, pp. 329–43. PubMed, doi:10.2147/CLEP.S436131.

NLM: Weberpals J, Raman SR, Shaw PA, Lee H, Russo M, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records. Clin Epidemiol. 2024;16:329–343.
