Analyzing missingness patterns in real-world data using the SMDI toolkit: application to a linked EHR-claims pharmacoepidemiology study.

Publication, Journal Article
Raman, SR; Hammill, BG; Shaw, PA; Lee, H; Toh, S; Connolly, JG; Dandreo, KJ; Nalawade, V; Tian, F; Liu, W; Li, J; Hernández-Muñoz, JJ ...
Published in: BMC Med Res Methodol
October 19, 2024

BACKGROUND: Missing data in confounding variables present a frequent challenge in generating evidence using real-world data, including electronic health records (EHR). Our objective was to apply a recently published toolkit for characterizing missing data patterns and, based on the toolkit's results about likely missingness mechanisms, to illustrate the decision-making process for analyses in an empirical case example.

METHODS: We used the Structural Missing Data Investigations (SMDI) toolkit to characterize missing data patterns in a pharmacoepidemiology study comparing cardiovascular outcomes of initiating sodium-glucose cotransporter-2 inhibitors (SGLT2i) and dipeptidyl peptidase-4 inhibitors (DPP-4i) among older adults. The study used a linked EHR-Medicare claims dataset from Duke Health patients (2015-2017), focusing on partially observed confounders from EHR data (HbA1c lab and body mass index [BMI] values). Our analysis incorporated SMDI's descriptive functions and diagnostic tests to explore missingness patterns and determine missingness mitigation approaches. We used findings from these investigations to inform estimation of adjusted hazard ratios comparing the two classes of medications.

RESULTS: High levels of missingness were noted for important confounding variables, including HbA1c (63.6%) and BMI (16.5%). The diagnostic tests described: 1) the distributions of patient characteristics, exposure, and outcome between patients with and without an observed value of the partially observed covariate, 2) the ability to predict missingness based on observed covariates, and 3) whether the missingness of a partially observed covariate is differential with respect to the outcome. There was evidence that missingness could be sufficiently described using observed data, which allowed multiple imputation by chained equations using random forests to address missing confounder data in estimating treatment effects. Multiple imputation resulted in improved alignment of effect estimates with previous studies.

CONCLUSIONS: We were able to demonstrate the practical application of the SMDI toolkit in a real-world setting. Application of the SMDI toolkit and the resulting insights into potential missingness patterns can inform the choice of appropriate analytic methods and increase the transparency of research methods in handling missing data. This type of approach can inform analytic decision making and may increase our ability to generate evidence from real-world data.
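The first diagnostic described in the abstract, comparing patient characteristics between patients with and without an observed value of a partially observed covariate, can be illustrated with a minimal Python sketch. It is an illustration only: it does not use or reproduce the SMDI toolkit's functions, the data are simulated rather than taken from the study, and every name in it (`proportion_missing`, `abs_standardized_mean_difference`, `age`, `hba1c`, `noise`) is hypothetical. The pooled-standard-deviation form of the standardized mean difference is also an assumption made for this sketch.

```python
import math
import random
from statistics import mean, variance


def proportion_missing(values):
    """Share of entries that are None (None stands in for a missing value)."""
    return sum(v is None for v in values) / len(values)


def abs_standardized_mean_difference(covariate, partially_observed):
    """Absolute standardized mean difference of `covariate` between rows
    where `partially_observed` is present and rows where it is missing."""
    observed = [c for c, p in zip(covariate, partially_observed) if p is not None]
    missing = [c for c, p in zip(covariate, partially_observed) if p is None]
    pooled_sd = math.sqrt((variance(observed) + variance(missing)) / 2)
    return abs(mean(observed) - mean(missing)) / pooled_sd


# Simulated toy data (not study data): HbA1c is made more likely to be
# missing for older patients, so missingness depends on an observed covariate.
rng = random.Random(0)
age = [rng.gauss(72, 6) for _ in range(2000)]
hba1c = [
    None if rng.random() < (0.8 if a > 72 else 0.4) else rng.gauss(7.5, 1.2)
    for a in age
]
# A covariate unrelated to missingness, for contrast.
noise = [rng.gauss(0, 1) for _ in age]

prop_missing_hba1c = proportion_missing(hba1c)
asmd_age = abs_standardized_mean_difference(age, hba1c)
asmd_noise = abs_standardized_mean_difference(noise, hba1c)

print(f"HbA1c missing: {prop_missing_hba1c:.1%}")
print(f"ASMD for age by HbA1c missingness: {asmd_age:.2f}")
print(f"ASMD for unrelated covariate: {asmd_noise:.2f}")
```

In this toy setup, a large difference for `age` and a small one for `noise` would suggest that missingness is related to an observed covariate. That is the kind of evidence the abstract describes as supporting an imputation-based analysis.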

Published In

BMC Med Res Methodol

DOI

10.1186/s12874-024-02330-2

EISSN

1471-2288

Publication Date

October 19, 2024

Volume

24

Issue

1

Start / End Page

246

Location

England

Related Subject Headings

  • United States
  • Sodium-Glucose Transporter 2 Inhibitors
  • Pharmacoepidemiology
  • Medicare
  • Male
  • Humans
  • Glycated Hemoglobin
  • General & Internal Medicine
  • Female
  • Electronic Health Records
 

Citation

APA: Raman, S. R., Hammill, B. G., Shaw, P. A., Lee, H., Toh, S., Connolly, J. G., … Weberpals, J. (2024). Analyzing missingness patterns in real-world data using the SMDI toolkit: application to a linked EHR-claims pharmacoepidemiology study. BMC Med Res Methodol, 24(1), 246. https://doi.org/10.1186/s12874-024-02330-2

Chicago: Raman, Sudha R., Bradley G. Hammill, Pamela A. Shaw, Hana Lee, Sengwee Toh, John G. Connolly, Kimberly J. Dandreo, et al. “Analyzing missingness patterns in real-world data using the SMDI toolkit: application to a linked EHR-claims pharmacoepidemiology study.” BMC Med Res Methodol 24, no. 1 (October 19, 2024): 246. https://doi.org/10.1186/s12874-024-02330-2.

ICMJE: Raman SR, Hammill BG, Shaw PA, Lee H, Toh S, Connolly JG, et al. Analyzing missingness patterns in real-world data using the SMDI toolkit: application to a linked EHR-claims pharmacoepidemiology study. BMC Med Res Methodol. 2024 Oct 19;24(1):246.

MLA: Raman, Sudha R., et al. “Analyzing missingness patterns in real-world data using the SMDI toolkit: application to a linked EHR-claims pharmacoepidemiology study.” BMC Med Res Methodol, vol. 24, no. 1, Oct. 2024, p. 246. PubMed, doi:10.1186/s12874-024-02330-2.

NLM: Raman SR, Hammill BG, Shaw PA, Lee H, Toh S, Connolly JG, Dandreo KJ, Nalawade V, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ, Weberpals J. Analyzing missingness patterns in real-world data using the SMDI toolkit: application to a linked EHR-claims pharmacoepidemiology study. BMC Med Res Methodol. 2024 Oct 19;24(1):246.