A machine learning model for predicting congenital heart defects from administrative data.

Publication, Journal Article
Shi, H; Book, W; Raskind-Hood, C; Downing, KF; Farr, SL; Bell, MN; Sameni, R; Rodriguez, FH; Kamaleswaran, R
Published in: Birth Defects Res
November 1, 2023

INTRODUCTION: International Classification of Diseases (ICD) codes recorded in administrative data are often used to identify congenital heart defects (CHD). However, these codes do not always correctly identify true positive (TP) CHD individuals. CHD surveillance could be strengthened by accurate CHD identification in administrative records using machine learning (ML) algorithms.

METHODS: To identify features relevant to accurate CHD identification, traditional ML models were applied to a validated dataset of 779 patients; encounter-level data, including ICD-9-CM and CPT codes, from 2011 to 2013 at four US sites were used. Five-fold cross-validation determined the overlapping important features that best predicted TP CHD individuals. Median values and 95% confidence intervals (CIs) of area under the receiver operating characteristic curve, positive predictive value (PPV), negative predictive value, sensitivity, specificity, and F1-score were compared across four ML models: Logistic Regression, Gaussian Naive Bayes, Random Forest, and eXtreme Gradient Boosting (XGBoost).

RESULTS: Baseline PPV was 76.5% from expert clinician validation of ICD-9-CM CHD-related codes. Feature selection for ML reduced 7138 features to the 10 that best predicted TP CHD cases. During training and testing, XGBoost performed best in median accuracy (F1-score) and PPV, at 0.84 (95% CI: 0.76, 0.91) and 0.94 (95% CI: 0.91, 0.96), respectively. When applied to the entire dataset, XGBoost yielded a median PPV of 0.94 (95% CI: 0.94, 0.95).

CONCLUSIONS: Applying ML algorithms improved the accuracy of identifying TP CHD cases compared with ICD codes alone. Using this technique to identify CHD cases would improve the generalizability of results obtained from large datasets to the CHD patient population, enhancing public health surveillance efforts.
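The evaluation metrics named in the abstract (PPV, negative predictive value, sensitivity, specificity, and F1-score) all derive from confusion-matrix counts, and a median across cross-validation folds summarizes them. The following is a minimal sketch of that arithmetic only, not the authors' pipeline: the function name `classification_metrics` and the per-fold counts are hypothetical and are not taken from the study.

```python
from statistics import median


def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each fold contains
    predicted and actual positives and negatives).
    """
    ppv = tp / (tp + fp)            # positive predictive value (precision)
    npv = tn / (tn + fn)            # negative predictive value
    sensitivity = tp / (tp + fn)    # recall / true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "ppv": ppv,
        "npv": npv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical (tp, fp, tn, fn) counts for five folds.
# These numbers are illustrative only and do NOT come from the paper.
fold_counts = [
    (90, 10, 40, 10),
    (80, 20, 30, 20),
    (95, 5, 45, 5),
    (85, 15, 35, 15),
    (90, 10, 38, 12),
]

per_fold = [classification_metrics(*counts) for counts in fold_counts]

# Median of each metric across the five folds.
median_metrics = {
    name: median(fold[name] for fold in per_fold)
    for name in per_fold[0]
}

for name, value in median_metrics.items():
    print(f"median {name}: {value:.3f}")
```

With real data, the per-fold counts would come from comparing each model's predictions against the clinician-validated labels in the held-out fold; the confidence intervals reported in the abstract would need an additional procedure that this sketch does not attempt.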

Duke Scholars

Published In

Birth Defects Res

DOI

10.1002/bdr2.2245

EISSN

2472-1727

Publication Date

November 1, 2023

Volume

115

Issue

18

Start / End Page

1693 / 1707

Location

United States

Related Subject Headings

  • Predictive Value of Tests
  • Machine Learning
  • Humans
  • Heart Defects, Congenital
  • Bayes Theorem
  • Algorithms
 

Citation

APA: Shi, H., Book, W., Raskind-Hood, C., Downing, K. F., Farr, S. L., Bell, M. N., … Kamaleswaran, R. (2023). A machine learning model for predicting congenital heart defects from administrative data. Birth Defects Res, 115(18), 1693–1707. https://doi.org/10.1002/bdr2.2245

Chicago: Shi, Haoming, Wendy Book, Cheryl Raskind-Hood, Karrie F. Downing, Sherry L. Farr, Mary N. Bell, Reza Sameni, Fred H. Rodriguez, and Rishikesan Kamaleswaran. “A machine learning model for predicting congenital heart defects from administrative data.” Birth Defects Res 115, no. 18 (November 1, 2023): 1693–1707. https://doi.org/10.1002/bdr2.2245.

ICMJE: Shi H, Book W, Raskind-Hood C, Downing KF, Farr SL, Bell MN, et al. A machine learning model for predicting congenital heart defects from administrative data. Birth Defects Res. 2023 Nov 1;115(18):1693–707.

MLA: Shi, Haoming, et al. “A machine learning model for predicting congenital heart defects from administrative data.” Birth Defects Res, vol. 115, no. 18, Nov. 2023, pp. 1693–707. Pubmed, doi:10.1002/bdr2.2245.

NLM: Shi H, Book W, Raskind-Hood C, Downing KF, Farr SL, Bell MN, Sameni R, Rodriguez FH, Kamaleswaran R. A machine learning model for predicting congenital heart defects from administrative data. Birth Defects Res. 2023 Nov 1;115(18):1693–1707.
