AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data.

Publication, Journal Article
Yuan, H; Xie, F; Ong, MEH; Ning, Y; Chee, ML; Saffari, SE; Abdullah, HR; Goldstein, BA; Chakraborty, B; Liu, N
Published in: J Biomed Inform
May 2022

BACKGROUND: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among various decision-making models to determine the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. However, its current framework still leaves room for improvement when addressing unbalanced data of rare events.

METHODS: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. Baseline techniques for performance comparison included the original AutoScore, full logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), full random forest, and random forest with a reduced number of variables. These models were evaluated based on their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches to predict inpatient mortality.

RESULTS: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839), while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.801). The AutoScore-Imbalance sub-model (using a down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. Furthermore, AutoScore-Imbalance obtained the highest balanced accuracy of 0.757 (0.702-0.805), compared to 0.698 (0.643-0.753) by the original AutoScore and the maximum of 0.720 (0.664-0.769) by other baseline models.

CONCLUSIONS: We have developed an interpretable tool to handle clinical data imbalance, presented its structure, and demonstrated its superiority over baselines. The AutoScore-Imbalance tool can be applied to highly unbalanced datasets to gain further insight into rare medical events and facilitate real-world clinical decision-making.
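
The abstract evaluates models by AUC and balanced accuracy (the mean of sensitivity and specificity) and mentions a down-sampling sub-model. The following is a minimal Python sketch of those three ideas, not the authors' implementation: the function names, the 1 = event / 0 = non-event label coding, and the choice of random down-sampling to equal class sizes are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (assumed 1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of events detected
    specificity = tn / (tn + fp)  # fraction of non-events correctly ruled out
    return (sensitivity + specificity) / 2


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random event outscores a random non-event.

    Ties count as half a win (the rank-based definition of the ROC area).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def downsample_majority(
    rows: Sequence, labels: Sequence[int], seed: int = 0
) -> Tuple[List, List[int]]:
    """Randomly drop non-events until both classes are the same size.

    Assumes events (label 1) are the rare class, as in rare-events data.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (hypothetical): 2 events among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.4, 0.5, 0.1, 0.2, 0.3]

ba = balanced_accuracy(y_true, y_pred)
area = auc(y_true, scores)
rows_bal, labels_bal = downsample_majority(list(range(10)), y_true)
```

On the toy data, plain accuracy would be 0.7 even though half of the events are missed. Balanced accuracy weights the two classes equally, which is why it is a common choice when the outcome is rare.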

Published In

J Biomed Inform

DOI

10.1016/j.jbi.2022.104072

EISSN

1532-0480

Publication Date

May 2022

Volume

129

Start / End Page

104072

Location

United States

Related Subject Headings

  • ROC Curve
  • Medical Informatics
  • Machine Learning
  • Logistic Models
  • Clinical Decision-Making
  • Biomedical Engineering
  • Algorithms
  • 4601 Applied computing
  • 4203 Health services and systems
  • 11 Medical and Health Sciences
 

Citation

APA: Yuan, H., Xie, F., Ong, M. E. H., Ning, Y., Chee, M. L., Saffari, S. E., … Liu, N. (2022). AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data. J Biomed Inform, 129, 104072. https://doi.org/10.1016/j.jbi.2022.104072

Chicago: Yuan, Han, Feng Xie, Marcus Eng Hock Ong, Yilin Ning, Marcel Lucas Chee, Seyed Ehsan Saffari, Hairil Rizal Abdullah, Benjamin Alan Goldstein, Bibhas Chakraborty, and Nan Liu. “AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data.” J Biomed Inform 129 (May 2022): 104072. https://doi.org/10.1016/j.jbi.2022.104072.

ICMJE: Yuan H, Xie F, Ong MEH, Ning Y, Chee ML, Saffari SE, et al. AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data. J Biomed Inform. 2022 May;129:104072.

MLA: Yuan, Han, et al. “AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data.” J Biomed Inform, vol. 129, May 2022, p. 104072. PubMed, doi:10.1016/j.jbi.2022.104072.

NLM: Yuan H, Xie F, Ong MEH, Ning Y, Chee ML, Saffari SE, Abdullah HR, Goldstein BA, Chakraborty B, Liu N. AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data. J Biomed Inform. 2022 May;129:104072.