Improving mixed-integer temporal modeling by generating synthetic data using conditional generative adversarial networks: A case study of fluid overload prediction in the intensive care unit.

Publication, Journal Article
Rafiei, A; Ghiasi Rad, M; Sikora, A; Kamaleswaran, R
Published in: Comput Biol Med
January 2024

OBJECTIVE: Mixed-integer temporal data, which is particularly prominent for medication use in the critically ill, limits the performance of predictive models. The purpose of this evaluation was to pilot test integrating synthetic data into an existing dataset of complex medication data to improve machine learning prediction of fluid overload.

MATERIALS AND METHODS: This retrospective cohort study evaluated patients admitted to an ICU for ≥ 72 h. Four machine learning algorithms to predict fluid overload after 48-72 h of ICU admission were developed using the original dataset. Then, two distinct synthetic data generation methodologies, synthetic minority over-sampling technique (SMOTE) and conditional tabular generative adversarial network (CTGAN), were used to create synthetic data. Finally, a stacking ensemble technique designed to train a meta-learner was established. Models were trained in three scenarios that varied the quality and quantity of the training data.

RESULTS: Training machine learning algorithms on the combined synthetic and original dataset overall increased the performance of the predictive models compared to training on the original dataset alone. The highest-performing model was the meta-model trained on the combined dataset, with an AUROC of 0.83; it also significantly improved sensitivity across the training scenarios.

DISCUSSION: This is the first application of synthetic data generation methods to ICU medication data. It offers a promising way to enhance the performance of machine learning models for fluid overload and may translate to other ICU outcomes. A meta-learner was able to make a trade-off between different performance metrics and improve the ability to identify the minority class.
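
To make the SMOTE step named in the methods more concrete, the following is a minimal sketch of SMOTE-style oversampling, not the authors' implementation. It assumes purely continuous features and uses only the Python standard library; the function name `smote_like_oversample`, its parameters, and the toy rows are hypothetical and chosen for illustration. Mixed-integer features such as medication counts would need additional handling that this sketch does not attempt.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly chosen
    minority-class row and one of its k nearest minority-class neighbours.

    Simplified illustration: continuous features only (an assumption).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the synthetic row at a random point on the segment
        # between the base row and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class rows (hypothetical values, not taken from the study).
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote_like_oversample(minority_rows, n_new=6)
```

Because each synthetic row lies on a line segment between two real minority rows, it stays inside the range the minority class already covers. CTGAN, the other method named above, instead learns a generative model of the table and is not illustrated here.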

Published In

Comput Biol Med

DOI

10.1016/j.compbiomed.2023.107749

EISSN

1879-0534

Publication Date

January 2024

Volume

168

Start / End Page

107749

Location

United States

Related Subject Headings

  • Retrospective Studies
  • Intensive Care Units
  • Humans
  • Data Accuracy
  • Biomedical Engineering
  • Benchmarking
  • Algorithms
  • 4601 Applied computing
  • 4203 Health services and systems
  • 3102 Bioinformatics and computational biology
 

Citation

APA
Rafiei, A., Ghiasi Rad, M., Sikora, A., & Kamaleswaran, R. (2024). Improving mixed-integer temporal modeling by generating synthetic data using conditional generative adversarial networks: A case study of fluid overload prediction in the intensive care unit. Comput Biol Med, 168, 107749. https://doi.org/10.1016/j.compbiomed.2023.107749

Chicago
Rafiei, Alireza, Milad Ghiasi Rad, Andrea Sikora, and Rishikesan Kamaleswaran. “Improving mixed-integer temporal modeling by generating synthetic data using conditional generative adversarial networks: A case study of fluid overload prediction in the intensive care unit.” Comput Biol Med 168 (January 2024): 107749. https://doi.org/10.1016/j.compbiomed.2023.107749.