Combining adult with pediatric patient data to develop a clinical decision support tool intended for children: leveraging machine learning to model heterogeneity.
BACKGROUND: Clinical decision support (CDS) tools built using adult data do not typically perform well for children. We explored how best to leverage adult data to improve the performance of such tools. This study assesses whether it is better to build CDS tools for children using data from children alone or to use combined data from both adults and children. METHODS: Retrospective cohort using data from 2017 to 2020. Participants include all individuals (adults and children) receiving an elective surgery at a large academic medical center that provides adult and pediatric services. We predicted need for mechanical ventilation or admission to the intensive care unit (ICU). Predictor variables included demographic, clinical, and service utilization factors known prior to surgery. We compared predictive models built using machine learning to regression-based methods that used a pediatric or combined adult-pediatric cohort. We compared model performance based on Area Under the Receiver Operator Characteristic. RESULTS: While we found that adults and children have different risk factors, machine learning methods are able to appropriately model the underlying heterogeneity of each population and produce equally accurate predictive models whether using data only from pediatric patients or combined data from both children and adults. Results from regression-based methods were improved by the use of pediatric-specific data. CONCLUSIONS: CDS tools for children can successfully use combined data from adults and children if the model accounts for underlying heterogeneity, as in machine learning models.
Sabharwal, P; Hurst, JH; Tejwani, R; Hobbs, KT; Routh, JC; Goldstein, BA
Volume / Issue
Start / End Page
Pubmed Central ID
Electronic International Standard Serial Number (EISSN)
Digital Object Identifier (DOI)