Predicting mortality over different time horizons: which data elements are needed?
OBJECTIVE: Electronic health records (EHRs) are a resource for "big data" analytics, containing a variety of data elements. We investigate how different categories of information contribute to prediction of mortality over different time horizons among patients undergoing hemodialysis treatment. MATERIAL AND METHODS: We derived prediction models for mortality over 7 time horizons using EHR data on older patients from a national chain of dialysis clinics linked with administrative data using LASSO (least absolute shrinkage and selection operator) regression. We assessed how different categories of information relate to risk assessment and compared discrete models to time-to-event models. RESULTS: The best predictors used all the available data (c-statistic ranged from 0.72-0.76), with stronger models in the near term. While different variable groups showed different utility, exclusion of any particular group did not lead to a meaningfully different risk assessment. Discrete time models performed better than time-to-event models. CONCLUSIONS: Different variable groups were predictive over different time horizons, with vital signs most predictive for near-term mortality and demographic and comorbidities more important in long-term mortality.
Goldstein, BA; Pencina, MJ; Montez-Rath, ME; Winkelmayer, WC
Volume / Issue
Start / End Page
Pubmed Central ID
Electronic International Standard Serial Number (EISSN)
Digital Object Identifier (DOI)