Advanced artificial intelligence vs simpler models for 1-year death prediction among patients receiving hemodialysis.
OBJECTIVES: We evaluated how much data modern AI tools need to outperform simpler models in predicting short-term mortality in over 500 000 patients with hemodialysis-dependent kidney failure.
MATERIALS AND METHODS: We compared logistic regression, boosting, and transformers using increasingly complex feature sets (from last-visit data to full trajectories). Performance was measured using the area under the ROC curve (AUC-ROC) and the area under the Precision-Recall curve (AUC-PR) across training data sizes ranging from 500 to 490 197 samples.
RESULTS: Features with temporal information improved performance across all models. On the full dataset, transformers (AUC-ROC = 0.8568) and boosting (AUC-ROC = 0.8598) performed similarly.
DISCUSSION: Transformers require very large datasets to outperform simpler models such as boosting, which limits their usefulness on smaller datasets, including those with as many as 500 000 samples.
CONCLUSION: Modern AI tools require substantial data to justify their computational cost over simpler approaches. However, a more complex feature set appears to be beneficial across all models.
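The two metrics named in the methods can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names `auc_roc` and `average_precision` are hypothetical, the labels and scores are toy values, and average precision is used here as one common summary of the Precision-Recall curve, which may differ from how AUC-PR was computed in the study.

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    Quadratic in the number of samples, so suitable only for illustration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: sum of precision at each threshold, weighted by recall gain.

    Samples sharing the same score are treated as a single threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    tp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Consume every sample tied at the current score.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            j += 1
        recall = tp / n_pos
        precision = tp / j  # j samples are predicted positive at this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Toy example (made-up labels and scores, not study data): 1 = died within 1 year.
labels = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(labels, risk))            # 0.75
print(average_precision(labels, risk))  # approximately 0.8333
```

When deaths are rare relative to survivors, the Precision-Recall summary is more sensitive to false positives among high-risk predictions than AUC-ROC, which is one reason to report both.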
Duke Scholars
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Location
Related Subject Headings
- 4203 Health services and systems