CXR-TFT: Multi-modal Temporal Fusion Transformer for Predicting Chest X-Ray Trajectories
In intensive care units (ICUs), patients with complex clinical conditions require vigilant monitoring and prompt intervention. Chest X-rays (CXRs) are a vital diagnostic tool, providing insight into clinical trajectories, but their irregular acquisition limits their utility. Existing tools for CXR interpretation are constrained to cross-sectional analysis and fail to capture temporal dynamics. To address this, we introduce CXR-TFT, a novel multi-modal framework that integrates temporally sparse CXR imaging and radiology reports with high-frequency clinical data, such as vital signs, laboratory values, and respiratory flow sheets, to predict the trajectory of CXR findings in critically ill patients. CXR-TFT leverages latent embeddings from a vision encoder that are temporally aligned with hourly clinical data through interpolation. A transformer is then trained to predict the CXR embedding at each hour, conditioned on previous CXR embeddings and clinical measurements. In a retrospective study of 20,000 ICU patients, CXR-TFT predicted abnormal CXR findings with 95% accuracy 12 hours before they became radiographically evident, indicating that clinical data carry valuable information about respiratory state progression. By providing hourly temporal resolution in prognostic CXR analysis, CXR-TFT offers actionable predictions with the potential to improve the management of time-sensitive critical conditions, where early intervention is crucial but timely diagnosis is challenging.
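The temporal-alignment step described in the abstract, interpolating sparse CXR embeddings onto the hourly grid of clinical measurements, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the use of per-dimension linear interpolation, and the concatenation with hourly vitals are all assumptions.

```python
import numpy as np

def align_cxr_to_hourly(cxr_times, cxr_embeddings, n_hours):
    """Interpolate temporally sparse CXR embeddings onto an hourly grid.

    cxr_times: sorted acquisition times in hours, shape [k]
    cxr_embeddings: latent vectors from a vision encoder, shape [k, d]
    n_hours: length of the hourly clinical timeline
    Returns an array of shape [n_hours, d].
    """
    hourly = np.arange(n_hours)
    d = cxr_embeddings.shape[1]
    out = np.empty((n_hours, d))
    # Linear interpolation independently per embedding dimension;
    # np.interp holds the boundary values constant outside cxr_times.
    for j in range(d):
        out[:, j] = np.interp(hourly, cxr_times, cxr_embeddings[:, j])
    return out

# Hypothetical usage: two CXRs six hours apart, 2-D embeddings.
times = np.array([0.0, 6.0])
emb = np.array([[0.0, 1.0],
                [2.0, 3.0]])
hourly_emb = align_cxr_to_hourly(times, emb, n_hours=7)
# The aligned embeddings can then be concatenated with hourly
# clinical features (vitals, labs) as input to the transformer.
```

At hours where a CXR was actually acquired, the aligned sequence reproduces the encoder embedding exactly; in between, it varies smoothly, giving the transformer one input vector per hour.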
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences