Label efficient phenotyping for Long COVID using electronic health records.
Long COVID poses a significant disease burden globally, but its heterogeneous presentation and unreliable coding practices make it difficult to study. Developing efficient phenotyping algorithms is crucial to enabling risk prediction and effective management of Long COVID. We introduce the LAbel-efficienT Long COVID pHenotyping (LATCH) algorithm, which combines a small number of gold-standard labels with a large, unlabeled dataset containing many electronic health record (EHR) features. In both internal and external validation, LATCH outperformed methods that use the U09.9 Long COVID EHR code alone. Our downstream analysis revealed a pattern of elevated healthcare utilization due to Long COVID, peaking at the fourth month following COVID infection and continuing beyond it. LATCH enhances the classification of Long COVID by fully utilizing both labeled and unlabeled data, providing vital insights into healthcare utilization trends and informing clinical and public health responses to the enduring consequences of COVID-19.
Duke Scholars
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Location
Related Subject Headings
- 4203 Health services and systems