Advancing real-time infectious disease forecasting using large language models.
Forecasting the short-term spread of an ongoing disease outbreak poses a challenge owing to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables, and the intersection of public policy and human behavior. Here we introduce PandemicLLM, a framework with multi-modal large language models (LLMs) that reformulates real-time forecasting of disease spread as a text-reasoning problem, with the ability to incorporate real-time, complex, non-numerical information. This approach, through an artificial intelligence-human cooperative prompt design and time-series representation learning, encodes multi-modal data for LLMs. The model is applied to the COVID-19 pandemic, and trained to utilize textual public health policies, genomic surveillance, spatial and epidemiological time-series data, and is tested across all 50 states of the United States for a duration of 19 months. PandemicLLM opens avenues for incorporating various pandemic-related data in heterogeneous formats and shows performance benefits over existing models.
Duke Scholars
Altmetric Attention Stats
Dimensions Citation Stats
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- United States
- SARS-CoV-2
- Pandemics
- Large Language Models
- Language
- Humans
- Forecasting
- Communicable Diseases
- COVID-19
- Artificial Intelligence
Citation
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- United States
- SARS-CoV-2
- Pandemics
- Large Language Models
- Language
- Humans
- Forecasting
- Communicable Diseases
- COVID-19
- Artificial Intelligence