Predicting atrial fibrillation and flutter using electronic health records.


Conference Paper

Electronic Health Records (EHR) contain large amounts of useful information that could potentially be used to build models for predicting the onset of diseases. In this study, we investigated the use of free-text and coded data in Marshfield Clinic's EHR, individually and in combination, for building machine-learning-based models to predict the first-ever episode of atrial fibrillation and/or atrial flutter (AFF). We trained and evaluated our AFF models on the EHR data across different time intervals (1, 3, 5 and all years) prior to the first documented onset of AFF. We applied several machine learning methods, including naïve Bayes, support vector machines (SVM), logistic regression and random forests, to build AFF prediction models and evaluated these using 10-fold cross-validation. On text-based datasets, the best model achieved an F-measure of 60.1%, when applied exclusively to coded data. The combination of textual and coded data achieved comparable performance. The study results attest to the relative merit of utilizing textual data to complement the use of coded data for disease onset prediction modeling.
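
The evaluation protocol named in the abstract (10-fold cross-validation scored by F-measure) can be illustrated with a minimal sketch. This is not the authors' code: the function names `f_measure` and `k_fold_indices`, the 0/1 label encoding, and the contiguous, unshuffled fold assignment are all assumptions made for illustration. The abstract does not state how folds were drawn or which class was treated as positive.

```python
from typing import List, Sequence, Tuple


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (F1): harmonic mean of precision and recall.

    Assumes binary labels where 1 marks the positive class
    (hypothetically, a patient with a documented AFF onset).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; every sample is tested exactly once.

    Folds are contiguous and unshuffled here (an assumption for simplicity).
    """
    # Spread the remainder over the first n_samples % k folds.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds
```

In a full evaluation, a classifier would be fitted on each `train` split, applied to the matching `test` split, and the resulting F-measures averaged over the 10 folds.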

Cited Authors

  • Karnik, S; Tan, SL; Berg, B; Glurich, I; Zhang, J; Vidaillet, HJ; Page, CD; Chowdhary, R

Published Date

  • 2012

Volume / Issue

  • 2012

Start / End Page

  • 5562 - 5565

PubMed ID

  • 23367189

PubMed Central ID

  • 23367189

International Standard Serial Number (ISSN)

  • 1557-170X

Digital Object Identifier (DOI)

  • 10.1109/EMBC.2012.6347254

Conference Location

  • United States