Random forest prediction of Alzheimer's disease using pairwise selection from time series data.

Published online

Journal Article

Time-dependent data collected in studies of Alzheimer's disease usually have missing and irregularly sampled data points. For this reason, time series methods that assume regular sampling cannot be applied directly to the data without a pre-processing step. In this paper we use a random forest to learn the relationship between pairs of data points at different time separations. The input vector is a summary of the time series history and includes both demographic and non-time-varying variables such as genetic data. To test the method we use data from the TADPOLE grand challenge, an initiative which aims to predict the evolution of subjects at risk of Alzheimer's disease using demographic, physical and cognitive input data. The task is to predict diagnosis, ADAS-13 score and normalised ventricles volume. While the competition proceeds, forecasting methods may be compared using a leaderboard dataset selected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and standard metrics for measuring accuracy. For diagnosis, we find a multiclass area under the curve (mAUC) of 0.82 and a classification accuracy of 0.73, compared with a benchmark SVM predictor which gives mAUC = 0.62 and balanced classification accuracy (BCA) = 0.52. The results show that the method is effective and comparable with other methods.
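
The pairwise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_pairs`, the feature layout and the example values are all hypothetical, and the "summary of the time series history" is reduced here to the single earlier observation. The sketch only shows how irregular, partly missing visits can be turned into training rows without resampling; the rows could then be passed to any regressor, for example a random forest.

```python
from itertools import combinations


def make_pairs(visits, static):
    """Build (features, target) rows from one subject's irregular visits.

    visits: list of (time_in_months, value) tuples; value is None when missing.
    static: non-time-varying covariates (hypothetical: age, a genetic flag).
    """
    # Drop missing observations and sort by time; no imputation or resampling.
    observed = sorted((t, v) for t, v in visits if v is not None)
    rows = []
    # Every ordered pair (earlier visit, later visit) yields one training row.
    for (t0, v0), (t1, v1) in combinations(observed, 2):
        # Features: earlier value, time separation, then the static covariates.
        features = [v0, t1 - t0] + list(static)
        rows.append((features, v1))
    return rows


# Hypothetical subject: four scheduled visits, one with a missing score.
visits = [(0, 20.0), (6, None), (14, 23.0), (24, 27.0)]
rows = make_pairs(visits, static=[72, 1])
for features, target in rows:
    print(features, target)
```

Because the time separation is itself an input feature, a model trained on such rows can be queried at any forecast horizon, which is what makes the approach tolerant of irregular sampling.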

Cited Authors

  • Moore, PJ; Lyons, TJ; Gallacher, J; Alzheimer’s Disease Neuroimaging Initiative

Published Date

  • 2019

Published In

Volume / Issue

  • 14 / 2

Start / End Page

  • e0211558 -

PubMed ID

  • 30763336

Pubmed Central ID

  • 30763336

Electronic International Standard Serial Number (EISSN)

  • 1932-6203

Digital Object Identifier (DOI)

  • 10.1371/journal.pone.0211558

Language

  • eng

Conference Location

  • United States