Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer.

Publication, Journal Article
Park, J; Artin, MG; Lee, KE; Pumpalova, YS; Ingram, MA; May, BL; Park, M; Hur, C; Tatonetti, NP
Published in: Journal of biomedical informatics
July 2022

The multi-modal and unstructured nature of observational data in Electronic Health Records (EHR) is currently a significant obstacle to the application of machine learning to risk stratification. In this study, we develop a deep learning framework for incorporating longitudinal clinical data from EHR to infer risk for pancreatic cancer (PC). This framework includes a novel training protocol, which enforces an emphasis on early detection by applying an independent Poisson-random mask on proximal-time measurements for each variable. Data fusion for irregular multivariate time-series features is enabled by a "grouped" neural network (GrpNN) architecture, which uses representation learning to generate a dimensionally reduced vector for each measurement set before making a final prediction. These models were evaluated using EHR data from Columbia University Irving Medical Center-New York Presbyterian Hospital. Our framework demonstrated better performance on early detection (AUROC 0.671, 95% CI 0.667-0.675, p < 0.001) at 12 months prior to diagnosis compared to logistic regression, XGBoost, and feedforward neural network baselines. We demonstrate that our masking strategy results in greater improvements at distal times prior to diagnosis, and that our GrpNN model improves generalizability by reducing overfitting relative to the feedforward baseline. The results were consistent across reported race. Our proposed algorithm is potentially generalizable to other diseases, including but not limited to cancer, where early detection can improve survival.
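The abstract does not spell out how the Poisson-random mask is applied, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that, for each laboratory variable independently, a count k is drawn from a Poisson distribution and the k measurements closest to diagnosis are hidden during training. The function names, the rate parameter `lam`, and the record layout (a dict mapping a variable name to its times and values) are all hypothetical.

```python
import numpy as np

def poisson_proximal_mask(times, values, lam, rng):
    """Hide the k most recent measurements of one variable, k ~ Poisson(lam).

    times  : measurement times, where larger means closer to diagnosis
             (for example, negative days before diagnosis)
    values : measurement values, same length as times
    Returns the (times, values) that stay visible, in chronological order.
    This is an assumed interpretation of the masking step, not the paper's code.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)            # oldest first, most proximal last
    k = int(rng.poisson(lam))            # number of proximal points to hide
    keep = order[: max(len(order) - k, 0)]
    return times[keep], values[keep]

def mask_patient(record, lam=2.0, seed=0):
    """Apply an independent mask to every variable in a patient record.

    record : dict mapping variable name -> (times, values)  (hypothetical layout)
    """
    rng = np.random.default_rng(seed)
    return {
        name: poisson_proximal_mask(t, v, lam, rng)
        for name, (t, v) in record.items()
    }

# Usage: two lab variables measured at irregular times (days before diagnosis).
patient = {
    "glucose": ([-400, -200, -30], [90.0, 95.0, 130.0]),
    "alt": ([-300, -10], [20.0, 45.0]),
}
masked = mask_patient(patient, lam=1.0, seed=1)
for name, (t, v) in masked.items():
    print(name, "visible measurements:", len(t))
```

Because each variable draws its own k, a model trained on such records sees histories that end at different distances from diagnosis for different labs, which is one way a training protocol could push emphasis toward earlier signals.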

Published In: Journal of biomedical informatics
DOI: 10.1016/j.jbi.2022.104095
EISSN: 1532-0480
ISSN: 1532-0464
Publication Date: July 2022
Volume: 131
Start / End Page: 104095

Related Subject Headings

  • Time Factors
  • Pancreatic Neoplasms
  • Medical Informatics
  • Humans
  • Electronic Health Records
  • Early Detection of Cancer
  • Deep Learning
  • Biomedical Engineering
  • 4601 Applied computing
  • 4203 Health services and systems
 

Citation

APA: Park, J., Artin, M. G., Lee, K. E., Pumpalova, Y. S., Ingram, M. A., May, B. L., … Tatonetti, N. P. (2022). Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer. Journal of Biomedical Informatics, 131, 104095. https://doi.org/10.1016/j.jbi.2022.104095

Chicago: Park, Jiheum, Michael G. Artin, Kate E. Lee, Yoanna S. Pumpalova, Myles A. Ingram, Benjamin L. May, Michael Park, Chin Hur, and Nicholas P. Tatonetti. “Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer.” Journal of Biomedical Informatics 131 (July 2022): 104095. https://doi.org/10.1016/j.jbi.2022.104095.

ICMJE: Park J, Artin MG, Lee KE, Pumpalova YS, Ingram MA, May BL, et al. Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer. Journal of biomedical informatics. 2022 Jul;131:104095.

MLA: Park, Jiheum, et al. “Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer.” Journal of Biomedical Informatics, vol. 131, July 2022, p. 104095. Epmc, doi:10.1016/j.jbi.2022.104095.

NLM: Park J, Artin MG, Lee KE, Pumpalova YS, Ingram MA, May BL, Park M, Hur C, Tatonetti NP. Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer. Journal of biomedical informatics. 2022 Jul;131:104095.