Assessment of the impact of EHR heterogeneity for clinical research through a case study of silent brain infarction.

Journal Article


The rapid adoption of electronic health records (EHRs) holds great promise for advancing medicine through practice-based knowledge discovery. However, the validity of EHR-based clinical research is questionable due to poor research reproducibility caused by the heterogeneity and complexity of healthcare institutions and EHR systems, the cross-disciplinary nature of the research team, and the lack of standard processes and best practices for conducting EHR-based clinical research.


We developed a data abstraction framework to standardize the process for multi-site EHR-based clinical studies, with the aim of enhancing research reproducibility. The framework was implemented for a multi-site EHR-based research project, the ESPRESSO project, with the goal of identifying individuals with silent brain infarctions (SBI) at Tufts Medical Center (TMC) and Mayo Clinic. The heterogeneity of healthcare institutions, EHR systems, and documentation, along with process variation in case identification, was assessed quantitatively and qualitatively.


We discovered significant variation in the patient populations, neuroimaging reporting, EHR systems, and abstraction processes across the two sites. The prevalence of SBI among patients over age 50 was 7.4% at TMC and 12.5% at Mayo. Neuroimaging reporting also varied: TMC's reports are lengthy, standardized, and descriptive, while Mayo's reports are short and definitive with more textual variation. Furthermore, differences in the EHR systems, technology infrastructure, and data collection processes were identified.


The implementation of the framework identified the institutional and process variations and the heterogeneity of EHRs across the sites participating in the case study. The experiment demonstrates the need for a standardized process for data abstraction when conducting EHR-based clinical studies.

Cited Authors

  • Fu, S; Leung, LY; Raulli, A-O; Kallmes, DF; Kinsman, KA; Nelson, KB; Clark, MS; Luetmer, PH; Kingsbury, PR; Kent, DM; Liu, H

Published Date

  • March 30, 2020

Published In

  • BMC Medical Informatics and Decision Making

Volume / Issue

  • 20 / 1

Start / End Page

  • 60 -

PubMed ID

  • 32228556

Pubmed Central ID

  • 32228556

Electronic International Standard Serial Number (EISSN)

  • 1472-6947

International Standard Serial Number (ISSN)

  • 1472-6947

Digital Object Identifier (DOI)

  • 10.1186/s12911-020-1072-9


  • eng