Towards a content agnostic computable knowledge repository for data quality assessment.

Publication, Journal Article
Rajan, NS; Gouripeddi, R; Mo, P; Madsen, RK; Facelli, JC
Published in: Comput Methods Programs Biomed
August 2019

BACKGROUND AND OBJECTIVE: In recent years, several conceptual frameworks for assessing the quality of data have been proposed across the Data Quality and Information Quality domains. These frameworks are diverse, ranging from simple lists of concepts to complex ontological and taxonomical representations of data quality concepts. The goal of this study is to design, develop and implement a platform-agnostic, computable data quality knowledge repository for data quality assessments.

METHODS: We identified computable data quality concepts through a comprehensive literature review of articles indexed in three major bibliographic data sources. From this corpus, we extracted data quality concepts, their definitions, applicable measures and their computability, and identified the conceptual relationships among them. We used these relationships to design and develop a data quality meta-model and implemented it in a quality knowledge repository.

RESULTS: We identified three primitives for programmatically performing data quality assessments: a data quality concept, its definition, and its measure or rule for data quality assessment, together with the associations among them. We modeled a computable data quality metadata repository and extended this framework to adapt, store, retrieve and automate the assessment of other existing data quality assessment models.

CONCLUSION: We identified research gaps in the data quality literature with respect to automating data quality assessment methods. In the process, we designed, developed and implemented a computable data quality knowledge repository for assessing quality and characterizing data in health data repositories. We leverage this knowledge repository in a service-oriented architecture to provide a scalable and reproducible framework for data quality assessments across disparate biomedical data sources.
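As a rough illustration of the three primitives named in the results (concept, definition, and measure or rule, plus their associations), the following Python sketch shows one way such a repository could be organized. It is not the authors' implementation: the class names `Concept`, `Measure` and `Repository`, the `completeness_of` helper, the wording of the completeness definition, and the sample records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Concept:
    """A data quality concept paired with its definition (hypothetical shape)."""
    name: str
    definition: str


@dataclass(frozen=True)
class Measure:
    """A computable rule associated with a concept by the concept's name."""
    name: str
    concept: str
    rule: Callable[[List[Record]], float]


class Repository:
    """Stores concepts and measures and keeps the association between them."""

    def __init__(self) -> None:
        self._concepts: Dict[str, Concept] = {}
        self._measures: Dict[str, List[Measure]] = {}

    def add_concept(self, concept: Concept) -> None:
        self._concepts[concept.name] = concept

    def add_measure(self, measure: Measure) -> None:
        # A measure may only be attached to a concept that is already stored.
        if measure.concept not in self._concepts:
            raise KeyError(f"unknown concept: {measure.concept}")
        self._measures.setdefault(measure.concept, []).append(measure)

    def assess(self, concept_name: str, records: List[Record]) -> Dict[str, float]:
        # Run every measure associated with the concept against the records.
        return {m.name: m.rule(records) for m in self._measures.get(concept_name, [])}


def completeness_of(field: str) -> Callable[[List[Record]], float]:
    """Build a rule returning the fraction of records with a non-null field."""
    def rule(records: List[Record]) -> float:
        if not records:
            return 0.0
        filled = sum(1 for r in records if r.get(field) is not None)
        return filled / len(records)
    return rule


repo = Repository()
# The definition text below is an illustrative assumption, not taken from the paper.
repo.add_concept(Concept(
    "completeness",
    "Proportion of records in which a required field is populated.",
))
repo.add_measure(Measure("birth_date_filled", "completeness", completeness_of("birth_date")))

records = [
    {"birth_date": "1980-01-01"},
    {"birth_date": None},
    {},
    {"birth_date": "1975-06-30"},
]
result = repo.assess("completeness", records)
```

Because the rule is stored as a plain callable that receives generic records, the repository itself holds no knowledge of the data's content, which is one possible reading of "content agnostic" in the title.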

Published In

Comput Methods Programs Biomed

DOI

10.1016/j.cmpb.2019.05.017

EISSN

1872-7565

Publication Date

August 2019

Volume

177

Start / End Page

193 / 201

Location

Ireland

Related Subject Headings

  • User-Computer Interface
  • Software
  • Signal Processing, Computer-Assisted
  • Research Design
  • Reproducibility of Results
  • Quality Control
  • Publications
  • Programming Languages
  • Pattern Recognition, Automated
  • Medical Informatics
 

Citation

APA
Rajan, N. S., Gouripeddi, R., Mo, P., Madsen, R. K., & Facelli, J. C. (2019). Towards a content agnostic computable knowledge repository for data quality assessment. Comput Methods Programs Biomed, 177, 193–201. https://doi.org/10.1016/j.cmpb.2019.05.017

Chicago
Rajan, Naresh Sundar, Ramkiran Gouripeddi, Peter Mo, Randy K. Madsen, and Julio C. Facelli. “Towards a content agnostic computable knowledge repository for data quality assessment.” Comput Methods Programs Biomed 177 (August 2019): 193–201. https://doi.org/10.1016/j.cmpb.2019.05.017.

ICMJE
Rajan NS, Gouripeddi R, Mo P, Madsen RK, Facelli JC. Towards a content agnostic computable knowledge repository for data quality assessment. Comput Methods Programs Biomed. 2019 Aug;177:193–201.

MLA
Rajan, Naresh Sundar, et al. “Towards a content agnostic computable knowledge repository for data quality assessment.” Comput Methods Programs Biomed, vol. 177, Aug. 2019, pp. 193–201. Pubmed, doi:10.1016/j.cmpb.2019.05.017.

NLM
Rajan NS, Gouripeddi R, Mo P, Madsen RK, Facelli JC. Towards a content agnostic computable knowledge repository for data quality assessment. Comput Methods Programs Biomed. 2019 Aug;177:193–201.