Building gold standard corpora for medical natural language processing tasks.

Publication, Journal Article

Deleger, L; Li, Q; Lingren, T; Kaiser, M; Molnar, K; Stoutenborough, L; Kouril, M; Marsolo, K; Solti, I

Published in: AMIA Annu Symp Proc

2012

We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreement (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for an information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released. To facilitate translational NLP tasks that require cross-corpora interoperability (e.g., clinical trial eligibility screening), their annotation schemas are aligned with a large-scale, NIH-funded clinical text annotation project.
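The abstract reports inter-annotator agreement as an F-measure but does not state how annotations were matched. As a minimal sketch only, the following assumes exact matching on span boundaries and label, and treats one annotator as the reference and the other as the response. The `Span` type, the function name, and the sample annotations are hypothetical illustrations, not taken from the paper.

```python
from typing import NamedTuple, Set


class Span(NamedTuple):
    """A hypothetical annotation: character offsets plus an entity label."""
    start: int
    end: int
    label: str


def pairwise_f_measure(reference: Set[Span], response: Set[Span]) -> float:
    """F-measure between two annotators under exact span-and-label matching.

    Assumption: a match requires identical start, end, and label. The
    paper may have used a different (e.g. overlap-based) criterion.
    """
    if not reference and not response:
        return 1.0  # both annotators marked nothing: trivial agreement
    matches = len(reference & response)
    if matches == 0:
        return 0.0
    precision = matches / len(response)
    recall = matches / len(reference)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations for illustration only.
annotator_a = {
    Span(0, 7, "MEDICATION"),
    Span(12, 20, "DISEASE"),
    Span(25, 30, "SYMPTOM"),
}
annotator_b = {
    Span(0, 7, "MEDICATION"),
    Span(12, 20, "DISEASE"),
    Span(40, 45, "SYMPTOM"),
}

score = pairwise_f_measure(annotator_a, annotator_b)
print(f"{score:.4f}")
```

Under exact matching the measure is symmetric: swapping the two annotators swaps precision and recall, which leaves their harmonic mean unchanged.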

Duke Scholars

Author Keith Allen Marsolo Population Health Sciences

Published In

AMIA Annu Symp Proc

EISSN

1942-597X

Publication Date

2012

Volume

2012

Start / End Page

144 / 153

Location

United States

Related Subject Headings

United States Food and Drug Administration
United States
Software
Natural Language Processing
Medical Records
Drug Labeling
Clinical Trials as Topic

Citation

APA: Deleger, L., Li, Q., Lingren, T., Kaiser, M., Molnar, K., Stoutenborough, L., … Solti, I. (2012). Building gold standard corpora for medical natural language processing tasks. AMIA Annu Symp Proc, 2012, 144–153.

Chicago: Deleger, Louise, Qi Li, Todd Lingren, Megan Kaiser, Katalin Molnar, Laura Stoutenborough, Michal Kouril, Keith Marsolo, and Imre Solti. “Building gold standard corpora for medical natural language processing tasks.” AMIA Annu Symp Proc 2012 (2012): 144–53.

ICMJE: Deleger L, Li Q, Lingren T, Kaiser M, Molnar K, Stoutenborough L, et al. Building gold standard corpora for medical natural language processing tasks. AMIA Annu Symp Proc. 2012;2012:144–53.

MLA: Deleger, Louise, et al. “Building gold standard corpora for medical natural language processing tasks.” AMIA Annu Symp Proc, vol. 2012, 2012, pp. 144–53.

NLM: Deleger L, Li Q, Lingren T, Kaiser M, Molnar K, Stoutenborough L, Kouril M, Marsolo K, Solti I. Building gold standard corpora for medical natural language processing tasks. AMIA Annu Symp Proc. 2012;2012:144–153.
