Retrieval-augmented generation for interpreting clinical laboratory regulations using large language models
Large language models (LLMs) have demonstrated strong performance on general knowledge tasks, but they have important limitations as standalone tools for question answering in specialized domains where accuracy and consistency are critical. Retrieval-augmented generation (RAG) is a strategy in which LLM outputs are grounded in dynamically retrieved source documents, offering advantages in accuracy, explainability, and maintainability. We developed and evaluated a custom RAG system called Raven, designed to answer laboratory regulatory questions using the part of the Code of Federal Regulations (CFR) pertaining to laboratories (42 CFR Part 493) as an authoritative source. Raven employed a vector search pipeline and an LLM to generate grounded responses via a chatbot-style interface. The system was tested using 103 synthetic laboratory regulatory questions, 88 of which were explicitly addressed in the CFR. Compared to answers generated manually by a board-certified pathologist, Raven's responses were judged to be fully complete and correct in 92.0% of those 88 cases, with little irrelevant content and a low potential for regulatory or medical error. Performance declined significantly on the remaining 15 questions, which were not addressed in the CFR, confirming the system's grounding in the source documents. Most suboptimal responses were attributable to faulty source document retrieval rather than model hallucination or misinterpretation. These findings demonstrate that a basic RAG system can produce useful, accurate, and verifiable answers to complex regulatory questions. With appropriate safeguards and thoughtful integration into user workflows, tools like Raven may serve as valuable decision-support systems in laboratory medicine and other knowledge-intensive healthcare domains.
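The retrieve-then-ground pattern described in the abstract can be illustrated with a minimal sketch. This is not Raven's implementation: the abstract does not name the embedding model, vector store, or LLM, so the sketch substitutes a bag-of-words cosine similarity for learned embeddings and stops at prompt assembly rather than calling a model. The passage texts, passage ids, and function names are all hypothetical placeholders, not CFR content.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder passages; these are NOT quotations from 42 CFR Part 493.
PASSAGES = {
    "passage-1": "placeholder text about proficiency testing enrollment for laboratories",
    "passage-2": "placeholder text about personnel qualifications for a laboratory director",
    "passage-3": "placeholder text about retention of test records",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, passages, k=2):
    """Return ids of up to k passages with nonzero similarity, best first.

    Assumption: term counts stand in for the dense embeddings a real
    vector search pipeline would use.
    """
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(text))), pid)
              for pid, text in passages.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [pid for score, pid in scored[:k] if score > 0]

def build_prompt(question, passages, ids):
    """Assemble a grounded prompt, or return None if nothing was retrieved."""
    if not ids:
        return None  # no supporting passage: decline rather than guess
    context = "\n".join(f"[{pid}] {passages[pid]}" for pid in ids)
    return ("Answer using only the passages below and cite their ids.\n"
            f"{context}\nQuestion: {question}")

if __name__ == "__main__":
    question = "Who can be a laboratory director?"
    ids = retrieve(question, PASSAGES, k=1)
    print(build_prompt(question, PASSAGES, ids))
```

Returning `None` when no passage matches is one possible way to handle questions the source does not address. Because the answer can only draw on what was retrieved, a retrieval miss limits answer quality regardless of the model, which is consistent with the abstract's observation that most suboptimal responses traced back to retrieval.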
Related Subject Headings
- 4609 Information systems
- 3102 Bioinformatics and computational biology
- 0601 Biochemistry and Cell Biology