Concept Bag: A New Method for Computing Concept Similarity in Biomedical Data
Biomedical data are a rich source of information and knowledge, not only for direct patient care but also for secondary use in population health, clinical research, and translational research. Biomedical data are typically scattered across multiple systems, so syntactic and semantic data integration is necessary to fully utilize the data’s potential. This paper introduces new algorithms that were devised to support automatic and semi-automatic integration of semantically heterogeneous biomedical data. The new algorithms incorporate both data mining and biomedical informatics methods to create “concept bags” in the same way that “word bags” are used in data mining and text retrieval. The methods are highly configurable and were tested in five different ways on different types of biomedical data. The new methods performed well in computing similarity between medical terms and between data elements, both of which are critical for automatic and semi-automatic data integration operations.
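The analogy between "concept bags" and "word bags" can be illustrated with a minimal sketch. The abstract does not specify how concepts are extracted or which similarity measure is used, so everything here is an assumption made for illustration: the `TERM_TO_CONCEPTS` lookup table and its concept codes are hypothetical, and Jaccard similarity stands in for whatever measure the paper actually applies.

```python
# Hypothetical token-to-concept lookup; the codes are invented for this
# sketch and do not come from any real terminology.
TERM_TO_CONCEPTS = {
    "heart": {"C_HEART"},
    "cardiac": {"C_HEART"},
    "attack": {"C_ATTACK"},
    "infarction": {"C_INFARCT"},
    "myocardial": {"C_HEART", "C_MUSCLE"},
}


def word_bag(text):
    """Return the set of lowercase tokens in a term (a simple word bag)."""
    return set(text.lower().split())


def concept_bag(text):
    """Return the set of concept codes found for the tokens of a term."""
    bag = set()
    for token in word_bag(text):
        bag |= TERM_TO_CONCEPTS.get(token, set())
    return bag


def jaccard(bag_a, bag_b):
    """Jaccard similarity of two bags; assumed here, not taken from the paper."""
    a, b = set(bag_a), set(bag_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    t1, t2 = "heart attack", "cardiac attack"
    print(jaccard(word_bag(t1), word_bag(t2)))
    print(jaccard(concept_bag(t1), concept_bag(t2)))
```

In this toy setup, two terms that share few words can still share all of their concepts, which is the kind of semantic match a word bag alone would miss.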
Duke Scholars
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Start / End Page
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences