Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs

Publication, Journal Article

Wang, Y; Chen, Q; Zhang, H; Wang, W; Wang, Q; Pan, Y; Xie, L; Huang, K; Nguyen, A

Published in: ACM Transactions on Intelligent Systems and Technology

November 4, 2024

The rapid growth of biomedical publications poses significant challenges for information retrieval. Most existing work focuses on document retrieval given explicit queries. However, in real applications such as curated biomedical database maintenance, explicit queries are missing. In this paper, we propose a two-step model for biomedical information retrieval in the setting where only a small set of example documents is available and no explicit query is given. First, we extract keywords from the observed documents using large pre-trained language models and biomedical knowledge graphs. These keywords are then enriched with domain-specific entities, and information retrieval techniques use the collected entities to rank the documents. Second, we introduce an iterative Positive-Unlabeled learning method to classify all unlabeled documents. Experiments conducted on the PubMed dataset demonstrate that the proposed technique outperforms state-of-the-art positive-unlabeled learning methods. The results underscore the effectiveness of integrating large language models and biomedical knowledge graphs in improving zero-shot information retrieval performance in the biomedical domain.
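The two steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: frequent-term counting stands in for keyword extraction with language models and knowledge graphs, a nearest-centroid rule stands in for the classifier, and every function name, parameter (`k`, `n_neg`, `max_iter`), and example document below is a hypothetical chosen for illustration.

```python
import re
from collections import Counter

STOPWORDS = {"the", "of", "in", "and", "a", "to", "for", "with", "on", "is"}

def tokenize(text):
    """Lowercase alphabetic tokens, minus a tiny stopword list."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def extract_keywords(positive_docs, k=5):
    """Assumed stand-in for LLM/knowledge-graph extraction: top-k frequent terms."""
    counts = Counter(t for doc in positive_docs for t in tokenize(doc))
    return {term for term, _ in counts.most_common(k)}

def keyword_score(doc, keywords):
    """Fraction of a document's tokens that are keywords."""
    tokens = tokenize(doc)
    return sum(t in keywords for t in tokens) / max(len(tokens), 1)

def centroid(docs):
    """Average term-count vector of a group of documents."""
    total = Counter()
    for doc in docs:
        total.update(tokenize(doc))
    n = max(len(docs), 1)
    return {t: c / n for t, c in total.items()}

def similarity(doc, cen):
    """Dot product between a document's term counts and a centroid."""
    return sum(cen.get(t, 0.0) * c for t, c in Counter(tokenize(doc)).items())

def iterative_pu(positives, unlabeled, n_neg=2, max_iter=5):
    """Return the unlabeled documents predicted positive."""
    keywords = extract_keywords(positives)
    # Step 1: rank unlabeled documents by keyword score; treat the
    # lowest-ranked n_neg documents as reliable negatives.
    ranked = sorted(unlabeled, key=lambda d: keyword_score(d, keywords), reverse=True)
    negatives = ranked[-n_neg:]
    # Step 2: alternate between fitting centroids and relabeling until
    # the predicted positive set stops changing.
    pos, predicted = list(positives), []
    for _ in range(max_iter):
        pos_c, neg_c = centroid(pos), centroid(negatives)
        new_pred = [d for d in unlabeled
                    if d not in negatives and similarity(d, pos_c) > similarity(d, neg_c)]
        if new_pred == predicted:
            break
        predicted = new_pred
        pos = list(positives) + predicted
    return predicted

# Hypothetical toy corpus: two labeled examples, five unlabeled documents.
positives = ["BRCA1 mutation increases breast cancer risk",
             "breast cancer gene BRCA1 therapy"]
unlabeled = ["BRCA1 variants and breast cancer outcomes",
             "cancer therapy targeting BRCA1 gene",
             "soccer match results and league standings",
             "stock market prices fell sharply",
             "weather forecast predicts heavy rain"]
result = iterative_pu(positives, unlabeled)
```

In this sketch the ranking step only seeds the reliable negatives; a document that matches neither centroid (here, the soccer example) is left out of the predicted positives rather than forced into a class.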

Duke Scholars

Author Kaizhu Huang DKU Faculty

Published In

ACM Transactions on Intelligent Systems and Technology

DOI

10.1145/3702647

EISSN

2157-6912

ISSN

2157-6904

Publication Date

November 4, 2024

Publisher

Association for Computing Machinery (ACM)

Related Subject Headings

4611 Machine learning
4602 Artificial intelligence
0806 Information Systems
0801 Artificial Intelligence and Image Processing

Citation

APA

Wang, Y., Chen, Q., Zhang, H., Wang, W., Wang, Q., Pan, Y., … Nguyen, A. (2024). Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs. ACM Transactions on Intelligent Systems and Technology. https://doi.org/10.1145/3702647

Chicago

Wang, Yuqi, Qiuyi Chen, Haiyang Zhang, Wei Wang, Qiufeng Wang, Yushan Pan, Liangru Xie, Kaizhu Huang, and Anh Nguyen. “Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs.” ACM Transactions on Intelligent Systems and Technology, November 4, 2024. https://doi.org/10.1145/3702647.

ICMJE

Wang Y, Chen Q, Zhang H, Wang W, Wang Q, Pan Y, et al. Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs. ACM Transactions on Intelligent Systems and Technology. 2024 Nov 4;

MLA

Wang, Yuqi, et al. “Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs.” ACM Transactions on Intelligent Systems and Technology, Association for Computing Machinery (ACM), Nov. 2024. Crossref, doi:10.1145/3702647.

NLM

Wang Y, Chen Q, Zhang H, Wang W, Wang Q, Pan Y, Xie L, Huang K, Nguyen A. Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs. ACM Transactions on Intelligent Systems and Technology. Association for Computing Machinery (ACM); 2024 Nov 4;
