Deep mining heterogeneous networks of biomedical linked data to predict novel drug-target associations.


Journal Article

A heterogeneous network topology possessing abundant interactions between biomedical entities has yet to be utilized in similarity-based methods for predicting drug-target associations based on the array of varying features of drugs and their targets. Deep learning reveals features of vertices of a large network that can be adapted to accommodate similarity-based solutions and provide a flexible method of drug-target prediction.

We propose a similarity-based drug-target prediction method that enhances existing association discovery methods by using a topology-based similarity measure. DeepWalk, a deep learning method, is adopted in this study to calculate the similarities within the Linked Tripartite Network (LTN), a heterogeneous network generated from biomedical linked datasets. The proposed method shows promising results for drug-target association prediction with LTN: a 98.96% AUC ROC score with 10-fold cross-validation and a 99.25% AUC ROC score with Monte Carlo cross-validation. By utilizing DeepWalk, we demonstrate that: (i) this method outperforms other existing topology-based similarity computation methods, (ii) the performance is better for tripartite than for bipartite networks and (iii) the measure of similarity using network topology outperforms the ones derived from chemical structure (drugs) or genomic sequence (targets). Our proposed methodology provides a promising solution for drug-target prediction based on topological similarity within a heterogeneous network, and may be readily re-purposed and adapted in existing similarity-based methodologies.

The proposed method has been developed in Java and is available along with the data. Data are available at Bioinformatics online.
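The idea of a topology-based similarity can be illustrated with a small sketch. The Python code below is not the authors' Java implementation, and it is not full DeepWalk: it generates the truncated random walks that DeepWalk starts from, but replaces the skip-gram embedding step with plain co-occurrence counts compared by cosine similarity. The toy graph, the node names, and every function and parameter name here are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy tripartite graph: drugs, targets, and diseases.
# It deliberately has two disconnected components (A/B side and C side).
EDGES = [
    ("drug:A", "target:T1"),
    ("drug:B", "target:T1"),
    ("drug:B", "target:T2"),
    ("drug:C", "target:T3"),
    ("drug:A", "disease:D1"),
    ("drug:B", "disease:D1"),
    ("drug:C", "disease:D2"),
    ("target:T1", "disease:D1"),
    ("target:T3", "disease:D2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map with sorted neighbor lists."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def random_walks(adj, walks_per_node, walk_length, rng):
    """Generate uniform truncated random walks starting from every node.

    DeepWalk shuffles the start nodes on each pass; a fixed sorted order
    is used here only to keep the sketch reproducible.
    """
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks, window):
    """Count, for each node, which nodes appear within `window` steps of it.

    This is a stand-in for the learned skip-gram embeddings (assumption).
    """
    vectors = defaultdict(Counter)
    for walk in walks:
        for i, node in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[node][walk[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


adj = build_adjacency(EDGES)
walks = random_walks(adj, walks_per_node=20, walk_length=10, rng=random.Random(0))
vectors = cooccurrence_vectors(walks, window=2)

# Drugs A and B share a target and a disease; drug C lies in a separate component.
sim_ab = cosine(vectors["drug:A"], vectors["drug:B"])
sim_ac = cosine(vectors["drug:A"], vectors["drug:C"])
print(f"sim(A, B) = {sim_ab:.3f}, sim(A, C) = {sim_ac:.3f}")
```

In this toy setting, drugs that share neighbors across both the target and disease layers obtain a positive similarity, while a drug in a disconnected part of the network obtains zero. A similarity-based predictor could then consume such scores in place of chemical-structure or sequence similarities.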

Cited Authors

  • Zong, N; Kim, H; Ngo, V; Harismendy, O

Published Date

  • August 2017

Published In

  • Bioinformatics

Volume / Issue

  • 33 / 15

Start / End Page

  • 2337 - 2344

PubMed ID

  • 28430977

Pubmed Central ID

  • 28430977

Electronic International Standard Serial Number (EISSN)

  • 1367-4811

International Standard Serial Number (ISSN)

  • 1367-4803

Digital Object Identifier (DOI)

  • 10.1093/bioinformatics/btx160


Language

  • eng