A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.

Publication, Journal Article
Yan, KK; Zhao, H; Pang, H
Published in: BMC Bioinformatics
December 6, 2017

BACKGROUND: High-throughput sequencing data are widely collected and analyzed in the study of complex diseases, with the aim of improving human health. Well-studied algorithms mostly deal with a single data source and cannot fully exploit the potential of multi-omics data. A holistic understanding of human health and disease requires integrating multiple data sources. Several integration algorithms have been proposed, but a comprehensive comparison of them for classifying binary traits is currently lacking.

RESULTS: We focus on two common classes of integration algorithms: graph-based algorithms, which represent subjects as nodes and the relationships between them as edges, and kernel-based algorithms, which generate a classifier in feature space. We compare their performance in terms of several measures of classification accuracy and computation time. Seven integration algorithms are compared and evaluated with hypertension and two cancer data sets: graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM), and Ada-boost relevance vector machine. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. Graph-based algorithms have the advantage of being computationally faster.

CONCLUSIONS: The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.
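To make the graph-based idea concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes each data source has already been turned into a similarity graph over the same subjects, averages those graphs with equal weights, and spreads known binary labels to unlabeled subjects by simple iterative label propagation. The function names, the toy graphs, and the equal weighting are all illustrative assumptions; the published algorithms choose weights and solve for scores in more sophisticated ways.

```python
# Illustrative sketch only; names and data are hypothetical, not from the paper.

def combine_graphs(graphs, weights=None):
    """Weighted average of same-sized similarity (adjacency) matrices."""
    n = len(graphs[0])
    if weights is None:
        weights = [1.0 / len(graphs)] * len(graphs)  # assumption: equal weights
    return [
        [sum(w * g[i][j] for w, g in zip(weights, graphs)) for j in range(n)]
        for i in range(n)
    ]


def propagate_labels(adjacency, labels, n_iter=100):
    """Iterative label propagation with labeled nodes held fixed.

    labels: +1 / -1 for labeled subjects, None for unlabeled ones.
    Returns one real-valued score per subject; its sign is the predicted class.
    """
    n = len(adjacency)
    scores = [float(y) if y is not None else 0.0 for y in labels]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(float(labels[i]))  # keep known labels fixed
                continue
            degree = sum(adjacency[i])
            if degree == 0:
                updated.append(0.0)  # isolated node: no evidence either way
            else:
                # Weighted mean of the neighbours' current scores.
                updated.append(
                    sum(adjacency[i][j] * scores[j] for j in range(n)) / degree
                )
        scores = updated
    return scores


# Two toy similarity graphs over the same five subjects (hypothetical data).
expression_graph = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
methylation_graph = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
labels = [1, None, None, -1, None]  # subjects 0 and 3 are labeled

combined = combine_graphs([expression_graph, methylation_graph])
scores = propagate_labels(combined, labels)
predicted = [1 if s > 0 else -1 for s in scores]
```

In this toy example the unlabeled subjects 1 and 2 are connected only to the positively labeled subject 0 and to each other, so they receive positive scores, while subject 4 inherits the negative label of its only neighbour, subject 3.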

Published In

BMC Bioinformatics

DOI

10.1186/s12859-017-1982-4

EISSN

1471-2105

Publication Date

December 6, 2017

Volume

18

Issue

1

Start / End Page

539

Location

England

Related Subject Headings

  • Support Vector Machine
  • Humans
  • Computational Biology
  • Bioinformatics
  • Bayes Theorem
  • Algorithms
  • 49 Mathematical sciences
  • 46 Information and computing sciences
  • 31 Biological sciences
  • 08 Information and Computing Sciences
 

Citation

APA
Yan, K. K., Zhao, H., & Pang, H. (2017). A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits. BMC Bioinformatics, 18(1), 539. https://doi.org/10.1186/s12859-017-1982-4
Chicago
Yan, Kang K., Hongyu Zhao, and Herbert Pang. “A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.” BMC Bioinformatics 18, no. 1 (December 6, 2017): 539. https://doi.org/10.1186/s12859-017-1982-4.
MLA
Yan, Kang K., et al. “A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.” BMC Bioinformatics, vol. 18, no. 1, Dec. 2017, p. 539. PubMed, doi:10.1186/s12859-017-1982-4.