Using Bayesian networks to discover relations between genes, environment, and disease.

Publication, Journal Article
Su, C; Andrew, A; Karagas, MR; Borsuk, ME
Published in: BioData Mining
March 2013

We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire, USA. For didactical purposes, we intentionally keep this example simple. When applied to complete data records, we find only minor differences in the performance and results of different algorithms. Subsequent incorporation of partial records through application of the EM algorithm gives us greater power to detect relations. Allowing for network structures that depart from a strict causal interpretation also enhances our ability to discover complex associations including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and generation of prognostic models, there remain some conceptual and computational challenges. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.

Published In

BioData Mining

DOI

10.1186/1756-0381-6-6

EISSN

1756-0381

ISSN

1756-0381

Publication Date

March 2013

Volume

6

Issue

1

Start / End Page

6

Related Subject Headings

  • 4605 Data management and data science
  • 3102 Bioinformatics and computational biology
  • 1303 Specialist Studies in Education
  • 1101 Medical Biochemistry and Metabolomics
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA
Su, C., Andrew, A., Karagas, M. R., & Borsuk, M. E. (2013). Using Bayesian networks to discover relations between genes, environment, and disease. BioData Mining, 6(1), 6. https://doi.org/10.1186/1756-0381-6-6

Chicago
Su, Chengwei, Angeline Andrew, Margaret R. Karagas, and Mark E. Borsuk. “Using Bayesian networks to discover relations between genes, environment, and disease.” BioData Mining 6, no. 1 (March 2013): 6. https://doi.org/10.1186/1756-0381-6-6.

ICMJE
Su C, Andrew A, Karagas MR, Borsuk ME. Using Bayesian networks to discover relations between genes, environment, and disease. BioData mining. 2013 Mar;6(1):6.

MLA
Su, Chengwei, et al. “Using Bayesian networks to discover relations between genes, environment, and disease.” BioData Mining, vol. 6, no. 1, Mar. 2013, p. 6. Epmc, doi:10.1186/1756-0381-6-6.

NLM
Su C, Andrew A, Karagas MR, Borsuk ME. Using Bayesian networks to discover relations between genes, environment, and disease. BioData mining. 2013 Mar;6(1):6.