Combining location and expression data for principled discovery of genetic regulatory network models.

Conference Paper

We develop principled methods for the automatic induction (discovery) of genetic regulatory network models from multiple data sources and data modalities. Models of regulatory networks are represented as Bayesian networks, allowing the models to compactly and robustly capture probabilistic multivariate statistical dependencies between the various cellular factors in these networks. We build on previous Bayesian network validation results by extending the validation framework to the context of model induction, leveraging heuristic simulated annealing search algorithms and posterior model averaging. Because using expression data in isolation yields results inconsistent with location data, we incorporate genomic location data to guide the model induction process. We combine these two data modalities by allowing location data to influence the model prior and expression data to influence the model likelihood. We demonstrate the utility of this approach by discovering genetic regulatory models of thirty-three variables involved in S. cerevisiae pheromone response. The models we automatically generate are consistent with the current understanding of this regulatory network, but also suggest new directions for future experimental investigation.
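The way the abstract combines the two data modalities can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-edge probabilities standing in for location evidence, the toy likelihood, and the variable names "A" and "B" are all assumptions made for illustration, and the acyclicity check a real Bayesian network search needs is omitted. The sketch scores a candidate graph as log prior (from location data) plus log likelihood (from expression data), applies a Metropolis-style acceptance rule of the kind used in simulated annealing, and averages an edge over several scored models.

```python
import math
import random


def log_prior(edges, edge_prob):
    """Log prior of a graph from per-edge probabilities.

    edge_prob maps a candidate edge (parent, child) to an assumed prior
    probability that the edge exists, e.g. derived from location data.
    """
    total = 0.0
    for edge, p in edge_prob.items():
        total += math.log(p if edge in edges else 1.0 - p)
    return total


def log_posterior(edges, edge_prob, log_likelihood):
    """Unnormalized log posterior: location-based prior plus
    expression-based likelihood (supplied by the caller)."""
    return log_prior(edges, edge_prob) + log_likelihood(edges)


def accept(delta, temperature, rng):
    """Metropolis rule: always accept improvements, accept a worse
    score with probability exp(delta / temperature)."""
    return delta >= 0 or rng.random() < math.exp(delta / temperature)


def averaged_edge_probability(models, edge):
    """Posterior-weighted frequency of an edge over scored models.

    models is a list of (edges, log_score) pairs; weights are
    exp(log_score), shifted by the maximum for numerical stability.
    """
    top = max(score for _, score in models)
    weights = [math.exp(score - top) for _, score in models]
    hit = sum(w for (edges, _), w in zip(models, weights) if edge in edges)
    return hit / sum(weights)


# Hypothetical two-variable example.
edge_prob = {("A", "B"): 0.9, ("B", "A"): 0.2}


def toy_likelihood(edges):
    # Placeholder for a real marginal likelihood computed from
    # expression data; here it only penalizes the number of edges.
    return -1.0 * len(edges)


candidates = [set(), {("A", "B")}, {("B", "A")}]
scored = [(g, log_posterior(g, edge_prob, toy_likelihood)) for g in candidates]
best_graph = max(scored, key=lambda pair: pair[1])[0]
prob_a_to_b = averaged_edge_probability(scored, ("A", "B"))
```

In this toy setup the strong prior on the edge from "A" to "B" outweighs the likelihood penalty for adding an edge, which is the sense in which location data can guide the search when expression data alone is ambiguous.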

Cited Authors

  • Hartemink, AJ; Gifford, DK; Jaakkola, TS; Young, RA

Published Date

  • January 2002

Start / End Page

  • 437 - 449

PubMed ID

  • 11928497

Electronic International Standard Serial Number (EISSN)

  • 2335-6936

International Standard Serial Number (ISSN)

  • 2335-6928