Combining location and expression data for principled discovery of genetic regulatory network models.
We develop principled methods for the automatic induction (discovery) of genetic regulatory network models from multiple data sources and data modalities. Models of regulatory networks are represented as Bayesian networks, allowing the models to compactly and robustly capture probabilistic multivariate statistical dependencies between the various cellular factors in these networks. We build on previous Bayesian network validation results by extending the validation framework to the context of model induction, leveraging heuristic simulated annealing search algorithms and posterior model averaging. Using expression data in isolation yields results inconsistent with location data, so we incorporate genomic location data to guide the model induction process. We combine these two data modalities by allowing location data to influence the model prior and expression data to influence the model likelihood. We demonstrate the utility of this approach by discovering genetic regulatory models of thirty-three variables involved in S. cerevisiae pheromone response. The models we automatically generate are consistent with the current understanding of this regulatory network, but also suggest new directions for future experimental investigation.
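The division of labor described above, with location data shaping the model prior and expression data shaping the model likelihood, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the independent per-edge prior, the binary discretization of expression values, the uniform Dirichlet (K2-style) marginal likelihood, and the toy variables `TF` and `GENE`. The sketch only scores a given candidate structure; it does not check acyclicity and omits the simulated annealing search and posterior model averaging.

```python
import math
from itertools import product


def log_prior(edges, variables, edge_prob, default=0.5):
    """Log prior of a structure under independent per-edge probabilities.

    edge_prob maps (regulator, target) to an assumed prior probability that
    the edge exists, e.g. derived from location (binding) evidence.
    Ordered pairs missing from the map get the uninformative `default`.
    """
    total = 0.0
    for parent, child in product(variables, repeat=2):
        if parent == child:
            continue
        p = edge_prob.get((parent, child), default)
        total += math.log(p if (parent, child) in edges else 1.0 - p)
    return total


def log_marginal_likelihood(edges, variables, data, alpha=1.0):
    """Log marginal likelihood of binary data with Dirichlet(alpha) priors.

    data is a list of dicts mapping each variable to 0 or 1
    (hypothetical discretized expression levels).
    """
    total = 0.0
    for child in variables:
        parents = sorted(p for p, c in edges if c == child)
        counts = {}  # parent configuration -> [count of 0s, count of 1s]
        for row in data:
            key = tuple(row[p] for p in parents)
            counts.setdefault(key, [0, 0])[row[child]] += 1
        # Unobserved parent configurations contribute zero, so they are skipped.
        for n0, n1 in counts.values():
            total += (math.lgamma(2 * alpha) - math.lgamma(2 * alpha + n0 + n1)
                      + math.lgamma(alpha + n0) + math.lgamma(alpha + n1)
                      - 2 * math.lgamma(alpha))
    return total


def log_posterior_score(edges, variables, data, edge_prob):
    """Unnormalized log posterior: log prior plus log marginal likelihood."""
    return (log_prior(edges, variables, edge_prob)
            + log_marginal_likelihood(edges, variables, data))


# Toy example: location evidence favors TF -> GENE, and the two
# variables are perfectly correlated in the expression data.
variables = ["TF", "GENE"]
edge_prob = {("TF", "GENE"): 0.9}
data = [{"TF": 1, "GENE": 1}] * 4 + [{"TF": 0, "GENE": 0}] * 4

with_edge = log_posterior_score({("TF", "GENE")}, variables, data, edge_prob)
without_edge = log_posterior_score(set(), variables, data, edge_prob)
print(with_edge > without_edge)
```

In this toy setting both the prior and the likelihood favor the structure containing the edge, so it receives the higher score. In a search procedure, a score of this kind would serve as the objective that candidate structures are compared on.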
Related Subject Headings
- Software
- Regulatory Sequences, Nucleic Acid
- Neural Networks, Computer
- Multivariate Analysis
- Models, Genetic
- GTP-Binding Proteins
- Chromosome Mapping
- Bayes Theorem