Dissecting high-dimensional phenotypes with Bayesian sparse factor analysis of genetic covariance matrices.


Journal Article

Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression, where the number of traits assayed per individual can reach many thousands. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse, affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
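
The structural assumption in the abstract (a few sparse factors driving many traits) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model or sampler: it assumes the common factor-analytic form G = ΛΛᵀ + Ψ, where Λ is a sparse loadings matrix and Ψ is a diagonal matrix of trait-specific genetic variances. The function name, the trait and factor counts, and the variance ranges are all hypothetical choices made for illustration, and the code only builds such a G-matrix forward; it does not perform any Bayesian estimation.

```python
import numpy as np


def simulate_sparse_factor_g(n_traits=50, n_factors=3, traits_per_factor=5, seed=0):
    """Build a G-matrix of the assumed form G = L L^T + Psi with sparse loadings L.

    All names and default values here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Loadings matrix: one column per latent factor, mostly zeros, so that
    # each factor affects only a few of the observed traits.
    loadings = np.zeros((n_traits, n_factors))
    for k in range(n_factors):
        affected = rng.choice(n_traits, size=traits_per_factor, replace=False)
        loadings[affected, k] = rng.normal(0.0, 1.0, size=traits_per_factor)

    # Trait-specific genetic variances not explained by any factor (diagonal Psi).
    specific_var = rng.uniform(0.05, 0.2, size=n_traits)

    # Low-rank part from the factors plus the diagonal part.
    g_matrix = loadings @ loadings.T + np.diag(specific_var)
    return loadings, g_matrix


loadings, g_matrix = simulate_sparse_factor_g()
```

With these hypothetical settings, a 50 x 50 covariance matrix (1,275 distinct entries) is described by at most 15 nonzero loadings plus 50 trait-specific variances, which gives a sense of why a sparsity assumption can limit the influence of sampling error when the number of traits is large.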

Cited Authors

  • Runcie, DE; Mukherjee, S

Published Date

  • July 2013

Published In

  • Genetics

Volume / Issue

  • 194 / 3

Start / End Page

  • 753 - 767

PubMed ID (PMID)

  • 23636737

Electronic International Standard Serial Number (EISSN)

  • 1943-2631

International Standard Serial Number (ISSN)

  • 0016-6731

Digital Object Identifier (DOI)

  • 10.1534/genetics.113.151217


Language

  • English