A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data.

Publication, Journal Article
Lea, AJ; Tung, J; Zhou, X
Published in: PLoS genetics
November 2015

Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and by the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data and genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html.
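
To make the kind of data the abstract describes concrete, the following is a minimal simulation sketch, not MACAU's implementation or interface. It assumes a logit link and a variance split between a relatedness-structured random effect and independent noise. The function name `simulate_site`, its parameters (`beta`, `sigma2`, `h2`, `intercept`), and the toy relatedness matrix `K` are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_site(reads, x, K, beta=0.5, sigma2=1.0, h2=0.5, intercept=0.0, seed=0):
    """Simulate methylated read counts at one site for n individuals.

    Assumed (illustrative) model: counts ~ Binomial(reads, pi), with
    logit(pi) = intercept + beta * x + g + e, where g follows the
    relatedness matrix K and e is independent noise.
    """
    rng = np.random.default_rng(seed)
    n = len(reads)
    # Random effect correlated across individuals according to K.
    g = rng.multivariate_normal(np.zeros(n), sigma2 * h2 * K)
    # Independent per-individual noise, the source of over-dispersion.
    e = rng.normal(0.0, np.sqrt(sigma2 * (1.0 - h2)), size=n)
    # Logit-scale linear predictor mapped to a methylation proportion.
    eta = intercept + beta * x + g + e
    pi = 1.0 / (1.0 + np.exp(-eta))
    # Methylated counts given each individual's total read depth.
    return rng.binomial(reads, pi)

# Toy inputs: uneven coverage, a binary predictor, two related pairs.
reads = np.array([5, 40, 12, 3, 25, 60])
x = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
K = np.eye(6)
K[0, 1] = K[1, 0] = 0.5
K[3, 4] = K[4, 3] = 0.5

counts = simulate_site(reads, x, K)
```

Keeping the total read depth as an explicit input, rather than converting counts to proportions first, preserves the information that a site covered by 3 reads is far less certain than one covered by 60.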

Published In

PLoS genetics

DOI

10.1371/journal.pgen.1005650

EISSN

1553-7404

ISSN

1553-7390

Publication Date

November 2015

Volume

11

Issue

11

Start / End Page

e1005650

Related Subject Headings

  • Software
  • Sequence Analysis, DNA
  • Humans
  • High-Throughput Nucleotide Sequencing
  • Developmental Biology
  • DNA Methylation
  • CpG Islands
  • Algorithms
  • 3105 Genetics
  • 0604 Genetics
 

Citation

APA
Lea, A. J., Tung, J., & Zhou, X. (2015). A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data. PLoS Genetics, 11(11), e1005650. https://doi.org/10.1371/journal.pgen.1005650
Chicago
Lea, Amanda J., Jenny Tung, and Xiang Zhou. “A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data.” PLoS Genetics 11, no. 11 (November 2015): e1005650. https://doi.org/10.1371/journal.pgen.1005650.
MLA
Lea, Amanda J., et al. “A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data.” PLoS Genetics, vol. 11, no. 11, Nov. 2015, p. e1005650. Epmc, doi:10.1371/journal.pgen.1005650.
