A Bayesian Hierarchical Model to Estimate DNA Methylation Conservation in Colorectal Tumors.

Journal Article

MOTIVATION: Conservation is broadly used to identify biologically important (epi)genomic regions. In the case of tumor growth, preferential conservation of DNA methylation can be used to identify areas of particular functional importance to the tumor. However, reliable assessment of methylation conservation based on multiple tissue samples per patient requires decomposing methylation variation at multiple levels.

RESULTS: We developed a Bayesian hierarchical model that decomposes methylation variance on three levels: between-patient normal tissue variation, between-patient tumor-effect variation, and within-patient tumor variation. We then defined a model-based conservation score to identify loci with reduced within-tumor methylation variation relative to between-patient variation. We fit the model to multi-sample methylation array data from 21 colorectal cancer (CRC) patients using a Markov chain Monte Carlo algorithm (Stan). Sets of genes implicated in CRC tumorigenesis exhibited preferential conservation, demonstrating the model's ability to identify functionally relevant genes based on methylation conservation. A pathway analysis of preferentially conserved genes implicated several CRC-relevant pathways as well as pathways related to neoantigen presentation and immune evasion.

CONCLUSIONS: Our findings suggest that preferential methylation conservation may be used to identify novel gene targets that are not consistently mutated in CRC. The flexible structure makes the model amenable to the analysis of more complex multi-sample data structures.

AVAILABILITY: The data underlying this article are available in the NCBI GEO Database, under accession code GSE166212. The R analysis code is available at https://github.com/kevin-murgas/DNAmethylation-hierarchicalmodel.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
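
The three-level variance decomposition described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' Stan model or their R code. It assumes, for a single locus, additive Gaussian effects for each level, a noise-free normal-tissue measurement, and simple moment estimates in place of Bayesian inference. The abstract does not give the formula for the conservation score, so the log standard-deviation ratio used here is a hypothetical stand-in. All function and parameter names are illustrative.

```python
import math
import random
import statistics


def simulate_locus(n_patients, n_tumor_samples, sigma_p, sigma_t, sigma_w,
                   mu=0.0, seed=0):
    """Simulate one locus under an assumed additive three-level model.

    sigma_p: between-patient SD of normal-tissue methylation
    sigma_t: between-patient SD of the tumor effect
    sigma_w: within-patient SD across tumor samples
    Returns a list of (normal_value, [tumor_values]) pairs, one per patient.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_patients):
        a = rng.gauss(0.0, sigma_p)  # patient-level normal-tissue offset
        b = rng.gauss(0.0, sigma_t)  # patient-level tumor effect
        normal = mu + a
        tumors = [mu + a + b + rng.gauss(0.0, sigma_w)
                  for _ in range(n_tumor_samples)]
        data.append((normal, tumors))
    return data


def moment_estimates(data):
    """Moment estimates of the three variance components (not Bayesian)."""
    m = len(data[0][1])  # tumor samples per patient (assumed equal)
    # Average of per-patient sample variances estimates sigma_w^2.
    within_var = statistics.mean(statistics.variance(t) for _, t in data)
    # Variance of normal-tissue values across patients estimates sigma_p^2.
    normal_var = statistics.variance(n for n, _ in data)
    # (mean tumor - normal) has variance sigma_t^2 + sigma_w^2 / m.
    effects = [statistics.mean(t) - n for n, t in data]
    effect_var = max(statistics.variance(effects) - within_var / m, 0.0)
    return normal_var, effect_var, within_var


def conservation_score(between_var, within_var):
    """Hypothetical score: log ratio of between-patient to within-tumor SD.

    Larger values mean less within-tumor variation relative to
    between-patient variation. This is an assumed definition.
    """
    if within_var <= 0.0:
        return float("inf")
    if between_var <= 0.0:
        return float("-inf")
    return 0.5 * math.log(between_var / within_var)


if __name__ == "__main__":
    # 21 patients as in the study; the other values are arbitrary.
    locus = simulate_locus(21, 4, sigma_p=1.0, sigma_t=2.0, sigma_w=0.5)
    normal_var, effect_var, within_var = moment_estimates(locus)
    print(conservation_score(effect_var, within_var))
```

With only 21 patients the moment estimates are noisy, which is one reason a hierarchical Bayesian fit with priors on the variance components, as the abstract describes, may be preferable in practice.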

Cited Authors

  • Murgas, KA; Ma, Y; Shahidi, LK; Mukherjee, S; Allen, AS; Shibata, D; Ryser, MD

Published Date

  • September 6, 2021

Published In

  • Bioinformatics

PubMed ID

  • 34487148

Electronic International Standard Serial Number (EISSN)

  • 1367-4811

Digital Object Identifier (DOI)

  • 10.1093/bioinformatics/btab637

Language

  • eng

Conference Location

  • England