RoboCOP: jointly computing chromatin occupancy profiles for numerous factors from chromatin accessibility data.

Journal Article

Chromatin is a tightly packaged structure of DNA and protein within the nucleus of a cell. The arrangement of different protein complexes along the DNA modulates and is modulated by gene expression. Measuring the binding locations and occupancy levels of different transcription factors (TFs) and nucleosomes is therefore crucial to understanding gene regulation. Antibody-based methods for assaying chromatin occupancy are capable of identifying the binding sites of specific DNA binding factors, but only one factor at a time. In contrast, epigenomic accessibility data like MNase-seq, DNase-seq, and ATAC-seq provide insight into the chromatin landscape of all factors bound along the genome, but with little insight into the identities of those factors. Here, we present RoboCOP, a multivariate state space model that integrates chromatin accessibility data with nucleotide sequence to jointly compute genome-wide probabilistic scores of nucleosome and TF occupancy, for hundreds of different factors. We apply RoboCOP to MNase-seq and ATAC-seq data to elucidate the protein-binding landscape of nucleosomes and 150 TFs across the yeast genome, and show that our model makes better predictions than existing methods. We also compute a chromatin occupancy profile of the yeast genome under cadmium stress, revealing chromatin dynamics associated with transcriptional regulation.

Cited Authors

  • Mitra, S; Zhong, J; Tran, TQ; MacAlpine, DM; Hartemink, AJ

Published Date

  • August 2021

Published In

Volume / Issue

  • 49 / 14

Start / End Page

  • 7925 - 7938

PubMed ID

  • 34255854

PubMed Central ID

  • PMC8373080

Electronic International Standard Serial Number (EISSN)

  • 1362-4962

International Standard Serial Number (ISSN)

  • 0305-1048

Digital Object Identifier (DOI)

  • 10.1093/nar/gkab553

Language

  • eng