Comprehensive evaluation of diverse massively parallel reporter assays to functionally characterize human enhancers genome-wide.
BACKGROUND: Massively parallel reporter assays (MPRAs) and self-transcribing active regulatory region sequencing (STARR-seq) have revolutionized enhancer characterization by enabling high-throughput functional assessment of regulatory sequences.
RESULTS: Here, we systematically evaluate six MPRA and STARR-seq datasets generated in the human K562 cell line and find substantial inconsistencies in enhancer calls from different labs that are primarily due to technical variations in data processing and experimental workflows. To address these variations, we implement a uniform enhancer-calling pipeline, which significantly improves cross-assay agreement. While increasing sequence overlap thresholds enhances concordance among STARR-seq assays, cross-assay consistency in LentiMPRA is strongly influenced by assay-specific factors. Functional validation using candidate cis-regulatory elements (cCREs) confirms that epigenomic features such as chromatin accessibility and histone modifications are strong predictors of enhancer activity. Importantly, our study validates transcription as a critical hallmark of active enhancers, demonstrating that highly transcribed regions exhibit significantly higher active rates across assays. Furthermore, we show that transcription enhances the predictive power of epigenomic features, enabling more accurate and refined enhancer annotation.
CONCLUSIONS: Our study provides a comprehensive framework for integrating different enhancer datasets and underscores the importance of accounting for assay-specific biases when interpreting enhancer activity. These findings refine enhancer identification using massively parallel reporter assays and improve the functional annotation of the human genome.
Related Subject Headings
- K562 Cells
- Humans
- High-Throughput Nucleotide Sequencing
- Genome, Human
- Genes, Reporter
- Epigenomics
- Enhancer Elements, Genetic
- Bioinformatics
- 08 Information and Computing Sciences
- 06 Biological Sciences