Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites.

Published

Journal Article

Analysis of the human genome sequence has identified approximately 25,000-30,000 protein-coding genes, but little is known about how most of these are regulated. Mapping DNase I hypersensitive (HS) sites has traditionally represented the gold-standard experimental method for identifying regulatory elements, but the labor-intensive nature of this technique has limited its application to only a small number of human genes. We have developed a protocol to generate a genome-wide library of gene regulatory sequences by cloning DNase HS sites. We generated a library of DNase HS sites from quiescent primary human CD4+ T cells and analyzed approximately 5,600 of the resulting clones. Compared to sequences from randomly generated in silico libraries, sequences from these clones were found to map more frequently to regions of the genome known to contain regulatory elements, such as regions upstream of genes, within CpG islands, and in sequences that align between mouse and human. These cloned sites also tend to map near genes that have detectable transcripts in CD4+ T cells, demonstrating that transcriptionally active regions of the genome are being selected. Validation of putative regulatory elements was achieved by repeated recovery of the same sequence and real-time PCR. This cloning strategy, which can be scaled up and applied to any cell line or tissue, will be useful in identifying regulatory elements controlling global expression differences that delineate tissue types, stages of development, and disease susceptibility.
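The comparison against randomly generated in silico libraries can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the toy coordinates, the single-chromosome genome, and the uniform random sampling are all assumptions made for illustration. It computes the fraction of clone positions that fall inside annotated regions (for example, CpG islands) and compares that fraction with random libraries of the same size.

```python
import bisect
import random


def fraction_in_regions(positions, regions):
    """Return the fraction of positions inside any [start, end) region.

    `regions` must be sorted by start and non-overlapping.
    """
    starts = [start for start, _ in regions]
    hits = 0
    for pos in positions:
        # Index of the last region whose start is <= pos, or -1 if none.
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits / len(positions)


def empirical_enrichment(observed, regions, genome_length,
                         n_libraries=1000, seed=0):
    """Compare observed positions with uniformly random libraries.

    Returns (observed fraction, mean random fraction, empirical p-value).
    Uniform sampling over one linear genome is a simplifying assumption.
    """
    rng = random.Random(seed)
    obs_frac = fraction_in_regions(observed, regions)
    null_fracs = []
    for _ in range(n_libraries):
        # One in silico library: same number of sites, random positions.
        library = [rng.randrange(genome_length) for _ in observed]
        null_fracs.append(fraction_in_regions(library, regions))
    at_least = sum(1 for frac in null_fracs if frac >= obs_frac)
    # Add-one correction keeps the empirical p-value above zero.
    p_value = (at_least + 1) / (n_libraries + 1)
    mean_null = sum(null_fracs) / n_libraries
    return obs_frac, mean_null, p_value


# Toy data (hypothetical): three annotated regions on a 1,000-bp genome.
regions = [(100, 200), (500, 600), (900, 1000)]
observed = [120, 150, 510, 550, 950, 300, 990, 130]

obs_frac, mean_null, p_value = empirical_enrichment(
    observed, regions, genome_length=1000, n_libraries=500, seed=1
)
print(f"observed fraction: {obs_frac:.3f}")
print(f"mean random fraction: {mean_null:.3f}")
print(f"empirical p-value: {p_value:.4f}")
```

In this toy setup the annotated regions cover 30% of the genome, so random libraries land in them about 30% of the time, while seven of the eight observed positions do. A real analysis would work per chromosome, sample random sequences in a way that respects mappability, and repeat the comparison for each annotation class.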

Cited Authors

  • Crawford, GE; Holt, IE; Mullikin, JC; Tai, D; Blakesley, R; Bouffard, G; Young, A; Masiello, C; Green, ED; Wolfsberg, TG; Collins, FS; National Institutes of Health Intramural Sequencing Center

Published Date

  • January 19, 2004

Published In

  • Proceedings of the National Academy of Sciences of the United States of America

Volume / Issue

  • 101 / 4

Start / End Page

  • 992 - 997

PubMed ID

  • 14732688

PubMed Central ID

  • 14732688

Electronic International Standard Serial Number (EISSN)

  • 1091-6490

International Standard Serial Number (ISSN)

  • 0027-8424

Digital Object Identifier (DOI)

  • 10.1073/pnas.0307540100

Language

  • eng