Sequence features of DNA binding sites reveal structural class of associated transcription factor.

Publication, Journal Article
Narlikar, L; Hartemink, AJ
Published in: Bioinformatics (Oxford, England)
January 2006

A key goal in molecular biology is to understand the mechanisms by which a cell regulates the transcription of its genes. One important aspect of this transcriptional regulation is the binding of transcription factors (TFs) to their specific cis-regulatory counterparts on the DNA. TFs recognize and bind their DNA counterparts according to the structure of their DNA-binding domains (e.g. zinc finger, leucine zipper, homeodomain). The structure of these domains can be used as a basis for grouping TFs into classes. Although the structure of DNA-binding domains varies widely across TFs generally, the TFs within a particular class bind to DNA in a similar fashion, suggesting the existence of class-specific features in the DNA sequences bound by each class of TFs.

In this paper, we apply a sparse Bayesian learning algorithm to identify a small set of class-specific features in the DNA sequences bound by different classes of TFs; the algorithm simultaneously learns a true multi-class classifier that uses these features to predict the DNA-binding domain of the TF that recognizes a particular set of DNA sequences. We train our algorithm on the six largest classes in TRANSFAC, comprising a total of 587 TFs. We learn a six-class classifier for this training set that achieves 87% leave-one-out cross-validation accuracy. We also identify features within cis-regulatory sequences that are highly specific to each class of TF, which has significant implications for how TF binding sites should be modeled for the purpose of motif discovery.

Published In

Bioinformatics (Oxford, England)

DOI

10.1093/bioinformatics/bti731

EISSN

1367-4811

ISSN

1367-4803

Publication Date

January 2006

Volume

22

Issue

2

Start / End Page

157 / 163

Related Subject Headings

  • Transcription Factors
  • Structure-Activity Relationship
  • Sequence Homology, Nucleic Acid
  • Sequence Analysis, DNA
  • Sequence Alignment
  • Protein Binding
  • Pattern Recognition, Automated
  • Molecular Sequence Data
  • DNA
  • Conserved Sequence
 

Citation

APA: Narlikar, L., & Hartemink, A. J. (2006). Sequence features of DNA binding sites reveal structural class of associated transcription factor. Bioinformatics (Oxford, England), 22(2), 157–163. https://doi.org/10.1093/bioinformatics/bti731

Chicago: Narlikar, Leelavati, and Alexander J. Hartemink. “Sequence features of DNA binding sites reveal structural class of associated transcription factor.” Bioinformatics (Oxford, England) 22, no. 2 (January 2006): 157–63. https://doi.org/10.1093/bioinformatics/bti731.

ICMJE: Narlikar L, Hartemink AJ. Sequence features of DNA binding sites reveal structural class of associated transcription factor. Bioinformatics (Oxford, England). 2006 Jan;22(2):157–63.

MLA: Narlikar, Leelavati, and Alexander J. Hartemink. “Sequence features of DNA binding sites reveal structural class of associated transcription factor.” Bioinformatics (Oxford, England), vol. 22, no. 2, Jan. 2006, pp. 157–63. Epmc, doi:10.1093/bioinformatics/bti731.

NLM: Narlikar L, Hartemink AJ. Sequence features of DNA binding sites reveal structural class of associated transcription factor. Bioinformatics (Oxford, England). 2006 Jan;22(2):157–163.
