Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABAA receptor subunit genes.

Published

Journal Article

Understanding transcription factor (TF)-mediated control of gene expression remains a major challenge at the interface of computational and experimental biology. Computational techniques for predicting TF-binding site (TFBS) specificity are frequently unreliable, while comprehensive experimental validation is difficult and time-consuming. We introduce a simple strategy that dramatically improves the robustness and accuracy of computational binding site prediction. First, we evaluate the rate of recurrence of computational TFBS predictions made by commonly used sampling procedures. We find that the vast majority of results are biologically meaningless. However, clustering results based on nucleotide position improves predictive power. Additionally, we find that positional clustering increases robustness to long or imperfectly selected input sequences. Positional clustering can also be used to integrate results from multiple sampling approaches, improving accuracy over any single approach alone. Finally, we predict and validate regulatory sequences partially responsible for transcriptional control of the mammalian type A gamma-aminobutyric acid receptor (GABA(A)R) subunit genes. Positional clustering is useful for improving computational binding site predictions, with potential application to improving our understanding of mammalian gene expression. In particular, predicted regulatory mechanisms in the mammalian GABA(A)R subunit gene family may open new avenues of research towards understanding this pharmacologically important neurotransmitter receptor system.
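
The idea of clustering predictions by nucleotide position can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering rule, so the code below assumes a simple merge of overlapping intervals and treats the number of distinct sampling runs supporting a cluster as its recurrence. The names `Site`, `Cluster`, `cluster_by_position`, `recurrent_clusters`, and the parameters `min_overlap` and `min_runs` are all hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Site(NamedTuple):
    run: int    # index of the sampling run (or tool) that made the prediction
    start: int  # 0-based start position in the input sequence
    end: int    # exclusive end position


class Cluster(NamedTuple):
    start: int
    end: int
    runs: FrozenSet[int]  # distinct runs whose predictions fall in this cluster


def cluster_by_position(sites: Iterable[Site], min_overlap: int = 1) -> List[Cluster]:
    """Merge predictions that overlap the growing cluster by >= min_overlap bases.

    Assumption: overlap-based merging stands in for whatever positional
    clustering rule the paper actually uses.
    """
    clusters: List[Cluster] = []
    cur_start = cur_end = None
    cur_runs: set = set()
    for s in sorted(sites, key=lambda s: (s.start, s.end)):
        if cur_start is not None and cur_end - s.start >= min_overlap:
            # Prediction overlaps the current cluster: extend it.
            cur_end = max(cur_end, s.end)
            cur_runs.add(s.run)
        else:
            # No sufficient overlap: close the current cluster, start a new one.
            if cur_start is not None:
                clusters.append(Cluster(cur_start, cur_end, frozenset(cur_runs)))
            cur_start, cur_end, cur_runs = s.start, s.end, {s.run}
    if cur_start is not None:
        clusters.append(Cluster(cur_start, cur_end, frozenset(cur_runs)))
    return clusters


def recurrent_clusters(sites: Iterable[Site], min_runs: int) -> List[Cluster]:
    """Keep only clusters supported by at least min_runs distinct runs."""
    return [c for c in cluster_by_position(sites) if len(c.runs) >= min_runs]


# Made-up example: three runs agree near position 10-25; two isolated hits do not recur.
example = [
    Site(0, 10, 20), Site(1, 12, 22), Site(2, 15, 25),
    Site(0, 100, 110), Site(1, 300, 310),
]
kept = recurrent_clusters(example, min_runs=2)
```

Because each prediction carries the identifier of the run or tool that produced it, the same routine could in principle pool output from several different samplers, which is the integration use the abstract mentions.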

Cited Authors

  • Reddy, TE; Shakhnovich, BE; Roberts, DS; Russek, SJ; DeLisi, C

Published Date

  • 2007

Published In

  • Nucleic Acids Research

Volume / Issue

  • 35 / 3

Start / End Page

  • e20

PubMed ID

  • 17204484

Pubmed Central ID

  • 17204484

Electronic International Standard Serial Number (EISSN)

  • 1362-4962

Digital Object Identifier (DOI)

  • 10.1093/nar/gkl1062

Language

  • eng

Conference Location

  • England