Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABAA receptor subunit genes.
Understanding transcription factor (TF)-mediated control of gene expression remains a major challenge at the interface of computational and experimental biology. Computational techniques for predicting TF binding site (TFBS) specificity are frequently unreliable, while comprehensive experimental validation is difficult and time-consuming. We introduce a simple strategy that dramatically improves the robustness and accuracy of computational binding site prediction. First, we evaluate the rate of recurrence of computational TFBS predictions made by commonly used sampling procedures. We find that the vast majority of results are biologically meaningless. However, clustering results based on nucleotide position improves predictive power. Additionally, we find that positional clustering increases robustness to long or imperfectly selected input sequences. Positional clustering can also be used to integrate results from multiple sampling approaches, improving accuracy over any single approach alone. Finally, we predict and validate regulatory sequences partially responsible for transcriptional control of the mammalian type A gamma-aminobutyric acid receptor (GABA(A)R) subunit genes. Positional clustering is useful for improving computational binding site predictions, with potential application to improving our understanding of mammalian gene expression. In particular, predicted regulatory mechanisms in the mammalian GABA(A)R subunit gene family may open new avenues of research towards understanding this pharmacologically important neurotransmitter receptor system.
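The general idea of clustering predictions by nucleotide position can be illustrated with a minimal Python sketch. This is not the authors' implementation: the input tuple layout `(seq_id, start, end, run_id)`, the function name `cluster_by_position`, and the `max_gap` and `min_support` parameters are all assumptions made for illustration. Predicted sites from repeated sampler runs (or from different samplers) are merged when they overlap on the same sequence, and each cluster is scored by the number of distinct runs that support it.

```python
from collections import defaultdict


def cluster_by_position(predictions, max_gap=0, min_support=1):
    """Merge overlapping site predictions and count supporting runs.

    predictions: iterable of (seq_id, start, end, run_id) tuples with
        half-open coordinates [start, end). This layout is hypothetical.
    max_gap: sites also merge when separated by fewer than max_gap bases;
        with the default of 0 they must overlap by at least one base.
    min_support: drop clusters backed by fewer distinct runs than this.
    """
    by_seq = defaultdict(list)
    for seq_id, start, end, run_id in predictions:
        by_seq[seq_id].append((start, end, run_id))

    clusters = []
    for seq_id, sites in by_seq.items():
        sites.sort()  # sweep left to right along the sequence
        cur_start = cur_end = None
        runs = set()
        for start, end, run_id in sites:
            if cur_start is not None and start < cur_end + max_gap:
                # Site falls inside the open cluster: extend it.
                cur_end = max(cur_end, end)
                runs.add(run_id)
            else:
                if cur_start is not None:
                    clusters.append({"seq_id": seq_id, "start": cur_start,
                                     "end": cur_end, "support": len(runs)})
                cur_start, cur_end, runs = start, end, {run_id}
        if cur_start is not None:
            clusters.append({"seq_id": seq_id, "start": cur_start,
                             "end": cur_end, "support": len(runs)})

    # Keep well-supported clusters, most recurrent first.
    kept = [c for c in clusters if c["support"] >= min_support]
    kept.sort(key=lambda c: (-c["support"], c["seq_id"], c["start"]))
    return kept


# Toy predictions (invented values, not data from the paper).
example = [
    ("geneA", 100, 110, "run1"),
    ("geneA", 103, 113, "run2"),
    ("geneA", 105, 115, "run3"),
    ("geneA", 400, 410, "run1"),
    ("geneB", 50, 60, "run2"),
    ("geneB", 52, 62, "run2"),
]
recurrent = cluster_by_position(example, min_support=2)
```

In this toy input, only the region predicted by three separate runs survives the `min_support=2` filter; the isolated hit and the region reported twice by a single run are discarded. Tagging `run_id` with the name of the sampler instead of the run number would give a simple way to combine several tools under the same scheme.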
Reddy, TE; Shakhnovich, BE; Roberts, DS; Russek, SJ; DeLisi, C