Optimized mixed Markov models for motif identification.

Publication, Journal Article
Huang, W; Umbach, DM; Ohler, U; Li, L
Published in: BMC Bioinformatics
June 2, 2006

BACKGROUND: Identifying functional elements, such as transcription factor binding sites, is a fundamental step in reconstructing gene regulatory networks and remains a challenging issue, largely due to the limited availability of training samples.

RESULTS: We introduce a novel and flexible model, the Optimized Mixture Markov model (OMiMa), and related methods to allow adjustment of model complexity for different motifs. In comparison with other leading methods, OMiMa can incorporate more than NNSplice's pairwise dependencies; OMiMa avoids model over-fitting better than the Permuted Variable Length Markov Model (PVLMM); and OMiMa requires smaller training samples than the Maximum Entropy Model (MEM). Testing on both simulated and actual data (regulatory cis-elements and splice sites), we found OMiMa's performance superior to the other leading methods in terms of prediction accuracy, required size of training data, or computational time. Our OMiMa system, to our knowledge, is the only motif-finding tool that incorporates automatic selection of the best model. OMiMa is freely available at [1].

CONCLUSION: Our optimized mixture of Markov models represents an alternative to the existing methods for modeling dependent structures within a biological motif. Our model is conceptually simple and effective, and can improve prediction accuracy and/or computational speed over other leading methods.

Published In

BMC Bioinformatics

DOI

10.1186/1471-2105-7-279

EISSN

1471-2105

Publication Date

June 2, 2006

Volume

7

Start / End Page

279

Location

England

Related Subject Headings

  • Software
  • Sequence Analysis, Protein
  • ROC Curve
  • Proteomics
  • Proteins
  • Protein Binding
  • Programming Languages
  • Pattern Recognition, Automated
  • Models, Theoretical
  • Models, Statistical
 

Citation

APA: Huang, W., Umbach, D. M., Ohler, U., & Li, L. (2006). Optimized mixed Markov models for motif identification. BMC Bioinformatics, 7, 279. https://doi.org/10.1186/1471-2105-7-279
Chicago: Huang, Weichun, David M. Umbach, Uwe Ohler, and Leping Li. “Optimized mixed Markov models for motif identification.” BMC Bioinformatics 7 (June 2, 2006): 279. https://doi.org/10.1186/1471-2105-7-279.
ICMJE: Huang W, Umbach DM, Ohler U, Li L. Optimized mixed Markov models for motif identification. BMC Bioinformatics. 2006 Jun 2;7:279.
MLA: Huang, Weichun, et al. “Optimized mixed Markov models for motif identification.” BMC Bioinformatics, vol. 7, June 2006, p. 279. PubMed, doi:10.1186/1471-2105-7-279.
NLM: Huang W, Umbach DM, Ohler U, Li L. Optimized mixed Markov models for motif identification. BMC Bioinformatics. 2006 Jun 2;7:279.