SVM-based prediction of linear B-cell epitopes using Bayes Feature Extraction.

Journal Article

Background

The identification of B-cell epitopes on antigens has been a subject of intense research, as knowledge of these markers has great implications for the development of peptide-based diagnostics, therapeutics and vaccines. As experimental approaches are often laborious and time-consuming, in silico methods for predicting these immunogenic regions are critical. Such efforts, however, have been significantly hindered by high variability in the length and composition of epitope sequences, which makes naïve modeling methods difficult to apply.

Results

We analyzed two benchmark datasets and found that linear B-cell epitopes possess distinctive residue conservation and position-specific residue propensities, which could be exploited for epitope discrimination in silico. We developed a support vector machine (SVM) prediction model employing Bayes Feature Extraction to predict linear B-cell epitopes of diverse lengths (12- to 20-mers). The best SVM classifier achieved an accuracy of 74.50% and an area under the ROC curve (AROC) of 0.84 on an independent test set and was shown to outperform existing linear B-cell epitope prediction algorithms. In addition, we applied our model to a dataset of antigenic proteins with experimentally verified epitopes and found it to be generally effective for discriminating epitopes from non-epitopes.
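To make the idea of position-specific residue propensities concrete, the following is a minimal sketch of one way such features could be computed: a smoothed log-odds score for each residue at each position, estimated from epitope and non-epitope peptides of equal length. This is an illustration under stated assumptions, not the authors' Bayes Feature Extraction procedure, whose details are not given in the abstract. The function names, the pseudocount, and the toy 4-mer peptides are all hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids; other symbols (e.g. "X") are not handled here.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_counts(peptides, length):
    """Count how often each residue occurs at each position."""
    counts = [Counter() for _ in range(length)]
    for pep in peptides:
        if len(pep) != length:
            raise ValueError("all peptides must have the same length")
        for i, aa in enumerate(pep):
            counts[i][aa] += 1
    return counts


def log_odds_table(epitopes, non_epitopes, length, pseudocount=1.0):
    """Per-position log-odds of each residue: epitope vs. non-epitope.

    Laplace smoothing (pseudocount) avoids division by zero for residues
    never observed at a position. The pseudocount value is an assumption.
    """
    pos = position_counts(epitopes, length)
    neg = position_counts(non_epitopes, length)
    n_pos = len(epitopes) + pseudocount * len(AMINO_ACIDS)
    n_neg = len(non_epitopes) + pseudocount * len(AMINO_ACIDS)
    table = []
    for i in range(length):
        row = {}
        for aa in AMINO_ACIDS:
            p = (pos[i][aa] + pseudocount) / n_pos
            q = (neg[i][aa] + pseudocount) / n_neg
            row[aa] = math.log(p / q)
        table.append(row)
    return table


def encode(peptide, table):
    """Map a peptide to a numeric vector, one log-odds value per position."""
    return [table[i][aa] for i, aa in enumerate(peptide)]


# Toy, made-up 4-mers purely for illustration (the paper uses 12- to 20-mers).
toy_epitopes = ["ACDE", "ACDF", "ACGE"]
toy_non_epitopes = ["GGGG", "GHIK", "LMNP"]
table = log_odds_table(toy_epitopes, toy_non_epitopes, length=4)
features = encode("ACDE", table)
```

Vectors produced this way are fixed-length and numeric, so they could in principle be passed to an SVM classifier. Peptides of different lengths would each need their own table under this sketch.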

Conclusion

We developed an SVM prediction model utilizing Bayes Feature Extraction and showed that it was effective in discriminating epitopes from non-epitopes in benchmark datasets and annotated antigenic proteins. A web server for predicting linear B-cell epitopes was developed and is available, together with supplementary materials, at http://www.immunopred.org/bayesb/index.html.

Cited Authors

  • Wee, LJK; Simarmata, D; Kam, Y-W; Ng, LFP; Tong, JC

Published Date

  • December 2010

Published In

  • BMC Genomics

Volume / Issue

  • 11 / Suppl 4

Start / End Page

  • S21

PubMed ID

  • 21143805

Pubmed Central ID

  • PMC3005920

Electronic International Standard Serial Number (EISSN)

  • 1471-2164

International Standard Serial Number (ISSN)

  • 1471-2164

Digital Object Identifier (DOI)

  • 10.1186/1471-2164-11-s4-s21

Language

  • eng