Boosting evolutionary support vector machine for designing tumor classifiers from microarray data
Because multiple sets of relevant genes can fit the training data with the same high accuracy, a phenomenon called model uncertainty, identifying a small set of informative genes from microarray data in order to design an accurate tumor classifier for unknown samples is intractable. The support vector machine (SVM), a supervised machine learning technique, is one of the methods that has been successfully applied to cancer diagnosis problems. This study proposes an SVM-based classifier with automatic feature selection combined with a boosting strategy. The proposed boosting evolutionary support vector machine (BESVM) combines the advantages of SVM, boosting with a majority-voting ensemble, and an intelligent genetic algorithm for gene selection. The merits of the BESVM-based classifier are threefold: 1) a small set of used genes, 2) accurate test classification using leave-one-out cross-validation, and 3) robust performance achieved by avoiding overfitting to the training data. Five benchmark datasets were used to evaluate the BESVM-based classifier. Simulation results show that BESVM performs well compared with existing SVM-based and non-SVM-based classifiers, achieving a mean accuracy of 94.26% using only 10.1 genes on average. © 2007 IEEE.
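The combination of a majority-voting ensemble with leave-one-out cross-validation can be illustrated with a minimal sketch. This is not the BESVM implementation. To keep the example dependency-free, a nearest-centroid rule stands in for the SVM base learner, and the gene subsets are fixed by hand rather than chosen by a genetic algorithm. The names `nearest_centroid_predict`, `majority_vote`, `loocv_accuracy`, and `GENE_SUBSETS`, as well as the toy expression values, are hypothetical and do not come from the study.

```python
from collections import Counter


def nearest_centroid_predict(train_X, train_y, x, genes):
    """Stand-in base learner (NOT an SVM): return the class whose
    centroid, computed on the selected genes only, is closest to x."""
    counts = Counter(train_y)
    sums = {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(genes))
        for k, g in enumerate(genes):
            acc[k] += row[g]
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        dist = sum(
            (sums[label][k] / counts[label] - x[g]) ** 2
            for k, g in enumerate(genes)
        )
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def majority_vote(votes):
    """Return the label predicted by the most ensemble members."""
    return Counter(votes).most_common(1)[0][0]


def loocv_accuracy(X, y, gene_subsets):
    """Leave-one-out cross-validation: hold out each sample once, fit every
    ensemble member on the remaining samples, and score the majority vote."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        votes = [
            nearest_centroid_predict(train_X, train_y, X[i], genes)
            for genes in gene_subsets
        ]
        correct += majority_vote(votes) == y[i]
    return correct / len(X)


# Invented toy data: 6 samples x 4 genes (gene 3 is uninformative noise).
X = [
    [5.0, 1.0, 4.8, 0.3],
    [5.2, 0.9, 5.1, 0.9],
    [4.9, 1.2, 4.7, 0.1],
    [1.0, 4.8, 1.2, 0.4],
    [1.1, 5.1, 0.9, 0.8],
    [0.8, 5.0, 1.1, 0.2],
]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

# Hand-picked gene subsets, one per ensemble member (an odd count avoids ties).
GENE_SUBSETS = [[0], [1], [0, 2]]

if __name__ == "__main__":
    print(loocv_accuracy(X, y, GENE_SUBSETS))
```

In this sketch the gene subsets are fixed before cross-validation begins. If the subsets were instead selected from the data, the selection step would have to be repeated inside each leave-one-out fold, because selecting genes on all samples first lets information from the held-out sample influence the classifier.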