An SVM-based high-accurate recognition approach for handwritten numerals by using difference features
Handwritten numeral recognition is an important pattern recognition task. It can be widely used in domains such as bank money recognition, which require a very high recognition rate. The Support Vector Machine (SVM), a state-of-the-art classifier, has been used extensively in this area. Typically, an SVM is trained in batch mode, i.e., all data points are input simultaneously to train the classification boundary. However, some slightly exceptional samples, which account for only a small proportion of the data, are critical to the recognition rate. Training a classifier on all the data may treat such legitimate but slightly exceptional samples as "noise". In this paper, we propose a novel approach to address this problem. The approach uses a two-stage framework based on difference features. In the first stage, a regular SVM is trained on all the training data; in the second stage, only the samples misclassified in the first stage are given special consideration, which can improve overall performance. Because the SVM already performs well, the number of misclassified samples is often small, which makes it difficult to train an accurate SVM on these samples alone. We therefore further propose a multi-way-to-binary approach using difference features. This approach transforms multi-category classification into binary classification and greatly expands the set of training samples. To evaluate the proposed method, experiments are performed on 10,000 handwritten numeral samples extracted from real bank forms. The new algorithm achieves 99.0% accuracy, compared with 98.4% for the traditional SVM.
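The multi-way-to-binary idea can be illustrated with a small sketch. The abstract does not define the difference features, so the code below rests on one plausible reading, stated here as an assumption: each pair of samples is mapped to the element-wise absolute difference of their feature vectors, and the pair is labeled 1 if both samples belong to the same class and 0 otherwise. The function name `pairwise_difference_set` and the toy data are hypothetical and serve only to show how a small multi-class set grows into a larger binary one.

```python
from itertools import combinations


def pairwise_difference_set(samples, labels):
    """Turn a multi-class sample set into a binary set over sample pairs.

    Assumption (not taken from the source): the difference feature of a
    pair is the element-wise absolute difference of the two feature
    vectors, and the binary label is 1 when both samples share a class
    and 0 otherwise.
    """
    pairs, targets = [], []
    # Every unordered pair of samples yields one binary training example.
    for i, j in combinations(range(len(samples)), 2):
        diff = [abs(a - b) for a, b in zip(samples[i], samples[j])]
        pairs.append(diff)
        targets.append(1 if labels[i] == labels[j] else 0)
    return pairs, targets


# Toy data: four 2-D feature vectors from two digit classes (3 and 8).
samples = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.2]]
labels = [3, 3, 8, 8]

pairs, targets = pairwise_difference_set(samples, labels)
print(len(samples), "samples ->", len(pairs), "binary pairs")
```

Under this reading, n samples produce n(n-1)/2 pairs, which is how a small pool of misclassified samples could still yield enough data to train a second-stage binary classifier.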