Mix-Up Augmentation for Oracle Character Recognition with Imbalanced Data Distribution
Oracle bone characters are probably the oldest hieroglyphs in China. Recognizing them is important because they provide key clues for Chinese archaeology and philology. Automatic oracle bone character recognition, however, remains a challenging problem. In particular, owing to the nature of the source material, samples in most available oracle datasets are both scarce and severely imbalanced across classes, which greatly hinders research on automatic recognition. To alleviate this problem, we propose a mix-up strategy that leverages information from both majority and minority classes to augment the samples of minority classes, so that the decision boundaries of minority classes are pushed toward the majority classes. As a result, the training bias resulting from the majority classes is largely reduced. In addition, we train the framework with both the softmax loss and the triplet loss on the augmented samples, which further improves classification accuracy. We conduct extensive evaluations in terms of both total class accuracy and average class accuracy on three benchmark datasets (Oracle-20K, Oracle-AYNU and OBC306). Experimental results show that the proposed method outperforms the compared approaches, attaining a new state of the art in oracle bone character recognition.
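The abstract does not give the mixing formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a standard mix-up blend in which the coefficient `lam` is drawn from a Beta distribution and folded into [0.5, 1] so that the minority image dominates and the synthetic sample keeps the minority label. The function names, `alpha`, and the folding rule are all illustrative assumptions. The two metric helpers reflect the usual reading of the reported measures: total class accuracy as the fraction of all samples classified correctly, and average class accuracy as the mean of per-class accuracies.

```python
import numpy as np


def mix_minority_with_majority(x_min, x_maj, rng, alpha=1.0):
    """Blend a minority-class image with a majority-class image.

    Assumption (not from the paper): lam ~ Beta(alpha, alpha), folded into
    [0.5, 1] so the minority image carries at least half of the weight.
    Returns the mixed image and the coefficient used.
    """
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # fold into [0.5, 1]
    mixed = lam * x_min + (1.0 - lam) * x_maj
    return mixed, lam


def total_class_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def average_class_accuracy(y_true, y_pred):
    """Mean of per-class accuracies, so rare classes count equally."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(per_class))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_min = np.ones((64, 64))   # stand-in for a minority-class character image
    x_maj = np.zeros((64, 64))  # stand-in for a majority-class character image
    mixed, lam = mix_minority_with_majority(x_min, x_maj, rng)
    print(mixed.shape, 0.5 <= lam <= 1.0)

    # A classifier that always predicts the majority class looks good on
    # total accuracy but poor on average class accuracy.
    y_true = [0, 0, 0, 0, 1]
    y_pred = [0, 0, 0, 0, 0]
    print(total_class_accuracy(y_true, y_pred))    # 0.8
    print(average_class_accuracy(y_true, y_pred))  # 0.5
```

The toy labels at the end illustrate why both metrics are reported: on imbalanced data, a model biased toward the majority class can score well overall while failing on minority classes.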
Duke Scholars
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Start / End Page
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences