Scholars@Duke publication: Residual attention-based multi-scale script identification in scene text images

Residual attention-based multi-scale script identification in scene text images

Publication, Journal Article

Ma, M; Wang, QF; Huang, S; Goulermas, Y; Huang, K

Published in: Neurocomputing

January 15, 2021

Script identification is an essential step in the text extraction pipeline for multi-lingual applications. This paper presents an effective approach to identifying scripts in scene text images. Because of complicated backgrounds, varied text styles, and the similarity of characters across languages, script identification remains an unsolved problem. Within the general classification framework for script identification, we investigate two important components: feature extraction and the classification layer. For feature extraction, we use a hierarchical feature fusion block to extract multi-scale features. Furthermore, we adopt an attention mechanism to capture the locally discriminative parts of the feature maps. In the classification layer, we use a fully convolutional classifier to generate channel-level classifications, which are then processed by a global pooling layer to improve classification efficiency. We evaluated the proposed approach on the benchmark datasets RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, and the experimental results show the effectiveness of each elaborately designed component. Finally, we achieve better performance than competitive models, with correct rates of 89.66%, 96.11%, 98.78% and 97.20% on RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, respectively.
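The classification layer described in the abstract (a fully convolutional classifier whose outputs are reduced by global pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: the 1x1 kernel size, the use of average pooling, and all names (`conv1x1`, `global_average_pool`, `classify`) are assumptions chosen for illustration, and the weights are hand-picked rather than learned.

```python
from typing import List

# Assumed layout: features[channel][row][column]
FeatureMap = List[List[List[float]]]


def conv1x1(features: FeatureMap,
            weights: List[List[float]],
            bias: List[float]) -> FeatureMap:
    """Apply a 1x1 convolution, producing one score map per script class.

    weights[k][c] is the weight from input channel c to class k.
    """
    height, width = len(features[0]), len(features[0][0])
    score_maps = []
    for class_weights, b in zip(weights, bias):
        score_maps.append([
            [b + sum(w * features[c][y][x] for c, w in enumerate(class_weights))
             for x in range(width)]
            for y in range(height)
        ])
    return score_maps


def global_average_pool(score_maps: FeatureMap) -> List[float]:
    """Collapse each class score map to a single score by averaging."""
    return [sum(sum(row) for row in m) / (len(m) * len(m[0]))
            for m in score_maps]


def classify(features: FeatureMap,
             weights: List[List[float]],
             bias: List[float]) -> int:
    """Return the index of the class with the highest pooled score."""
    scores = global_average_pool(conv1x1(features, weights, bias))
    return max(range(len(scores)), key=scores.__getitem__)
```

Because the classifier is convolutional and the pooling is global, the same weights apply to feature maps of any height and width, which suits scene text crops of varying aspect ratio.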

Duke Scholars

Author Kaizhu Huang DKU Faculty

Published In

Neurocomputing

DOI

10.1016/j.neucom.2020.09.015

EISSN

1872-8286

ISSN

0925-2312

Publication Date

January 15, 2021

Volume

421

Start / End Page

222 / 233

Related Subject Headings

Artificial Intelligence & Image Processing
52 Psychology
46 Information and computing sciences
40 Engineering
17 Psychology and Cognitive Sciences
09 Engineering
08 Information and Computing Sciences

Citation

APA

Ma, M., Wang, Q. F., Huang, S., Goulermas, Y., & Huang, K. (2021). Residual attention-based multi-scale script identification in scene text images. Neurocomputing, 421, 222–233. https://doi.org/10.1016/j.neucom.2020.09.015

Chicago

Ma, M., Q. F. Wang, S. Huang, Y. Goulermas, and K. Huang. “Residual attention-based multi-scale script identification in scene text images.” Neurocomputing 421 (January 15, 2021): 222–33. https://doi.org/10.1016/j.neucom.2020.09.015.

ICMJE

Ma M, Wang QF, Huang S, Goulermas Y, Huang K. Residual attention-based multi-scale script identification in scene text images. Neurocomputing. 2021 Jan 15;421:222–33.

MLA

Ma, M., et al. “Residual attention-based multi-scale script identification in scene text images.” Neurocomputing, vol. 421, Jan. 2021, pp. 222–33. Scopus, doi:10.1016/j.neucom.2020.09.015.

NLM

Ma M, Wang QF, Huang S, Goulermas Y, Huang K. Residual attention-based multi-scale script identification in scene text images. Neurocomputing. 2021 Jan 15;421:222–233.