Improving image caption performance with linguistic context
Image captioning aims to generate a description of an image using techniques from computer vision and natural language processing, where the framework of a Convolutional Neural Network (CNN) followed by a Recurrent Neural Network (RNN), typically an LSTM, is widely used. In recent years, attention-based CNN-LSTM networks have achieved significant progress owing to their ability to model global context. However, CNN-LSTMs do not model linguistic context explicitly, which is useful for further boosting performance. To overcome this issue, we propose a method that integrates an n-gram model into the attention-based image captioning framework, modelling the word transition probability during decoding to strengthen the linguistic context of the generated captions. We evaluated performance in terms of BLEU and METEOR on the MSCOCO 2014 benchmark dataset. Experimental results show the effectiveness of the proposed method: BLEU-1, BLEU-2, BLEU-3, and BLEU-4 improve by 0.2%, 0.7%, 0.6%, and 0.5%, respectively, and METEOR by 0.1.
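The abstract does not specify how the n-gram scores are combined with the decoder's predictions. A common way to realize such a scheme is log-linear interpolation between the LSTM's next-word distribution and an n-gram transition probability at each decoding step. The sketch below is only an illustration under that assumption: the function `fuse_scores`, the interpolation weight `lam`, and the smoothing floor are all hypothetical, not taken from the paper.

```python
import math

# Illustrative sketch: fuse an attention-based LSTM decoder's next-word
# log-probabilities with bigram transition probabilities, as one plausible
# reading of "modelling the word transition probability in decoding".
# The fusion rule and weight are assumptions, not the paper's formulation.

def fuse_scores(decoder_log_probs, bigram_probs, prev_word, lam=0.3):
    """Interpolate decoder log-probs with bigram log-probs.

    decoder_log_probs: {word: log P_lstm(word | image, history)}
    bigram_probs:      {(prev_word, word): P_ngram(word | prev_word)}
    lam:               n-gram interpolation weight (hypothetical value)
    """
    fused = {}
    for word, lp in decoder_log_probs.items():
        p_ng = bigram_probs.get((prev_word, word), 1e-6)  # smoothing floor
        fused[word] = (1.0 - lam) * lp + lam * math.log(p_ng)
    return fused

# Toy usage: the decoder alone slightly prefers "of" after "a", but the
# bigram model, trained on caption text, steers the choice toward "dog".
decoder_out = {"dog": math.log(0.30), "of": math.log(0.35), "the": math.log(0.35)}
bigram = {("a", "dog"): 0.5, ("a", "of"): 0.01, ("a", "the"): 0.02}
scores = fuse_scores(decoder_out, bigram, prev_word="a")
print(max(scores, key=scores.get))  # -> "dog"
```

In a full captioner this fusion would be applied inside beam search at every step, so the n-gram model can veto locally implausible word transitions without retraining the CNN-LSTM.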
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences