Improving image caption performance with linguistic context

Publication, Chapter
Cao, Y; Wang, QF; Huang, K; Zhang, R
January 1, 2020

Image captioning aims to generate a description of an image using techniques from computer vision and natural language processing. The most widely used framework is a Convolutional Neural Network (CNN) followed by a Recurrent Neural Network (RNN), in particular an LSTM. In recent years, attention-based CNN-LSTM networks have made significant progress thanks to their ability to model global context. However, CNN-LSTMs do not explicitly consider linguistic context, which is very useful for further boosting performance. To overcome this issue, we propose a method that integrates an n-gram model into the attention-based image captioning framework, modelling the word transition probability during decoding to enhance the linguistic context of the generated captions. We evaluated the method with BLEU and METEOR on the MSCOCO 2014 benchmark dataset. Experimental results show the effectiveness of the proposed method: BLEU-1, BLEU-2, BLEU-3, BLEU-4, and METEOR improve by 0.2%, 0.7%, 0.6%, 0.5%, and 0.1, respectively.
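The idea of using word transition probabilities during decoding can be illustrated with a minimal sketch. The abstract does not state the n-gram order, the smoothing scheme, or how the n-gram score is combined with the decoder output, so everything below is an assumption made for illustration: a bigram model, add-one smoothing, and a log-linear combination with a hypothetical `weight` parameter. The function names `train_bigram` and `rescore_step` and the toy data are invented here and are not taken from the chapter.

```python
import math
from collections import Counter


def train_bigram(captions, vocab):
    """Estimate add-one-smoothed bigram log-probabilities from tokenized captions."""
    unigrams = Counter()
    bigrams = Counter()
    for tokens in captions:
        padded = ["<s>"] + tokens  # "<s>" marks the start of a caption
        for prev, word in zip(padded, padded[1:]):
            unigrams[prev] += 1
            bigrams[(prev, word)] += 1
    v = len(vocab)

    def log_prob(prev, word):
        # Add-one smoothing keeps unseen transitions at a small non-zero probability.
        return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + v))

    return log_prob


def rescore_step(decoder_log_probs, prev_word, bigram_log_prob, weight=0.3):
    """Combine decoder and bigram scores for one decoding step (assumed log-linear form)."""
    return {
        word: lp + weight * bigram_log_prob(prev_word, word)
        for word, lp in decoder_log_probs.items()
    }


# Toy captions, invented purely for illustration.
captions = [["a", "dog", "runs"], ["a", "dog", "sits"], ["a", "cat", "sits"]]
vocab = ["a", "dog", "cat", "runs", "sits"]
bigram_log_prob = train_bigram(captions, vocab)

# Hypothetical decoder output after the word "a": on its own it slightly prefers "runs".
decoder_log_probs = {
    "dog": math.log(0.30),
    "cat": math.log(0.25),
    "runs": math.log(0.35),
    "sits": math.log(0.10),
}

scores = rescore_step(decoder_log_probs, "a", bigram_log_prob)
best = max(scores, key=scores.get)
print(best)  # the bigram term shifts the choice from "runs" to "dog"
```

In this toy case the decoder alone would emit "a runs", while the transition probability estimated from the captions favours "a dog". In a full system the same rescoring would presumably be applied at every step of greedy or beam-search decoding, with the weight tuned on held-out data.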

DOI

10.1007/978-3-030-39431-8_1

Publication Date

January 1, 2020

Volume

11691 LNAI

Start / End Page

3 / 11

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 46 Information and computing sciences
 

Citation

APA

Cao, Y., Wang, Q. F., Huang, K., & Zhang, R. (2020). Improving image caption performance with linguistic context (Vol. 11691 LNAI, pp. 3–11). https://doi.org/10.1007/978-3-030-39431-8_1

Chicago

Cao, Y., Q. F. Wang, K. Huang, and R. Zhang. “Improving image caption performance with linguistic context,” 11691 LNAI:3–11, 2020. https://doi.org/10.1007/978-3-030-39431-8_1.

ICMJE

Cao Y, Wang QF, Huang K, Zhang R. Improving image caption performance with linguistic context. In 2020. p. 3–11.

MLA

Cao, Y., et al. Improving image caption performance with linguistic context. Vol. 11691 LNAI, 2020, pp. 3–11. Scopus, doi:10.1007/978-3-030-39431-8_1.

NLM

Cao Y, Wang QF, Huang K, Zhang R. Improving image caption performance with linguistic context. 2020. p. 3–11.
