Scholars@Duke publication: Improving sequence-to-sequence learning via optimal transport

Improving sequence-to-sequence learning via optimal transport

Publication, Conference

Chen, L; Zhang, Y; Zhang, R; Tao, C; Gan, Z; Zhang, H; Li, B; Shen, D; Chen, C; Carin, L

Published in: 7th International Conference on Learning Representations, ICLR 2019

January 1, 2019

© 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved.

Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE). However, standard MLE training considers a word-level objective, predicting the next word given the previous ground-truth partial sentence. This procedure focuses on modeling local syntactic patterns, and may fail to capture long-range semantic structure. We present a novel solution to alleviate these issues. Our approach imposes global sequence-level guidance via new supervision based on optimal transport, enabling the overall characterization and preservation of semantic features. We further show that this method can be understood as a Wasserstein gradient flow trying to match our model to the ground truth sequence distribution. Extensive experiments are conducted to validate the utility of the proposed approach, showing consistent improvements over a wide variety of NLP tasks, including machine translation, abstractive text summarization, and image captioning.
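
To make the idea of optimal-transport supervision concrete, the following is a minimal sketch, not the authors' implementation. It assumes each sentence is represented as a list of non-zero word-embedding vectors, uses cosine distance as the transport cost, puts uniform weight on every word, and approximates the transport plan with Sinkhorn iterations (an entropy-regularized solver chosen here for brevity; the abstract does not say which solver the paper uses). The names cosine_cost, sinkhorn_ot, eps, and n_iters are illustrative.

```python
import math


def cosine_cost(x, y):
    """Cosine distance 1 - cos(x, y) between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def sinkhorn_ot(src, tgt, eps=0.1, n_iters=200):
    """Approximate optimal-transport cost between two embedding sequences.

    Assumptions (not from the source): uniform word weights, cosine cost,
    and entropic regularization of strength eps solved by Sinkhorn scaling.
    """
    n, m = len(src), len(tgt)
    cost = [[cosine_cost(x, y) for y in tgt] for x in src]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    a = [1.0 / n] * n  # mass on each source word
    b = [1.0 / m] * m  # mass on each target word
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan, then the total cost of moving mass under that plan.
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    return sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))


if __name__ == "__main__":
    reference = [[1.0, 0.0], [0.0, 1.0]]
    reordered = [[0.0, 1.0], [1.0, 0.0]]    # same embeddings, swapped order
    unrelated = [[-1.0, 0.0], [0.0, -1.0]]  # opposite embeddings
    print(sinkhorn_ot(reference, reordered))
    print(sinkhorn_ot(reference, unrelated))
```

Because the cost depends on which embeddings are matched rather than on their positions, a reordered copy of a sentence yields a cost near zero, while a sentence with dissimilar embeddings yields a larger one. In training, a differentiable version of such a cost could serve as a sequence-level term alongside the MLE loss; how the two are combined is an assumption of this sketch, not something the abstract specifies.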

Duke Scholars

Lawrence Carin (Author), Electrical and Computer Engineering

Citation

APA: Chen, L., Zhang, Y., Zhang, R., Tao, C., Gan, Z., Zhang, H., … Carin, L. (2019). Improving sequence-to-sequence learning via optimal transport. In 7th International Conference on Learning Representations, ICLR 2019.

Chicago: Chen, L., Y. Zhang, R. Zhang, C. Tao, Z. Gan, H. Zhang, B. Li, D. Shen, C. Chen, and L. Carin. “Improving sequence-to-sequence learning via optimal transport.” In 7th International Conference on Learning Representations, ICLR 2019, 2019.

ICMJE: Chen L, Zhang Y, Zhang R, Tao C, Gan Z, Zhang H, et al. Improving sequence-to-sequence learning via optimal transport. In: 7th International Conference on Learning Representations, ICLR 2019. 2019.

MLA: Chen, L., et al. “Improving sequence-to-sequence learning via optimal transport.” 7th International Conference on Learning Representations, ICLR 2019, 2019.

NLM: Chen L, Zhang Y, Zhang R, Tao C, Gan Z, Zhang H, Li B, Shen D, Chen C, Carin L. Improving sequence-to-sequence learning via optimal transport. 7th International Conference on Learning Representations, ICLR 2019. 2019.
