Scholars@Duke publication: Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms

Publication, Journal Article

Shen, D; Wang, G; Wang, W; Min, MR; Su, Q; Zhang, Y; Li, C; Henao, R; Carin, L

Published in: ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)

January 1, 2018

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging.
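The pooling operations named in the abstract can be illustrated with a minimal sketch. It assumes a text sequence is already represented as a list of word-embedding vectors, and it reads "hierarchical pooling" as average pooling over sliding local windows followed by a global max over the window averages; the window size, the function names, and the toy vectors are illustrative assumptions, not the authors' released code.

```python
from typing import List

Vector = List[float]


def average_pool(embeddings: List[Vector]) -> Vector:
    """Element-wise mean over all word vectors in the sequence."""
    n = len(embeddings)
    return [sum(dim) / n for dim in zip(*embeddings)]


def max_pool(embeddings: List[Vector]) -> Vector:
    """Element-wise maximum over all word vectors in the sequence."""
    return [max(dim) for dim in zip(*embeddings)]


def hierarchical_pool(embeddings: List[Vector], window: int) -> Vector:
    """Average within each sliding window, then take a global max.

    Assumption: sequences no longer than the window fall back to plain
    average pooling. The window keeps some local word-order (n-gram)
    information that global pooling discards.
    """
    if len(embeddings) <= window:
        return average_pool(embeddings)
    local = [
        average_pool(embeddings[i:i + window])
        for i in range(len(embeddings) - window + 1)
    ]
    return max_pool(local)


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings for a three-word sequence.
    sentence = [[1.0, 0.0], [0.0, 2.0], [3.0, -1.0]]
    print(average_pool(sentence))
    print(max_pool(sentence))              # [3.0, 2.0]
    print(hierarchical_pool(sentence, 2))  # [1.5, 1.0]
```

None of the three functions has trainable parameters, which is the sense in which the abstract calls the pooling operations parameter-free; in this sketch, any learning would happen in the embeddings and in whatever classifier consumes the pooled vector.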

Duke Scholars

Author: Ricardo Henao, Biostatistics & Bioinformatics, Division of Translational Bi ...

Author: Lawrence Carin, Electrical and Computer Engineering

Published In

ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)

DOI

10.18653/v1/p18-1041

Publication Date

January 1, 2018

Volume

1

Start / End Page

440 / 450

Citation

APA

Shen, D., Wang, G., Wang, W., Min, M. R., Su, Q., Zhang, Y., … Carin, L. (2018). Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms. ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), 1, 440–450. https://doi.org/10.18653/v1/p18-1041

Chicago

Shen, D., G. Wang, W. Wang, M. R. Min, Q. Su, Y. Zhang, C. Li, R. Henao, and L. Carin. “Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms.” ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) 1 (January 1, 2018): 440–50. https://doi.org/10.18653/v1/p18-1041.

ICMJE

Shen D, Wang G, Wang W, Min MR, Su Q, Zhang Y, et al. Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms. ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers). 2018 Jan 1;1:440–50.

MLA

Shen, D., et al. “Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms.” ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), vol. 1, Jan. 2018, pp. 440–50. Scopus, doi:10.18653/v1/p18-1041.

NLM

Shen D, Wang G, Wang W, Min MR, Su Q, Zhang Y, Li C, Henao R, Carin L. Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms. ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers). 2018 Jan 1;1:440–450.
