Scholars@Duke publication: Syntactic Knowledge-Infused Transformer and BERT models

Publication, Conference

Sundararaman, D; Subramanian, V; Wang, G; Si, S; Shen, D; Wang, D; Carin, L

Published in: CEUR Workshop Proceedings

January 1, 2021

Attention-based deep learning models have demonstrated significant improvements over traditional algorithms in several NLP tasks. The Transformer, for instance, generates abstract representations of the tokens input to an encoder based on their relationships to all tokens in a sequence. While recent studies have shown that such models are capable of learning syntactic features purely by seeing examples, we hypothesize that explicitly feeding this information to deep learning models can significantly enhance their performance in many cases. Leveraging syntactic information such as part of speech (POS) may be particularly beneficial in limited-training-data settings for complex models such as the Transformer. In this paper, we verify this hypothesis by infusing syntactic knowledge into the Transformer. We find that this syntax-infused Transformer achieves an improvement of 0.7 BLEU points when trained on the full WMT'14 English-to-German translation dataset and a maximum improvement of 1.99 BLEU points when trained on a fraction of the dataset. In addition, we find that incorporating syntax into BERT fine-tuning outperforms BERT-Base on all downstream tasks from the GLUE benchmark, including an improvement of 0.8% on CoLA.
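The abstract does not say how the POS information is fed to the model, so the following is only a minimal sketch of one plausible reading: each token embedding is concatenated with an embedding of its POS tag before it reaches the encoder. The vocabularies, the tag set, the embedding sizes, and the names `make_table` and `infuse` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import random

# Hypothetical toy vocabularies; the paper's actual vocabulary and tag set
# are not given in the abstract.
TOKENS = {"<unk>": 0, "the": 1, "dog": 2, "barks": 3}
POS_TAGS = {"X": 0, "DET": 1, "NOUN": 2, "VERB": 3}


def make_table(num_rows, dim, rng):
    """Build a randomly initialised embedding table as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def infuse(tokens, tags, tok_table, pos_table):
    """Concatenate each token's embedding with its POS-tag embedding.

    Assumption: concatenation is only one possible way to combine the two
    signals; the source does not confirm which method the authors used.
    """
    if len(tokens) != len(tags):
        raise ValueError("need exactly one POS tag per token")
    rows = []
    for tok, tag in zip(tokens, tags):
        tok_vec = tok_table[TOKENS.get(tok, TOKENS["<unk>"])]
        pos_vec = pos_table[POS_TAGS.get(tag, POS_TAGS["X"])]
        rows.append(tok_vec + pos_vec)  # list concatenation, not addition
    return rows


rng = random.Random(0)
tok_table = make_table(len(TOKENS), 6, rng)    # 6-dimensional token embeddings
pos_table = make_table(len(POS_TAGS), 2, rng)  # 2-dimensional POS embeddings

encoder_input = infuse(
    ["the", "dog", "barks"], ["DET", "NOUN", "VERB"], tok_table, pos_table
)
print(len(encoder_input), len(encoder_input[0]))  # 3 8
```

In this sketch the encoder would receive one 8-dimensional vector per token, 6 dimensions from the word and 2 from its tag, so the syntactic signal is available from the first layer rather than having to be learned from examples alone.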

Duke Scholars

Lawrence Carin (Author), Electrical and Computer Engineering

Published In

CEUR Workshop Proceedings

ISSN

1613-0073

Publication Date

January 1, 2021

Volume

3052

Related Subject Headings

4609 Information systems

Citation

APA: Sundararaman, D., Subramanian, V., Wang, G., Si, S., Shen, D., Wang, D., & Carin, L. (2021). Syntactic Knowledge-Infused Transformer and BERT models. In CEUR Workshop Proceedings (Vol. 3052).

Chicago: Sundararaman, D., V. Subramanian, G. Wang, S. Si, D. Shen, D. Wang, and L. Carin. “Syntactic Knowledge-Infused Transformer and BERT models.” In CEUR Workshop Proceedings, Vol. 3052, 2021.

ICMJE: Sundararaman D, Subramanian V, Wang G, Si S, Shen D, Wang D, et al. Syntactic Knowledge-Infused Transformer and BERT models. In: CEUR Workshop Proceedings. 2021.

MLA: Sundararaman, D., et al. “Syntactic Knowledge-Infused Transformer and BERT models.” CEUR Workshop Proceedings, vol. 3052, 2021.

NLM: Sundararaman D, Subramanian V, Wang G, Si S, Shen D, Wang D, Carin L. Syntactic Knowledge-Infused Transformer and BERT models. CEUR Workshop Proceedings. 2021.
