Scholars@Duke publication: Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer

Publication, Conference

Zhang, J; Chen, Y; Chen, J

Published in: IEEE International Conference on Data Mining Workshops, ICDMW

January 1, 2022

Developing neural architectures that are capable of logical reasoning has become increasingly important for a wide range of applications (e.g., natural language processing). Towards this grand objective, we propose a symbolic reasoning architecture that chains many join operators together to model output logical expressions. In particular, we demonstrate that such an ensemble of join chains can express a broad subset of 'tree-structured' first-order logical expressions, named FOET, which is particularly useful for modeling natural languages. To endow it with differentiable learning capability, we closely examine various neural operators for approximating the symbolic join chains. Interestingly, we find that the widely used multi-head self-attention module in the Transformer can be understood as a special neural operator that implements the union bound of the join operator in probabilistic predicate space. Our analysis not only provides a new perspective on the mechanism of pretrained models such as BERT for natural language understanding, but also suggests several important directions for future improvement.

Duke Scholars

Author: Yiran Chen, Electrical and Computer Engineering

Published In

IEEE International Conference on Data Mining Workshops, ICDMW

DOI

10.1109/ICDMW58026.2022.00123

EISSN

2375-9259

ISSN

2375-9232

Publication Date

January 1, 2022

Volume

2022-November

Start / End Page

947 / 957

Citation

APA

Zhang, J., Chen, Y., & Chen, J. (2022). Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer. In IEEE International Conference on Data Mining Workshops, ICDMW (Vol. 2022-November, pp. 947–957). https://doi.org/10.1109/ICDMW58026.2022.00123

Chicago

Zhang, J., Y. Chen, and J. Chen. “Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer.” In IEEE International Conference on Data Mining Workshops, ICDMW, 2022-November:947–57, 2022. https://doi.org/10.1109/ICDMW58026.2022.00123.

ICMJE

Zhang J, Chen Y, Chen J. Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer. In: IEEE International Conference on Data Mining Workshops, ICDMW. 2022. p. 947–57.

MLA

Zhang, J., et al. “Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer.” IEEE International Conference on Data Mining Workshops, ICDMW, vol. 2022-November, 2022, pp. 947–57. Scopus, doi:10.1109/ICDMW58026.2022.00123.

NLM

Zhang J, Chen Y, Chen J. Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer. IEEE International Conference on Data Mining Workshops, ICDMW. 2022. p. 947–957.
