
PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

Publication, Conference
Bao, S; He, H; Wang, F; Wu, H; Wang, H; Wu, W; Wu, Z; Guo, Z; Lu, H; Huang, X; Tian, X; Xu, X; Lin, Y; Niu, ZY
Published in: Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022 (2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing)
January 1, 2022

To explore the limit of dialogue generation pre-training, we present the models of PLATO-XL, with up to 11 billion parameters, trained on both Chinese and English social media conversations. To train such large models, we adopt the architecture of unified transformer with high computation and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information in social media conversations. With such designs, PLATO-XL successfully achieves superior performances as compared to other approaches in both Chinese and English chitchat. We further explore the capacity of PLATO-XL on other conversational tasks, such as knowledge grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model of conversational AI.
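The "unified transformer" named in the abstract is a single transformer stack that serves both dialogue understanding and response generation by varying the self-attention mask, rather than pairing a separate encoder and decoder; this sharing is where the claimed computation and parameter efficiency comes from. The PyTorch snippet below is a minimal illustrative sketch of such a mask (not the authors' released code): context tokens attend to one another bidirectionally, while response tokens attend to the full context and only to earlier response tokens.

```python
# Minimal sketch of a UniLM-style "flexible" attention mask, as used by
# unified-transformer dialogue models. Illustrative only, not PLATO-XL's code.
import torch

def unified_attention_mask(context_len: int, response_len: int) -> torch.Tensor:
    """Return an (L, L) boolean mask; True marks positions a token may attend to."""
    total = context_len + response_len
    mask = torch.zeros(total, total, dtype=torch.bool)
    # Context block: full bidirectional self-attention for understanding.
    mask[:context_len, :context_len] = True
    # Response tokens see the entire context...
    mask[context_len:, :context_len] = True
    # ...plus a causal (lower-triangular) view of the response generated so far.
    mask[context_len:, context_len:] = torch.tril(
        torch.ones(response_len, response_len, dtype=torch.bool)
    )
    return mask

print(unified_attention_mask(3, 2).int())
# tensor([[1, 1, 1, 0, 0],
#         [1, 1, 1, 0, 0],
#         [1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 0],
#         [1, 1, 1, 1, 1]])
```

The multi-party aware pre-training is orthogonal to this mask: the paper adds role embeddings to the input so the model can tell which utterances in a multi-turn social media thread come from which speaker.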


Published In

Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022 (2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing)

DOI

10.18653/v1/2022.findings-aacl.10

Publication Date

January 1, 2022

Start / End Page

107 / 118

Citation

APA
Bao, S., He, H., Wang, F., Wu, H., Wang, H., Wu, W., … Niu, Z. Y. (2022). PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation. In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022 (pp. 107–118). https://doi.org/10.18653/v1/2022.findings-aacl.10

Chicago
Bao, S., H. He, F. Wang, H. Wu, H. Wang, W. Wu, Z. Wu, et al. “PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation.” In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 107–18, 2022. https://doi.org/10.18653/v1/2022.findings-aacl.10.

ICMJE
Bao S, He H, Wang F, Wu H, Wang H, Wu W, et al. PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation. In: Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022. 2022. p. 107–18.

MLA
Bao, S., et al. “PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation.” Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022, pp. 107–18. Scopus, doi:10.18653/v1/2022.findings-aacl.10.

NLM
Bao S, He H, Wang F, Wu H, Wang H, Wu W, Wu Z, Guo Z, Lu H, Huang X, Tian X, Xu X, Lin Y, Niu ZY. PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation. Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022. 2022. p. 107–118.
