
Towards Boosting the Open-Domain Chatbot with Human Feedback

Publication, Conference
Lu, H; Bao, S; He, H; Wang, F; Wu, H; Wang, H
Published in: Proceedings of the Annual Meeting of the Association for Computational Linguistics
January 1, 2023

Many open-domain dialogue models pre-trained on social media comments can generate coherent replies but have difficulty producing engaging responses. This phenomenon might mainly result from the scarcity of annotated human-human conversations and from misalignment with human preference. In this paper, we propose a novel and efficient framework, Diamante, to boost the open-domain chatbot, in which two kinds of human feedback (explicit demonstration and implicit preference) are collected and leveraged. By asking annotators to select or amend model-generated candidate responses, Diamante efficiently collects human-demonstrated responses and constructs a Chinese chit-chat dataset. To enhance alignment with human preference, Diamante leverages the implicit preference in the data collection process and introduces generation-evaluation joint training. Comprehensive experiments indicate that the Diamante dataset and the joint training paradigm can significantly boost the performance of pre-trained dialogue models. The overall engagingness of the previous state-of-the-art model is improved by 50% in Chinese open-domain conversations.


Published In

Proceedings of the Annual Meeting of the Association for Computational Linguistics

DOI

10.18653/v1/2023.acl-long.224

ISSN

0736-587X

Publication Date

January 1, 2023

Volume

1

Start / End Page

4060 / 4078
 

Citation

APA: Lu, H., Bao, S., He, H., Wang, F., Wu, H., & Wang, H. (2023). Towards Boosting the Open-Domain Chatbot with Human Feedback. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 4060–4078). https://doi.org/10.18653/v1/2023.acl-long.224
Chicago: Lu, H., S. Bao, H. He, F. Wang, H. Wu, and H. Wang. “Towards Boosting the Open-Domain Chatbot with Human Feedback.” In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 1:4060–78, 2023. https://doi.org/10.18653/v1/2023.acl-long.224.
ICMJE: Lu H, Bao S, He H, Wang F, Wu H, Wang H. Towards Boosting the Open-Domain Chatbot with Human Feedback. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics. 2023. p. 4060–78.
MLA: Lu, H., et al. “Towards Boosting the Open-Domain Chatbot with Human Feedback.” Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol. 1, 2023, pp. 4060–78. Scopus, doi:10.18653/v1/2023.acl-long.224.
NLM: Lu H, Bao S, He H, Wang F, Wu H, Wang H. Towards Boosting the Open-Domain Chatbot with Human Feedback. Proceedings of the Annual Meeting of the Association for Computational Linguistics. 2023. p. 4060–4078.
