Scholars@Duke publication: The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge

Publication, Conference

Wang, H; Li, C; Su, F; Liu, J; Suo, H; Li, M

Published in: 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024

January 1, 2024

This paper describes the Wake Word Lipreading (WWL) system developed by the WHU team for the ChatCLR Challenge 2024. Although lipreading and wake word spotting have both seen significant development, pretrained frontends for WWL remain underexplored. Our system combines a pretrained frontend with a Transformer-like backend, followed by attentive pooling and a classifier. We compare different frontends, including Auto-AVSR and AV-HuBERT, and evaluate Conformer and E-Branchformer backends. We also introduce Multi-layer Feature Aggregation, which leverages features from multiple encoder block layers, and demonstrate its effectiveness. Finally, we apply several fusion strategies; score fusion achieved a false reject rate of 8.21% and a false alarm rate of 8.50%, for a WWS score of 16.71% on the evaluation set, and obtained first place in Task 1 of the ChatCLR Challenge.
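To make the reported metrics concrete, the following is a minimal Python sketch of score-level fusion followed by computation of the false reject rate (FRR) and false alarm rate (FAR). The abstract does not say how the scores were fused or how the WWS score is defined, so two things here are assumptions: fusion is taken to be a weighted average of per-clip scores, and the WWS score is taken to be FRR + FAR (the reported 8.21% + 8.50% = 16.71% is consistent with this). The function names `fuse_scores` and `wws_metrics`, the 0.5 threshold, and the toy scores are all hypothetical and are not challenge data.

```python
from typing import List, Optional, Sequence, Tuple


def fuse_scores(system_scores: Sequence[Sequence[float]],
                weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of per-clip wake-word scores across systems.

    Assumption: the paper's "score fusion" may differ from this simple average.
    """
    n_systems = len(system_scores)
    if weights is None:
        weights = [1.0] * n_systems  # equal weights by default
    total = sum(weights)
    n_clips = len(system_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, system_scores)) / total
        for i in range(n_clips)
    ]


def wws_metrics(scores: Sequence[float], labels: Sequence[int],
                threshold: float = 0.5) -> Tuple[float, float, float]:
    """Return (FRR, FAR, FRR + FAR); label 1 means the wake word is present."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    # False reject: a wake-word clip scored below the threshold.
    false_rejects = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    # False alarm: a non-wake-word clip scored at or above the threshold.
    false_alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    frr = false_rejects / positives
    far = false_alarms / negatives
    return frr, far, frr + far


# Toy data (made up): four wake-word clips followed by four other clips.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
system_a = [0.9, 0.8, 0.4, 0.7, 0.2, 0.6, 0.1, 0.3]
system_b = [0.8, 0.6, 0.7, 0.4, 0.1, 0.3, 0.2, 0.6]
fused = fuse_scores([system_a, system_b])

for name, scores in [("A", system_a), ("B", system_b), ("fused", fused)]:
    frr, far, wws = wws_metrics(scores, labels)
    print(f"{name}: FRR={frr:.2%} FAR={far:.2%} WWS={wws:.2%}")
```

In this toy case the two systems make errors on different clips, so averaging their scores removes all four errors. Real systems will rarely be this complementary, and the fusion weights and decision threshold would normally be tuned on a development set.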

Duke Scholars

Author Ming Li DKU Faculty

Published In

2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024

DOI

10.1109/ICMEW63481.2024.10645425

Publication Date

January 1, 2024

Citation

APA

Wang, H., Li, C., Su, F., Liu, J., Suo, H., & Li, M. (2024). The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge. In 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024. https://doi.org/10.1109/ICMEW63481.2024.10645425

Chicago

Wang, H., C. Li, F. Su, J. Liu, H. Suo, and M. Li. “The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge.” In 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024, 2024. https://doi.org/10.1109/ICMEW63481.2024.10645425.

ICMJE

Wang H, Li C, Su F, Liu J, Suo H, Li M. The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge. In: 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024. 2024.

MLA

Wang, H., et al. “The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge.” 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024, 2024. Scopus, doi:10.1109/ICMEW63481.2024.10645425.

NLM

Wang H, Li C, Su F, Liu J, Suo H, Li M. The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge. 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024. 2024.
