The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis

Publication, Conference
Wang, H; Cheng, M; Fu, Q; Li, M
Published in: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
January 1, 2023

This paper further explores our previous wake word spotting system, which ranked 2nd in Track 1 of the MISP Challenge 2021. First, we investigate a robust unimodal approach based on 3D and 2D convolution and adopt the simple attention module (SimAM) to improve performance. Second, we explore different combinations of data augmentation methods. Finally, we study fusion strategies, including score-level, cascaded, and neural fusion. Our proposed multimodal system leverages multimodal features and uses complementary visual information to mitigate the performance degradation of audio-only systems in complex acoustic scenarios. Our system obtains a false reject rate of 2.15% and a false alarm rate of 3.44% on the evaluation set of the competition database, setting a new state of the art with a 21% relative improvement over previous systems. Related resources can be found at: https://github.com/Mashiro009/DKU-WWS-MISP.
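
As a rough illustration of two ideas named in the abstract, score-level fusion and the false reject / false alarm rates, the Python sketch below averages per-utterance scores from an audio model and a video model and then counts errors at a fixed threshold. It is not taken from the paper or its repository: the function names `fuse_scores` and `frr_far`, the equal weighting, the 0.5 threshold, and the toy scores are all assumptions made for the example.

```python
from typing import List, Tuple


def fuse_scores(audio: List[float], video: List[float],
                audio_weight: float = 0.5) -> List[float]:
    """Score-level fusion as a weighted average of two score lists.

    The weight is a hypothetical tuning parameter, not a value from the paper.
    """
    if len(audio) != len(video):
        raise ValueError("audio and video score lists must have equal length")
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio, video)]


def frr_far(scores: List[float], labels: List[int],
            threshold: float = 0.5) -> Tuple[float, float]:
    """Return (false reject rate, false alarm rate) at a decision threshold.

    labels: 1 = wake word present, 0 = wake word absent.
    A sample is accepted when its score is >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Positives scored below the threshold are missed wake words.
    false_rejects = sum(1 for s, y in zip(scores, labels)
                        if y == 1 and s < threshold)
    # Negatives scored at or above the threshold are spurious triggers.
    false_alarms = sum(1 for s, y in zip(scores, labels)
                       if y == 0 and s >= threshold)
    return false_rejects / positives, false_alarms / negatives


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    audio_scores = [0.9, 0.2, 0.4, 0.7]
    video_scores = [0.8, 0.6, 0.1, 0.1]
    labels = [1, 1, 0, 0]

    fused = fuse_scores(audio_scores, video_scores)
    print("audio-only (FRR, FAR):", frr_far(audio_scores, labels))
    print("fused      (FRR, FAR):", frr_far(fused, labels))
```

In this toy data the audio-only scores trigger a false alarm on the last negative sample, and averaging in the low video score pulls that sample back under the threshold. This mirrors, in a very simplified way, how a second modality can offset errors of an audio-only system.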

Duke Scholars

Published In

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

DOI

10.1109/ICASSP49357.2023.10095459

ISSN

1520-6149

Publication Date

January 1, 2023

Volume

2023-June
 

Citation

APA

Wang, H., Cheng, M., Fu, Q., & Li, M. (2023). The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 2023-June). https://doi.org/10.1109/ICASSP49357.2023.10095459

Chicago

Wang, H., M. Cheng, Q. Fu, and M. Li. “The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis.” In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Vol. 2023-June, 2023. https://doi.org/10.1109/ICASSP49357.2023.10095459.

ICMJE

Wang H, Cheng M, Fu Q, Li M. The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2023.

MLA

Wang, H., et al. “The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis.” ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2023-June, 2023. Scopus, doi:10.1109/ICASSP49357.2023.10095459.

NLM

Wang H, Cheng M, Fu Q, Li M. The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2023.
