Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge
Publication, Conference
Cheng, M; Su, F; Li, C; Liu, J; Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech
January 1, 2025
This paper describes the speaker diarization system developed for the Multimodal Information-Based Speech Processing (MISP) 2025 Challenge. First, we utilize the Sequence-to-Sequence Neural Diarization (S2SND) framework to generate initial predictions using single-channel audio. Then, we extend the original S2SND framework to create a new version, Multi-Channel Sequence-to-Sequence Neural Diarization (MCS2SND), which refines the initial results using multi-channel audio. The final system achieves a diarization error rate (DER) of 8.09% on the evaluation set of the competition database, ranking first place in the speaker diarization task of the MISP 2025 Challenge.
Published In
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech
DOI
10.21437/Interspeech.2025-1262
EISSN
2958-1796
ISSN
2308-457X
Publication Date
January 1, 2025
Start / End Page
1898 / 1902
Citation
APA: Cheng, M., Su, F., Li, C., Liu, J., & Li, M. (2025). Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech (pp. 1898–1902). https://doi.org/10.21437/Interspeech.2025-1262
Chicago: Cheng, M., F. Su, C. Li, J. Liu, and M. Li. “Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge.” In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 1898–1902, 2025. https://doi.org/10.21437/Interspeech.2025-1262.
ICMJE: Cheng M, Su F, Li C, Liu J, Li M. Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge. In: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2025. p. 1898–902.
MLA: Cheng, M., et al. “Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge.” Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2025, pp. 1898–902. Scopus, doi:10.21437/Interspeech.2025-1262.
NLM: Cheng M, Su F, Li C, Liu J, Li M. Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge. Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2025. p. 1898–1902.