The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023
Conference publication
Cheng, M; Wang, W; Qin, X; Lin, Y; Jiang, N; Zhao, G; Li, M
Published in: Communications in Computer and Information Science
January 1, 2024
This paper describes the DKU-MSXF submission to track 4 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23). Our system pipeline consists of voice activity detection, clustering-based diarization, overlapped speech detection, and target-speaker voice activity detection (TSVAD), where the output of each stage is a fusion of three sub-models. Finally, we fuse the different clustering-based and TSVAD-based diarization systems using DOVER-Lap and achieve a diarization error rate (DER) of 4.30%, which ranks first on the track 4 challenge leaderboard.
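For readers unfamiliar with the metric, the sketch below shows how a diarization error rate is conventionally computed: the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. It is a minimal illustration, not the challenge's scoring tool; the function name `diarization_error_rate` and the example durations are hypothetical and are not taken from the paper.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in the same unit (e.g. seconds).
    This is the standard textbook definition; real scoring tools also
    handle details such as forgiveness collars and overlap regions.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    if min(missed, false_alarm, confusion) < 0:
        raise ValueError("error durations must be non-negative")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations chosen only for illustration (not from the paper):
# 3 s missed speech, 1 s false alarm, 1 s speaker confusion, 100 s of speech.
example_der = diarization_error_rate(3.0, 1.0, 1.0, 100.0)
print(f"DER = {example_der:.2%}")
```

Because DER is normalized by reference speech time rather than by audio length, it can exceed 100% when a system produces large amounts of false-alarm speech.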
Published In: Communications in Computer and Information Science
DOI: 10.1007/978-981-97-0601-3_28
EISSN: 1865-0937
ISSN: 1865-0929
Publication Date: January 1, 2024
Volume: 2006
Start / End Page: 330 / 337
Citation
APA: Cheng, M., Wang, W., Qin, X., Lin, Y., Jiang, N., Zhao, G., & Li, M. (2024). The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023. In Communications in Computer and Information Science (Vol. 2006, pp. 330–337). https://doi.org/10.1007/978-981-97-0601-3_28
Chicago: Cheng, M., W. Wang, X. Qin, Y. Lin, N. Jiang, G. Zhao, and M. Li. “The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023.” In Communications in Computer and Information Science, 2006:330–37, 2024. https://doi.org/10.1007/978-981-97-0601-3_28.
ICMJE: Cheng M, Wang W, Qin X, Lin Y, Jiang N, Zhao G, et al. The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023. In: Communications in Computer and Information Science. 2024. p. 330–7.
MLA: Cheng, M., et al. “The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023.” Communications in Computer and Information Science, vol. 2006, 2024, pp. 330–37. Scopus, doi:10.1007/978-981-97-0601-3_28.
NLM: Cheng M, Wang W, Qin X, Lin Y, Jiang N, Zhao G, Li M. The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023. Communications in Computer and Information Science. 2024. p. 330–337.