Scholars@Duke publication: The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023

Publication, Conference

Cheng, M; Wang, W; Qin, X; Lin, Y; Jiang, N; Zhao, G; Li, M

Published in: Communications in Computer and Information Science

January 1, 2024

This paper describes the DKU-MSXF submission to track 4 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23). Our system pipeline consists of voice activity detection, clustering-based diarization, overlapped speech detection, and target-speaker voice activity detection (TS-VAD), where the output of each stage is fused from three sub-models. Finally, we fuse the different clustering-based and TS-VAD-based diarization systems using DOVER-Lap and achieve a diarization error rate (DER) of 4.30%, which ranks first on the track 4 challenge leaderboard.
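For readers unfamiliar with the metric quoted in the abstract, DER is the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech. The sketch below is a simplified, frame-level illustration in Python and is not the challenge's scoring tool: the function name `frame_der`, the per-frame label sets, and the brute-force speaker mapping are all assumptions made for this example, and it is only practical for a handful of speakers.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Frame-level DER for two equal-length lists of per-frame speaker-label sets.

    Hypothetical helper for illustration only; official scoring tools work on
    timed segments rather than pre-framed label sets.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must cover the same frames")

    total = sum(len(r) for r in reference)  # total reference speaker-frames
    if total == 0:
        raise ValueError("reference contains no speech")

    ref_labels = sorted(set().union(*reference))
    hyp_labels = sorted(set().union(*hypothesis))

    # Missed speech and false alarms do not depend on the speaker mapping.
    miss = sum(max(0, len(r) - len(h)) for r, h in zip(reference, hypothesis))
    false_alarm = sum(max(0, len(h) - len(r)) for r, h in zip(reference, hypothesis))

    # Try every one-to-one mapping of hypothesis labels onto reference labels
    # (None = unmapped) and keep the one with the most correctly attributed frames.
    padded = ref_labels + [None] * len(hyp_labels)
    best_correct = 0
    for perm in permutations(padded, len(hyp_labels)):
        mapping = dict(zip(hyp_labels, perm))
        correct = sum(
            len(r & {mapping[s] for s in h}) for r, h in zip(reference, hypothesis)
        )
        best_correct = max(best_correct, correct)

    # Speaker confusion: frames where a speaker was detected but mislabelled.
    matched = sum(min(len(r), len(h)) for r, h in zip(reference, hypothesis))
    confusion = matched - best_correct

    return (miss + false_alarm + confusion) / total


# Toy example: five frames, two reference speakers, one overlapped frame.
reference = [{"A"}, {"A"}, {"A", "B"}, {"B"}, set()]
hypothesis = [{"x"}, {"x"}, {"x"}, {"x"}, {"y"}]
der = frame_der(reference, hypothesis)
```

In this toy example there is one missed speaker-frame (the overlap), one false-alarm frame, and one confused frame against five reference speaker-frames, so `der` is 0.6. A production scorer would typically replace the brute-force search with an assignment solver.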

Duke Scholars

Author Ming Li DKU Faculty

Published In

Communications in Computer and Information Science

DOI

10.1007/978-981-97-0601-3_28

EISSN

1865-0937

ISSN

1865-0929

Publication Date

January 1, 2024

Volume

2006

Start / End Page

330 / 337

Citation

APA

Cheng, M., Wang, W., Qin, X., Lin, Y., Jiang, N., Zhao, G., & Li, M. (2024). The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023. In Communications in Computer and Information Science (Vol. 2006, pp. 330–337). https://doi.org/10.1007/978-981-97-0601-3_28

Chicago

Cheng, M., W. Wang, X. Qin, Y. Lin, N. Jiang, G. Zhao, and M. Li. “The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023.” In Communications in Computer and Information Science, 2006:330–37, 2024. https://doi.org/10.1007/978-981-97-0601-3_28.

ICMJE

Cheng M, Wang W, Qin X, Lin Y, Jiang N, Zhao G, et al. The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023. In: Communications in Computer and Information Science. 2024. p. 330–7.

MLA

Cheng, M., et al. “The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023.” Communications in Computer and Information Science, vol. 2006, 2024, pp. 330–37. Scopus, doi:10.1007/978-981-97-0601-3_28.

NLM

Cheng M, Wang W, Qin X, Lin Y, Jiang N, Zhao G, Li M. The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023. Communications in Computer and Information Science. 2024. p. 330–337.
