Scholars@Duke publication: The DKU-Duke-Lenovo system description for the fearless steps challenge phase III

Publication, Conference

Wang, W; Cai, D; Wang, J; Lin, Q; Wang, X; Hong, M; Li, M

Published in: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

January 1, 2021

This paper describes the systems developed by the DKU-Duke-Lenovo team for the Fearless Steps Challenge Phase III. For the speech activity detection (SAD) task, we employ a U-Net-based model, which has not been used for SAD before, and observe a DCF of 1.915% on the eval set. For the speaker identification (SID) task, we adopt the ResNet-SE and ECAPA-TDNN models and obtain a Top-5 accuracy of 86.21%. For the speaker diarization (SD) task, we employ several different clustering methods. In addition, domain adaptation, system fusion, and Target-Speaker Voice Activity Detection (TS-VAD) significantly improve SD performance. We obtain a DER of 12.32% on track 2, with the major contribution coming from our ResNet-based TS-VAD model. We achieve first place for SD and SID and second place for SAD in the challenge.

Duke Scholars

Author Ming Li DKU Faculty

Published In

Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

DOI

10.21437/Interspeech.2021-235

EISSN

2958-1796

ISSN

2308-457X

Publication Date

January 1, 2021

Volume

3

Start / End Page

1983 / 1987

Citation

APA

Wang, W., Cai, D., Wang, J., Lin, Q., Wang, X., Hong, M., & Li, M. (2021). The DKU-Duke-Lenovo system description for the fearless steps challenge phase III. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech (Vol. 3, pp. 1983–1987). https://doi.org/10.21437/Interspeech.2021-235

Chicago

Wang, W., D. Cai, J. Wang, Q. Lin, X. Wang, M. Hong, and M. Li. “The DKU-Duke-Lenovo system description for the fearless steps challenge phase III.” In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 3:1983–87, 2021. https://doi.org/10.21437/Interspeech.2021-235.

ICMJE

Wang W, Cai D, Wang J, Lin Q, Wang X, Hong M, et al. The DKU-Duke-Lenovo system description for the fearless steps challenge phase III. In: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2021. p. 1983–7.

MLA

Wang, W., et al. “The DKU-Duke-Lenovo system description for the fearless steps challenge phase III.” Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, vol. 3, 2021, pp. 1983–87. Scopus, doi:10.21437/Interspeech.2021-235.

NLM

Wang W, Cai D, Wang J, Lin Q, Wang X, Hong M, Li M. The DKU-Duke-Lenovo system description for the fearless steps challenge phase III. Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2021. p. 1983–1987.
