The DKU-Duke-Lenovo system description for the fearless steps challenge phase III

Publication, Conference
Wang, W; Cai, D; Wang, J; Lin, Q; Wang, X; Hong, M; Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2021

This paper describes the systems developed by the DKU-Duke-Lenovo team for the Fearless Steps Challenge Phase III. For the speech activity detection (SAD) task, we employ a U-Net-based model, which has not previously been used for SAD, and observe a detection cost function (DCF) of 1.915% on the eval set. For the speaker identification (SID) task, we adopt ResNet-SE and ECAPA-TDNN models and obtain a Top-5 accuracy of 86.21%. For the speaker diarization (SD) task, we employ several different clustering methods. In addition, domain adaptation, system fusion, and Target-Speaker Voice Activity Detection (TS-VAD) significantly improve SD performance. We obtain a diarization error rate (DER) of 12.32% on track 2, with the major contribution coming from our ResNet-based TS-VAD model. We finally achieve first place for SD and SID and second place for SAD in the challenge.
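The abstract reports a Top-5 accuracy for SID and a DCF for SAD. As an illustration only, the sketch below shows how such metrics are commonly computed. The function names, the frame-level formulation, and the 0.75/0.25 miss/false-alarm weights are assumptions made for this example and are not taken from this record; the challenge's evaluation plan remains the authoritative definition.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 5) -> float:
    """Fraction of trials whose true speaker is among the k best-scoring candidates."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank candidate speaker indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


def detection_cost(reference: Sequence[int],
                   hypothesis: Sequence[int],
                   w_fn: float = 0.75,
                   w_fp: float = 0.25) -> float:
    """Frame-level DCF = w_fn * P_FN + w_fp * P_FP (1 = speech, 0 = non-speech).

    The default weights are an assumption for this sketch.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have equal length")
    speech = sum(1 for r in reference if r == 1)
    nonspeech = len(reference) - speech
    missed = sum(1 for r, h in zip(reference, hypothesis) if r == 1 and h == 0)
    false_alarm = sum(1 for r, h in zip(reference, hypothesis) if r == 0 and h == 1)
    p_fn = missed / speech if speech else 0.0
    p_fp = false_alarm / nonspeech if nonspeech else 0.0
    return w_fn * p_fn + w_fp * p_fp
```

In this formulation a lower DCF is better and a higher Top-5 accuracy is better; both return fractions, so multiply by 100 to compare against percentages such as those quoted in the abstract.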

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

DOI

10.21437/Interspeech.2021-235

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2021

Volume

3

Start / End Page

1983 / 1987
 

Citation

APA: Wang, W., Cai, D., Wang, J., Lin, Q., Wang, X., Hong, M., & Li, M. (2021). The DKU-Duke-Lenovo system description for the fearless steps challenge phase III. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 3, pp. 1983–1987). https://doi.org/10.21437/Interspeech.2021-235

Chicago: Wang, W., D. Cai, J. Wang, Q. Lin, X. Wang, M. Hong, and M. Li. “The DKU-Duke-Lenovo system description for the fearless steps challenge phase III.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 3:1983–87, 2021. https://doi.org/10.21437/Interspeech.2021-235.

ICMJE: Wang W, Cai D, Wang J, Lin Q, Wang X, Hong M, et al. The DKU-Duke-Lenovo system description for the fearless steps challenge phase III. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2021. p. 1983–7.

MLA: Wang, W., et al. “The DKU-Duke-Lenovo system description for the fearless steps challenge phase III.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 3, 2021, pp. 1983–87. Scopus, doi:10.21437/Interspeech.2021-235.

NLM: Wang W, Cai D, Wang J, Lin Q, Wang X, Hong M, Li M. The DKU-Duke-Lenovo system description for the fearless steps challenge phase III. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2021. p. 1983–1987.
