The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02

Publication, Conference
Lin, Q; Li, T; Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2020

This paper describes the systems developed by the DKU team for the Fearless Steps Challenge Phase-02 competition. For the Speech Activity Detection task, we start with a Long Short-Term Memory (LSTM) system and then improve it with a ResNet-LSTM architecture. The ResNet-LSTM system reduces the detection cost function (DCF) by about 38% relative to the LSTM baseline. We also discuss system performance when additional training corpora are included; the lowest DCF of 1.406% on the Eval Set is obtained with system pre-training. For the Speaker Identification task, we employ a Deep ResNet vector system, which takes a variable-length feature sequence as input and directly outputs speaker posteriors. Pre-training with VoxCeleb is also considered, and our best-performing system achieves a Top-5 accuracy of 92.393% on the Eval Set.
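The abstract reports two kinds of figures: a relative reduction in an error metric and a Top-5 accuracy. The Python sketch below illustrates how such figures are conventionally computed. It is not the authors' evaluation code; the function names `relative_reduction` and `top_k_accuracy` are hypothetical, and the numbers in the usage lines are toy values, not results from the paper.

```python
from typing import Sequence


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of an error metric relative to a baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of utterances whose true speaker is among the k best-scoring classes.

    scores[i][j] is the posterior (or any score) of speaker j for utterance i.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring speakers for this utterance.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy values only (assumptions for illustration, not taken from the paper).
toy_reduction = relative_reduction(baseline=2.0, improved=1.24)
toy_accuracy = top_k_accuracy(
    scores=[[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]],
    labels=[2, 2],
    k=2,
)
```

With a metric defined this way, a "38% relative" reduction means the improved system's error is roughly 62% of the baseline's, which is different from an absolute drop of 38 percentage points.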

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

DOI

10.21437/Interspeech.2020-1915

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2020

Volume

2020-October

Start / End Page

2607 / 2611
 

Citation

APA: Lin, Q., Li, T., & Li, M. (2020). The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2020-October, pp. 2607–2611). https://doi.org/10.21437/Interspeech.2020-1915

Chicago: Lin, Q., T. Li, and M. Li. “The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2020-October:2607–11, 2020. https://doi.org/10.21437/Interspeech.2020-1915.

ICMJE: Lin Q, Li T, Li M. The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. p. 2607–11.

MLA: Lin, Q., et al. “The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2020-October, 2020, pp. 2607–11. Scopus, doi:10.21437/Interspeech.2020-1915.

NLM: Lin Q, Li T, Li M. The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. p. 2607–2611.
