Scholars@Duke publication: The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02

Publication, Conference

Lin, Q; Li, T; Li, M

Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

January 1, 2020

This paper describes the systems developed by the DKU team for the Fearless Steps Challenge Phase-02 competition. For the Speech Activity Detection task, we start with a Long Short-Term Memory (LSTM) system and then apply a ResNet-LSTM improvement. Our ResNet-LSTM system achieves a relative reduction of about 38% in the detection cost function (DCF) compared with the LSTM baseline. We also discuss system performance when additional training corpora are included; the lowest DCF on the Eval Set, 1.406%, is obtained with system pre-training. For the Speaker Identification task, we employ a Deep ResNet vector system, which receives a variable-length feature sequence and directly generates speaker posteriors. Pre-training with VoxCeleb is also considered, and our best-performing system achieves a Top-5 accuracy of 92.393% on the Eval Set.
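The two kinds of figures quoted in the abstract, a relative reduction in an error metric and a Top-5 accuracy computed from speaker posteriors, can be illustrated with a minimal sketch. The function names and the toy numbers below are hypothetical and chosen only for illustration; they are not taken from the paper or from the challenge scoring tools.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of an error metric, as a fraction of the baseline."""
    return (baseline - improved) / baseline


def top_k_accuracy(posteriors, labels, k=5):
    """Fraction of utterances whose true speaker is among the k highest posteriors."""
    hits = 0
    for scores, label in zip(posteriors, labels):
        # Speaker indices sorted from highest to lowest posterior.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy, made-up posteriors over three speakers (not results from the paper).
toy_posteriors = [
    [0.10, 0.60, 0.30],  # true speaker 1 has the highest posterior
    [0.50, 0.20, 0.30],  # true speaker 1 has the lowest posterior
]
toy_labels = [1, 1]

top1 = top_k_accuracy(toy_posteriors, toy_labels, k=1)
top3 = top_k_accuracy(toy_posteriors, toy_labels, k=3)
```

With these toy inputs, only the first utterance counts as a hit at k=1, while both count at k=3 because every speaker is then within the top k.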

Duke Scholars

Author Ming Li DKU Faculty

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

DOI

10.21437/Interspeech.2020-1915

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2020

Volume

2020-October

Start / End Page

2607 / 2611

Citation

APA

Lin, Q., Li, T., & Li, M. (2020). The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2020-October, pp. 2607–2611). https://doi.org/10.21437/Interspeech.2020-1915

Chicago

Lin, Q., T. Li, and M. Li. “The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2020-October:2607–11, 2020. https://doi.org/10.21437/Interspeech.2020-1915.

ICMJE

Lin Q, Li T, Li M. The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. p. 2607–11.

MLA

Lin, Q., et al. “The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2020-October, 2020, pp. 2607–11. Scopus, doi:10.21437/Interspeech.2020-1915.

NLM

Lin Q, Li T, Li M. The DKU speech activity detection and speaker identification systems for fearless steps challenge phase-02. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. p. 2607–2611.
