Scholars@Duke publication: The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion

Publication, Conference

Cai, W; Wu, H; Cai, D; Li, M

Published in: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

January 1, 2019

This paper describes our DKU replay detection system for the ASVspoof 2019 challenge. The goal is to develop a spoofing countermeasure for automatic speaker recognition in the physical access scenario. We improve the countermeasure pipeline in four respects: data augmentation, feature representation, classification, and fusion. First, we introduce an utterance-level deep learning framework for anti-spoofing that takes a variable-length feature sequence as input and outputs utterance-level scores directly. Based on this framework, we evaluate several input feature representations extracted from either the magnitude spectrum or the phase spectrum. We also augment the data by applying speed perturbation to the raw waveform. Our best single system is a residual neural network trained on speed-perturbed group delay grams. It achieves an equal error rate (EER) of 1.04% on the development set and 1.08% on the evaluation set. Finally, simply averaging the scores of several single systems further improves performance: our primary system obtains an EER of 0.24% on the development set and 0.66% on the evaluation set.
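The two evaluation ideas named in the abstract, simple score averaging across single systems and the EER metric, can be illustrated with a minimal sketch. This is not the authors' code: the function names `average_fusion` and `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely bona fide" are all assumptions made for illustration. The EER here is approximated by sweeping the observed scores as thresholds, without interpolation.

```python
def average_fusion(system_scores):
    """Average per-utterance scores across systems (one score list per system)."""
    return [sum(per_utt) / len(per_utt) for per_utt in zip(*system_scores)]


def equal_error_rate(scores, labels):
    """Approximate EER. labels: 1 = bona fide, 0 = replayed (spoofed).

    Assumes higher scores mean "more likely bona fide".
    """
    n_bona = sum(labels)
    n_spoof = len(labels) - n_bona
    best_gap, eer = float("inf"), 1.0
    # Try every observed score (plus +inf) as the accept threshold.
    for thr in sorted(set(scores)) + [float("inf")]:
        # False rejection rate: bona fide utterances scored below the threshold.
        frr = sum(1 for s, y in zip(scores, labels) if y == 1 and s < thr) / n_bona
        # False acceptance rate: spoofed utterances scored at or above it.
        far = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr) / n_spoof
        # Keep the operating point where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy data (made up): four bona fide utterances followed by four replayed ones.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
sys_a = [0.9, 0.8, 0.3, 0.7, 0.1, 0.2, 0.4, 0.2]  # hypothetical system A scores
sys_b = [0.8, 0.9, 0.9, 0.7, 0.2, 0.1, 0.3, 0.8]  # hypothetical system B scores
fused = average_fusion([sys_a, sys_b])

print(equal_error_rate(sys_a, labels))  # 0.25
print(equal_error_rate(sys_b, labels))  # 0.25
print(equal_error_rate(fused, labels))  # 0.0
```

In this toy case each system misranks a different utterance, so averaging cancels both errors. Real systems would not be expected to fuse this cleanly, and the figures above say nothing about the paper's reported results.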

Duke Scholars

Author: Ming Li, DKU Faculty

Published In

Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

DOI

10.21437/Interspeech.2019-1230

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2019

Volume

2019-September

Start / End Page

1023 / 1027

Citation

APA

Cai, W., Wu, H., Cai, D., & Li, M. (2019). The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech (Vol. 2019-September, pp. 1023–1027). https://doi.org/10.21437/Interspeech.2019-1230

Chicago

Cai, W., H. Wu, D. Cai, and M. Li. “The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion.” In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2019-September:1023–27, 2019. https://doi.org/10.21437/Interspeech.2019-1230.

ICMJE

Cai W, Wu H, Cai D, Li M. The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion. In: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2019. p. 1023–7.

MLA

Cai, W., et al. “The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion.” Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, vol. 2019-September, 2019, pp. 1023–27. Scopus, doi:10.21437/Interspeech.2019-1230.

NLM

Cai W, Wu H, Cai D, Li M. The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion. Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2019. p. 1023–1027.
