Scholars@Duke publication: Atss-Net: Target speaker separation via attention-based neural network

Atss-Net: Target speaker separation via attention-based neural network

Publication, Conference

Li, T; Lin, Q; Bao, Y; Li, M

Published in: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

January 1, 2020

Recently, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) based models have been introduced to deep learning-based target speaker separation. In this paper, we propose an attention-based neural network (Atss-Net) in the spectrogram domain for the task. Compared with the CNN-LSTM architecture, it allows the network to compute the correlations between features in parallel and to extract more features with shallower layers. Experimental results show that Atss-Net yields better performance than VoiceFilter, although it contains only half as many parameters. Furthermore, the proposed model also demonstrates promising performance in speech enhancement.
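
To make the abstract's point about computing feature correlations in parallel more concrete, here is a minimal sketch of generic scaled dot-product self-attention over spectrogram frames. It is not the Atss-Net architecture: the function names, the identity query/key/value projections, and the toy input values are all assumptions made for illustration, and the abstract does not specify the layer design.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Scaled dot-product self-attention with identity projections (an assumption).

    frames: list of T spectrogram frames, each a list of F magnitudes.
    Returns (weights, outputs): a T x T weight matrix and T x F outputs.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    weights = []
    for query in frames:
        # Each row depends only on the inputs, never on another row's result.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * frame[f] for w, frame in zip(row, frames)) for f in range(dim)]
        for row in weights
    ]
    return weights, outputs

# Toy "spectrogram": 3 time frames, 2 frequency bins (made-up values).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, outputs = self_attention(toy)
```

Because every row of `weights` is computed from the inputs alone, all rows could in principle be evaluated at once as a single matrix product. A recurrent model such as an LSTM instead has to process the frames one after another.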

Duke Scholars

Ming Li (Author), DKU Faculty

Published In

Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

DOI

10.21437/Interspeech.2020-1436

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2020

Volume

2020-October

Start / End Page

1411 / 1415

Citation

APA

Li, T., Lin, Q., Bao, Y., & Li, M. (2020). Atss-Net: Target speaker separation via attention-based neural network. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech (Vol. 2020-October, pp. 1411–1415). https://doi.org/10.21437/Interspeech.2020-1436

Chicago

Li, T., Q. Lin, Y. Bao, and M. Li. “Atss-Net: Target speaker separation via attention-based neural network.” In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2020-October:1411–15, 2020. https://doi.org/10.21437/Interspeech.2020-1436.

ICMJE

Li T, Lin Q, Bao Y, Li M. Atss-Net: Target speaker separation via attention-based neural network. In: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2020. p. 1411–5.

MLA

Li, T., et al. “Atss-Net: Target speaker separation via attention-based neural network.” Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, vol. 2020-October, 2020, pp. 1411–15. Scopus, doi:10.21437/Interspeech.2020-1436.

NLM

Li T, Lin Q, Bao Y, Li M. Atss-Net: Target speaker separation via attention-based neural network. Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2020. p. 1411–1415.
