Atss-Net: Target speaker separation via attention-based neural network
Publication, Conference
Li, T; Lin, Q; Bao, Y; Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2020
Recently, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) based models have been introduced to deep learning-based target speaker separation. In this paper, we propose an attention-based neural network (Atss-Net) in the spectrogram domain for the task. Compared with the CNN-LSTM architecture, it allows the network to compute the correlation between features in parallel and to extract more features using shallower layers. Experimental results show that our Atss-Net yields better performance than VoiceFilter, although it contains only half of the parameters. Furthermore, our proposed model also demonstrates promising performance in speech enhancement.
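The idea of computing correlations between features in parallel can be illustrated with plain scaled dot-product self-attention over spectrogram frames. The sketch below is a toy under stated assumptions, not the Atss-Net architecture: the learned query, key, and value projections are replaced by the identity, the input is a made-up three-frame magnitude spectrogram, and the names `softmax`, `self_attention`, and `toy_spectrogram` are hypothetical. It uses only the Python standard library.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with identity projections.

    frames: list of T frames, each a list of F magnitudes.
    Returns (weights, outputs): weights is T x T, outputs is T x F.
    Assumption: a real model would first apply learned linear
    projections to obtain queries, keys, and values.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    # Score every frame against every frame. Each row depends only on
    # the input, not on other rows, so all rows could be computed at once.
    scores = [
        [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        for query in frames
    ]
    # Normalise each row into attention weights that sum to 1.
    weights = [softmax(row) for row in scores]
    # Each output frame is a weighted average of all input frames.
    outputs = [
        [sum(w * frame[f] for w, frame in zip(row, frames)) for f in range(dim)]
        for row in weights
    ]
    return weights, outputs


# Hypothetical 3-frame, 3-bin magnitude spectrogram (illustrative values only).
toy_spectrogram = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
weights, outputs = self_attention(toy_spectrogram)
```

In this toy input, frame 0 attends more strongly to the similar frame 1 than to the dissimilar frame 2. No step depends on the result of a previous time step, which is the contrast with a recurrent LSTM that the abstract draws.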
Published In
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
DOI
10.21437/Interspeech.2020-1436
EISSN
1990-9772
ISSN
2308-457X
Publication Date
January 1, 2020
Volume
2020-October
Start / End Page
1411 / 1415
Citation
APA
Li, T., Lin, Q., Bao, Y., & Li, M. (2020). Atss-Net: Target speaker separation via attention-based neural network. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2020-October, pp. 1411–1415). https://doi.org/10.21437/Interspeech.2020-1436
Chicago
Li, T., Q. Lin, Y. Bao, and M. Li. “Atss-Net: Target speaker separation via attention-based neural network.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2020-October:1411–15, 2020. https://doi.org/10.21437/Interspeech.2020-1436.
ICMJE
Li T, Lin Q, Bao Y, Li M. Atss-Net: Target speaker separation via attention-based neural network. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. p. 1411–5.
MLA
Li, T., et al. “Atss-Net: Target speaker separation via attention-based neural network.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2020-October, 2020, pp. 1411–15. Scopus, doi:10.21437/Interspeech.2020-1436.
NLM
Li T, Lin Q, Bao Y, Li M. Atss-Net: Target speaker separation via attention-based neural network. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. p. 1411–1415.