
Target Speaker Extraction with Curriculum Learning

Publication, Conference
Liu, Y; Liu, X; Miao, X; Yamagishi, J
Published in: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech
January 1, 2024

This paper presents a novel approach to target speaker extraction (TSE) using Curriculum Learning (CL) techniques, addressing the challenge of distinguishing a target speaker's voice from a mixture containing interfering speakers. For efficient training, we propose a curriculum that selects training subsets of increasing complexity, for example with increasing similarity between the target and interfering speakers. Our CL strategies include variants that use predefined difficulty measures (e.g. gender, speaker similarity, and signal-to-distortion ratio) and variants that use the TSE model's standard objective function as the difficulty measure, each designed to expose the model gradually to more challenging scenarios. Comprehensive testing on the Libri2talker dataset demonstrated that our CL strategies improved TSE performance, exceeding baseline models trained without CL by about 1 dB.
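The abstract does not specify how the curriculum is scheduled, so the following is only a minimal sketch of the general idea, assuming a predefined difficulty measure (here, target/interferer speaker similarity) and cumulative stages that grow from the easiest examples to the full training set. The `Mixture` class, the `speaker_similarity` field, and the `curriculum_stages` function are hypothetical names introduced for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Mixture:
    """One hypothetical training example: a two-speaker mixture."""
    utt_id: str
    # Assumed difficulty measure, e.g. cosine similarity between the
    # target and interfering speakers' embeddings (higher = harder).
    speaker_similarity: float


def curriculum_stages(examples: Sequence[Mixture],
                      num_stages: int) -> List[List[Mixture]]:
    """Return cumulative training subsets ordered from easy to hard.

    Stage k holds the easiest k/num_stages fraction of the data, so the
    last stage is the full training set. This staging rule is an
    assumption, not the schedule used in the paper.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Rank examples from least to most similar speakers (easy to hard).
    ranked = sorted(examples, key=lambda m: m.speaker_similarity)
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = max(1, (len(ranked) * k) // num_stages)
        stages.append(ranked[:cutoff])
    return stages
```

A training loop under this sketch would run one or more epochs on each stage in order. A loss-based variant, as mentioned in the abstract, could replace `speaker_similarity` with each example's current value of the TSE objective.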

Published In

Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

DOI

10.21437/Interspeech.2024-1375

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2024

Start / End Page

4348 / 4352
 

Citation

APA: Liu, Y., Liu, X., Miao, X., & Yamagishi, J. (2024). Target Speaker Extraction with Curriculum Learning. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech (pp. 4348–4352). https://doi.org/10.21437/Interspeech.2024-1375
Chicago: Liu, Y., X. Liu, X. Miao, and J. Yamagishi. “Target Speaker Extraction with Curriculum Learning.” In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 4348–52, 2024. https://doi.org/10.21437/Interspeech.2024-1375.
ICMJE: Liu Y, Liu X, Miao X, Yamagishi J. Target Speaker Extraction with Curriculum Learning. In: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2024. p. 4348–52.
MLA: Liu, Y., et al. “Target Speaker Extraction with Curriculum Learning.” Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2024, pp. 4348–52. Scopus, doi:10.21437/Interspeech.2024-1375.
NLM: Liu Y, Liu X, Miao X, Yamagishi J. Target Speaker Extraction with Curriculum Learning. Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2024. p. 4348–4352.
