Insights into End-to-End Learning Scheme for Language Identification

Publication, Conference
Cai, W; Cai, Z; Liu, W; Wang, X; Li, M
Published in: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
September 10, 2018

A novel interpretable end-to-end learning scheme for language identification is proposed. It is in line with the classical GMM i-vector methods both theoretically and practically. In the end-to-end pipeline, a general encoding layer is employed on top of the front-end CNN so that the variable-length input sequence is automatically encoded into an utterance-level vector. After comparing with the state-of-the-art GMM i-vector methods, we give insights into the CNN and reveal its role and effect in the whole pipeline. We further introduce a general encoding layer and illustrate why it might be appropriate for language identification. We elaborate on several typical encoding layers, including a temporal average pooling layer, a recurrent encoding layer, and a novel learnable dictionary encoding layer. We conducted experiments on the NIST LRE07 closed-set task, and the results show that our proposed end-to-end systems achieve state-of-the-art performance.
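The idea of an encoding layer that maps a variable-length sequence of frame-level features to a fixed-length utterance-level vector can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain-list feature representation, and in particular the soft-assignment formula used in `dictionary_encoding` (a softmax over negative scaled squared distances, followed by averaging the weighted residuals over time) are assumptions chosen for illustration. In a trained system the `centers` and `scales` would be learned parameters and the frames would come from the front-end CNN.

```python
import math


def temporal_average_pooling(frames):
    """Average frame-level features over time into one utterance-level vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def dictionary_encoding(frames, centers, scales):
    """Illustrative dictionary-style encoding (assumed form, not from the abstract).

    Each frame is softly assigned to every dictionary component, and the
    weighted residuals (frame minus component center) are averaged over time.
    The output length is len(centers) * feature_dim, independent of the
    number of frames.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n_frames = len(frames)
    dim = len(centers[0])
    encoded = [[0.0] * dim for _ in centers]
    for x in frames:
        residuals = [[xi - mi for xi, mi in zip(x, mu)] for mu in centers]
        # Assumed soft assignment: softmax of negative scaled squared distance.
        logits = [-s * sum(r * r for r in res) for s, res in zip(scales, residuals)]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        for c, (e, res) in enumerate(zip(exps, residuals)):
            weight = e / total
            for d in range(dim):
                encoded[c][d] += weight * res[d] / n_frames
    # Concatenate the per-component vectors into one fixed-length vector.
    return [v for component in encoded for v in component]


# Two utterances of different lengths (2 and 4 frames, 2-dimensional features).
short_utt = [[1.0, 2.0], [3.0, 4.0]]
long_utt = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]  # hypothetical dictionary components
scales = [1.0, 1.0]                 # hypothetical per-component scales

tap_short = temporal_average_pooling(short_utt)
tap_long = temporal_average_pooling(long_utt)
enc_short = dictionary_encoding(short_utt, centers, scales)
enc_long = dictionary_encoding(long_utt, centers, scales)
```

Both utterances yield vectors of the same length under each encoder (2 values for average pooling, 4 for the two-component dictionary), which is the property that lets a fixed-size classifier follow the encoding layer.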

Published In

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

DOI

10.1109/ICASSP.2018.8462026

ISSN

1520-6149

Publication Date

September 10, 2018

Volume

2018-April

Start / End Page

5209 / 5213

Citation

APA
Cai, W., Cai, Z., Liu, W., Wang, X., & Li, M. (2018). Insights into End-to-End Learning Scheme for Language Identification. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 2018-April, pp. 5209–5213). https://doi.org/10.1109/ICASSP.2018.8462026

Chicago
Cai, W., Z. Cai, W. Liu, X. Wang, and M. Li. “Insights into End-to-End Learning Scheme for Language Identification.” In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2018-April:5209–13, 2018. https://doi.org/10.1109/ICASSP.2018.8462026.

ICMJE
Cai W, Cai Z, Liu W, Wang X, Li M. Insights into End-to-End Learning Scheme for Language Identification. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2018. p. 5209–13.

MLA
Cai, W., et al. “Insights into End-to-End Learning Scheme for Language Identification.” ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2018-April, 2018, pp. 5209–13. Scopus, doi:10.1109/ICASSP.2018.8462026.

NLM
Cai W, Cai Z, Liu W, Wang X, Li M. Insights into End-to-End Learning Scheme for Language Identification. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2018. p. 5209–5213.
