Insights into End-to-End Learning Scheme for Language Identification

Publication ,  Conference
Cai, W; Cai, Z; Liu, W; Wang, X; Li, M
Published in: ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
September 10, 2018

A novel interpretable end-to-end learning scheme for language identification is proposed. It is in line with the classical GMM i-vector methods both theoretically and practically. In the end-to-end pipeline, a general encoding layer is placed on top of the front-end CNN so that it automatically encodes the variable-length input sequence into an utterance-level vector. After comparing with the state-of-the-art GMM i-vector methods, we give insights into the CNN and reveal its role and effect in the whole pipeline. We further introduce the general encoding layer and illustrate why it might be appropriate for language identification. We elaborate on several typical encoding layers, including a temporal average pooling layer, a recurrent encoding layer, and a novel learnable dictionary encoding layer. We conducted experiments on the NIST LRE07 closed-set task, and the results show that our proposed end-to-end systems achieve state-of-the-art performance.
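As a rough illustration of what "encoding a variable-length input sequence into an utterance-level vector" can look like, the sketch below shows two encoding layers in NumPy: temporal average pooling and a simplified dictionary-based encoding. The abstract gives no equations, so the soft-assignment weighting, the function names, and all shapes here are assumptions chosen for illustration. They are not the authors' implementation, and the dictionary parameters are random rather than learned.

```python
import numpy as np

def temporal_average_pooling(features):
    """Average frame-level features of shape (T, D) over time -> (D,)."""
    return features.mean(axis=0)

def dictionary_encoding(features, centers, scales):
    """Hypothetical dictionary encoding (assumed formulation, not from the abstract).

    features: (T, D) frame-level features, e.g. CNN outputs
    centers:  (C, D) dictionary components (learned in a real system)
    scales:   (C,)   positive smoothing factors (learned in a real system)
    Returns a fixed-length vector of shape (C * D,), independent of T.
    """
    residuals = features[:, None, :] - centers[None, :, :]    # (T, C, D)
    sq_dist = (residuals ** 2).sum(axis=2)                    # (T, C)
    logits = -scales[None, :] * sq_dist
    logits -= logits.max(axis=1, keepdims=True)               # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)             # soft assignment per frame
    encoded = (weights[:, :, None] * residuals).mean(axis=0)  # (C, D)
    return encoded.reshape(-1)

rng = np.random.default_rng(0)
centers = rng.normal(size=(4, 8))     # 4 components, 8-dim features (arbitrary sizes)
scales = np.ones(4)
short_utt = rng.normal(size=(50, 8))  # 50 frames
long_utt = rng.normal(size=(300, 8))  # 300 frames

# Both utterances map to vectors of the same length despite different durations.
print(temporal_average_pooling(short_utt).shape, temporal_average_pooling(long_utt).shape)
print(dictionary_encoding(short_utt, centers, scales).shape,
      dictionary_encoding(long_utt, centers, scales).shape)
```

In both cases the output size depends only on the feature dimension (and the number of dictionary components), not on the number of frames, which is the property that lets a classifier operate on utterances of any length.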

Published In

ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings

DOI

10.1109/ICASSP.2018.8462026

ISSN

1520-6149

Publication Date

September 10, 2018

Volume

2018-April

Start / End Page

5209 / 5213
 

Citation

APA: Cai, W., Cai, Z., Liu, W., Wang, X., & Li, M. (2018). Insights into End-to-End Learning Scheme for Language Identification. In ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings (Vol. 2018-April, pp. 5209–5213). https://doi.org/10.1109/ICASSP.2018.8462026

Chicago: Cai, W., Z. Cai, W. Liu, X. Wang, and M. Li. “Insights into End-to-End Learning Scheme for Language Identification.” In ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2018-April:5209–13, 2018. https://doi.org/10.1109/ICASSP.2018.8462026.

ICMJE: Cai W, Cai Z, Liu W, Wang X, Li M. Insights into End-to-End Learning Scheme for Language Identification. In: ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings. 2018. p. 5209–13.

MLA: Cai, W., et al. “Insights into End-to-End Learning Scheme for Language Identification.” ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, vol. 2018-April, 2018, pp. 5209–13. Scopus, doi:10.1109/ICASSP.2018.8462026.

NLM: Cai W, Cai Z, Liu W, Wang X, Li M. Insights into End-to-End Learning Scheme for Language Identification. ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings. 2018. p. 5209–5213.
