Automatic language identification with discriminative language characterization based on SVM

Publication, Journal Article
Suo, H; Li, M; Lu, P; Yan, Y
Published in: IEICE Transactions on Information and Systems
January 1, 2008

Robust automatic language identification (LID) is the task of identifying the language of a short utterance spoken by an unknown speaker. The mainstream approaches include parallel phone recognition language modeling (PPRLM), support vector machines (SVMs), and general Gaussian mixture models (GMMs). These systems use classifiers to map the cepstral features of spoken utterances into high-level scores. In this paper, to increase the dimension of the score vector and alleviate inter-speaker variability within the same language, multiple data groups based on supervised speaker clustering are employed to generate discriminative language characterization score vectors (DLCSV). Back-end SVM classifiers are then used to model the probability distribution of each target language in the DLCSV space. Finally, the output scores of the back-end classifiers are calibrated by a pair-wise posterior probability estimation (PPPE) algorithm. The proposed language identification frameworks are evaluated on the 2003 NIST Language Recognition Evaluation (LRE) databases, and the experiments show that the system described in this paper produces results comparable to those of existing systems. In particular, the SVM framework achieves an equal error rate (EER) of 4.0% in the 30-second task and outperforms state-of-the-art systems by more than 30% relative error reduction. In addition, the proposed PPRLM and GMM systems achieve EERs of 5.1% and 5.0%, respectively. Copyright © 2008 The Institute of Electronics, Information and Communication Engineers.
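The two metrics quoted in the abstract, equal error rate and relative error reduction, can be illustrated with a minimal sketch. This is a generic threshold sweep, not the NIST LRE scoring procedure or the authors' code. The function names, the toy scores, and the 6.0% baseline EER are all hypothetical and chosen only to show the arithmetic.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    Miss rate: fraction of target trials rejected.
    False-alarm rate: fraction of non-target trials accepted.
    The EER is taken where the two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + false_alarm) / 2
    return best_eer


def relative_error_reduction(baseline_eer, new_eer):
    """Fraction by which new_eer improves on baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


# Toy scores (hypothetical, not from the paper).
toy_targets = [0.9, 0.8, 0.7, 0.3]
toy_nontargets = [0.6, 0.4, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)

# Hypothetical baseline of 6.0% EER against the reported 4.0% EER.
toy_reduction = relative_error_reduction(0.06, 0.04)
```

With a hypothetical baseline of 6.0%, an EER of 4.0% corresponds to a relative reduction of about one third, which is the kind of comparison the "more than 30%" figure expresses. The abstract does not state the baseline EER itself.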

Published In

IEICE Transactions on Information and Systems

DOI

10.1093/ietisy/e91-d.3.567

EISSN

1745-1361

ISSN

0916-8532

Publication Date

January 1, 2008

Volume

E91-D

Issue

3

Start / End Page

567 / 575

Related Subject Headings

  • Information Systems
  • 46 Information and computing sciences
  • 1801 Law
  • 0906 Electrical and Electronic Engineering
  • 0806 Information Systems
 

Citation

APA
Suo, H., Li, M., Lu, P., & Yan, Y. (2008). Automatic language identification with discriminative language characterization based on SVM. IEICE Transactions on Information and Systems, E91-D(3), 567–575. https://doi.org/10.1093/ietisy/e91-d.3.567

Chicago
Suo, H., M. Li, P. Lu, and Y. Yan. “Automatic language identification with discriminative language characterization based on SVM.” IEICE Transactions on Information and Systems E91-D, no. 3 (January 1, 2008): 567–75. https://doi.org/10.1093/ietisy/e91-d.3.567.

ICMJE
Suo H, Li M, Lu P, Yan Y. Automatic language identification with discriminative language characterization based on SVM. IEICE Transactions on Information and Systems. 2008 Jan 1;E91-D(3):567–75.

MLA
Suo, H., et al. “Automatic language identification with discriminative language characterization based on SVM.” IEICE Transactions on Information and Systems, vol. E91-D, no. 3, Jan. 2008, pp. 567–75. Scopus, doi:10.1093/ietisy/e91-d.3.567.

NLM
Suo H, Li M, Lu P, Yan Y. Automatic language identification with discriminative language characterization based on SVM. IEICE Transactions on Information and Systems. 2008 Jan 1;E91-D(3):567–575.
