Speaker verification using Lasso based sparse total variability supervector with PLDA modeling

Publication, Conference
Li, M; Lu, C; Wang, A; Narayanan, S
Published in: 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012
December 1, 2012

In this paper, we propose a Lasso-based framework to generate sparse total variability supervectors (s-vectors). Unlike the factor analysis framework, which uses a low-dimensional Eigenvoice subspace to represent the mean supervector, the proposed Lasso approach uses l1-norm regularized least squares estimation to project the mean supervector onto a pre-defined dictionary. The number of samples in this dictionary is appreciably larger than the typical Eigenvoice rank, but the l1 norm of the Lasso solution vector is constrained. As a result, only a small number of dictionary samples are selected to represent the mean supervector, and most of the dictionary coefficients in the Lasso solution are 0. We denote these sparse dictionary coefficient vectors as s-vectors and model them using probabilistic linear discriminant analysis (PLDA) for speaker verification. The proposed approach produces results comparable to the conventional i-vector system with cosine distance scoring, and fusing it with either the i-vector system or the joint factor analysis (JFA) system yields further improvement. Experimental results are reported on the female part of the NIST SRE 2010 task, common condition 5, using equal error rate (EER), norm old minDCF and norm new minDCF values. The norm new minDCF cost was reduced by 7.5% and 9.6% relative when fusing the proposed approach with the baseline JFA and i-vector systems, respectively. Similarly, relative norm old minDCF cost reductions of 8.3% and 10.7% were observed in the fusion. © 2012 APSIPA.
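
The sparse projection described in the abstract can be illustrated with a small sketch. It assumes the penalized form of the Lasso, minimizing 0.5 * ||m - D s||^2 + lam * ||s||_1, and solves it with plain cyclic coordinate descent; the abstract does not state which solver, dictionary, or regularization weight the authors used, so the solver choice, the random toy dictionary, the value of `lam`, and all function and variable names here are hypothetical. It only shows how an l1 penalty drives most dictionary coefficients to exactly 0.

```python
import random


def soft_threshold(x, t):
    """Shrink x toward zero by t; values inside [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_objective(atoms, m, s, lam):
    """Return 0.5 * ||m - D s||^2 + lam * ||s||_1 for dictionary atoms D."""
    recon = [sum(s[k] * atoms[k][i] for k in range(len(atoms)))
             for i in range(len(m))]
    squared_error = sum((m[i] - recon[i]) ** 2 for i in range(len(m)))
    return 0.5 * squared_error + lam * sum(abs(c) for c in s)


def lasso_coordinate_descent(atoms, m, lam, n_sweeps=100):
    """Approximately solve the Lasso by cyclic coordinate descent.

    atoms: list of K dictionary samples, each a list of length d
    m:     target vector of length d (stands in for a mean supervector)
    lam:   weight of the l1 penalty (assumed value, not from the paper)
    """
    s = [0.0] * len(atoms)
    residual = list(m)  # equals m - D s, and s starts at zero
    atom_sq_norms = [sum(a * a for a in atom) for atom in atoms]
    for _ in range(n_sweeps):
        for k, atom in enumerate(atoms):
            if atom_sq_norms[k] == 0.0:
                continue
            # Correlation of atom k with the residual that excludes atom k.
            rho = sum(a * r for a, r in zip(atom, residual))
            rho += atom_sq_norms[k] * s[k]
            new_coef = soft_threshold(rho, lam) / atom_sq_norms[k]
            delta = new_coef - s[k]
            if delta != 0.0:
                for i, a in enumerate(atom):
                    residual[i] -= delta * a
                s[k] = new_coef
    return s


# Toy data (hypothetical): 60 random dictionary samples in 20 dimensions.
rng = random.Random(0)
dim, n_atoms = 20, 60
dictionary = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(n_atoms)]

# Toy "mean supervector": a mix of three dictionary samples plus small noise.
mean_supervector = [
    1.5 * dictionary[3][i] - 2.0 * dictionary[17][i]
    + 1.0 * dictionary[42][i] + 0.01 * rng.gauss(0.0, 1.0)
    for i in range(dim)
]

# Any lam at or above max_corr forces the all-zero solution, so use a fraction.
max_corr = max(abs(sum(a * x for a, x in zip(atom, mean_supervector)))
               for atom in dictionary)
lam = 0.1 * max_corr

s_vector = lasso_coordinate_descent(dictionary, mean_supervector, lam)
n_nonzero = sum(1 for c in s_vector if c != 0.0)
print(f"{n_nonzero} of {n_atoms} coefficients are non-zero")
```

Raising `lam` makes the coefficient vector sparser, and lowering it keeps more dictionary samples active. In the paper, vectors of this kind are then modeled with PLDA, which this sketch does not cover.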

Citation

APA: Li, M., Lu, C., Wang, A., & Narayanan, S. (2012). Speaker verification using Lasso based sparse total variability supervector with PLDA modeling. In 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012.
Chicago: Li, M., C. Lu, A. Wang, and S. Narayanan. “Speaker verification using Lasso based sparse total variability supervector with PLDA modeling.” In 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012, 2012.
ICMJE: Li M, Lu C, Wang A, Narayanan S. Speaker verification using Lasso based sparse total variability supervector with PLDA modeling. In: 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012. 2012.
MLA: Li, M., et al. “Speaker verification using Lasso based sparse total variability supervector with PLDA modeling.” 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012, 2012.
NLM: Li M, Lu C, Wang A, Narayanan S. Speaker verification using Lasso based sparse total variability supervector with PLDA modeling. 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012. 2012.
