Intoxicated speech detection by fusion of speaker normalized hierarchical features and GMM supervectors

Publication, Conference

Bone, D; Black, MP; Li, M; Metallinou, A; Lee, S; Narayanan, SS

Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

December 1, 2011

Speaker state recognition is a challenging problem due to speaker and context variability. Intoxication detection is an important area of paralinguistic speech research with potential real-world applications. In this work, we build upon a base set of various static acoustic features by proposing the combination of several different methods for this learning task. The methods include extracting hierarchical acoustic features, performing iterative speaker normalization, and using a set of GMM supervectors. We obtain an optimal unweighted recall for intoxication recognition using score-level fusion of these subsystems. Unweighted average recall performance is 70.54% on the test set, an improvement of 4.64% absolute (7.04% relative) over the baseline model accuracy of 65.9%. Copyright © 2011 ISCA.
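The metric and the improvement figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual definition of unweighted average recall (the mean of per-class recalls) and treats the 65.9% baseline, which the abstract calls accuracy, as directly comparable to the 70.54% figure. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally regardless of its size."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels (hypothetical, not data from the paper): 1 = intoxicated, 0 = sober.
toy_truth = [1, 1, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 1]
toy_uar = unweighted_average_recall(toy_truth, toy_pred)

# Figures quoted in the abstract, in percent.
fused_uar = 70.54
baseline_uar = 65.9
absolute_gain = fused_uar - baseline_uar            # percentage points
relative_gain = 100 * absolute_gain / baseline_uar  # percent of the baseline

print(f"toy UAR: {toy_uar:.3f}")
print(f"absolute gain: {absolute_gain:.2f}, relative gain: {relative_gain:.2f}%")
```

On the toy labels the per-class recalls are 1/2 and 3/4, so the unweighted average differs from plain accuracy (4/6) because the smaller class carries equal weight. The arithmetic on the quoted figures reproduces the 4.64% absolute and 7.04% relative improvement stated in the abstract.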

Duke Scholars

Author Ming Li DKU Faculty

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

EISSN

1990-9772

Publication Date

December 1, 2011

Start / End Page

3217 / 3220

Citation

APA: Bone, D., Black, M. P., Li, M., Metallinou, A., Lee, S., & Narayanan, S. S. (2011). Intoxicated speech detection by fusion of speaker normalized hierarchical features and GMM supervectors. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 3217–3220).

Chicago: Bone, D., M. P. Black, M. Li, A. Metallinou, S. Lee, and S. S. Narayanan. “Intoxicated speech detection by fusion of speaker normalized hierarchical features and GMM supervectors.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 3217–20, 2011.

ICMJE: Bone D, Black MP, Li M, Metallinou A, Lee S, Narayanan SS. Intoxicated speech detection by fusion of speaker normalized hierarchical features and GMM supervectors. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2011. p. 3217–20.

MLA: Bone, D., et al. “Intoxicated speech detection by fusion of speaker normalized hierarchical features and GMM supervectors.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2011, pp. 3217–20.

NLM: Bone D, Black MP, Li M, Metallinou A, Lee S, Narayanan SS. Intoxicated speech detection by fusion of speaker normalized hierarchical features and GMM supervectors. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2011. p. 3217–3220.
