Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens

Publication, Conference
Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2014

This paper presents an automatic approach to recognizing speaker physical load using posterior probability based features from acoustic and phonetic tokens. In this method, the tokens used to calculate the posterior probabilities, or zero-order statistics, are extended from the components of conventional MFCC-trained Gaussian Mixture Models (GMMs) to parallel phonetic phonemes and the components of GMMs trained on tandem features. Phoneme recognizers from five different languages are employed to extract the phoneme posterior probabilities. We show that these histogram-style features at both the acoustic and phonetic levels are effective and complementary for capturing speaker physical load information from short utterances. A support vector machine is adopted as the supervised classifier. Combining the proposed methods with the OpenSMILE baseline, which covers acoustic and prosodic information, further improves the final performance. The proposed fusion system achieves 70.18% and 72.81% unweighted accuracy on the validation and test sets, respectively, of the Munich Bio-voice Corpus for the binary physical load level recognition task in the INTERSPEECH 2014 Computational Paralinguistics Challenge.
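As a rough illustration of the histogram-style features described in the abstract, the sketch below averages per-frame GMM component posteriors into one vector per utterance, and scores predictions with an unweighted accuracy. It is not the paper's implementation. The diagonal-covariance GMM, the normalization by frame count, the reading of "unweighted accuracy" as the mean of per-class recalls, and all function names are assumptions made for this example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x (assumed model form)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def frame_posteriors(x, weights, means, variances):
    """Posterior probability of each GMM component given a single frame."""
    log_joint = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lj - peak) for lj in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def zero_order_stats(frames, weights, means, variances):
    """Average the per-frame posteriors into one histogram-style vector.

    Dividing by the number of frames is an assumption of this sketch;
    it makes the vector sum to one regardless of utterance length.
    """
    acc = [0.0] * len(weights)
    for x in frames:
        for k, p in enumerate(frame_posteriors(x, weights, means, variances)):
            acc[k] += p
    return [a / len(frames) for a in acc]


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally (assumed definition)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)
```

The same averaging would apply to phoneme posteriors from a phoneme recognizer in place of GMM component posteriors. The resulting vectors could then be passed to any supervised classifier, such as the support vector machine mentioned in the abstract.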

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

DOI

10.21437/interspeech.2014-106

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2014

Start / End Page

437 / 441
 

Citation

APA: Li, M. (2014). Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 437–441). https://doi.org/10.21437/interspeech.2014-106

Chicago: Li, M. “Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 437–41, 2014. https://doi.org/10.21437/interspeech.2014-106.

ICMJE: Li M. Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2014. p. 437–41.

MLA: Li, M. “Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2014, pp. 437–41. Scopus, doi:10.21437/interspeech.2014-106.

NLM: Li M. Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2014. p. 437–441.
