An end-to-end deep learning framework with speech emotion recognition of atypical individuals

Publication, Conference
Tang, D; Zeng, J; Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2018

The goal of the ongoing ComParE 2018 Atypical Affect sub-challenge is to recognize the emotional states of atypical individuals. In this work, we present three modeling methods under the end-to-end learning framework: CNN combined with extended features, CNN+RNN, and ResNet. Furthermore, we investigate multiple data augmentation, balancing, and sampling methods to further enhance system performance. The experimental results show that data balancing and augmentation increase the unweighted average recall (UAR) by 10% absolute. After score-level fusion, our proposed system achieves 48.8% UAR on the development set.
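
For readers unfamiliar with the two evaluation terms in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that UAR is computed in the conventional way, as the mean of per-class recalls, and that score-level fusion is a weighted average of per-class scores from several systems. The function names, the equal default weights, and the toy data are hypothetical; the record does not state the fusion weights actually used.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def fuse_scores(system_scores, weights=None):
    """Weighted average of per-class score vectors, one vector per system.

    Equal weights are an assumption made here for illustration only.
    """
    n = len(system_scores)
    if weights is None:
        weights = [1.0 / n] * n
    num_classes = len(system_scores[0])
    return [
        sum(w * scores[k] for w, scores in zip(weights, system_scores))
        for k in range(num_classes)
    ]


# Toy imbalanced example: class 0 has four samples, class 1 has two.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
uar = unweighted_average_recall(y_true, y_pred)  # (1.0 + 0.5) / 2

# Fuse two hypothetical systems' scores for one utterance, then pick a class.
fused = fuse_scores([[0.2, 0.8], [0.6, 0.4]])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In the toy example, plain accuracy would be 5/6 while UAR is 0.75, which illustrates why a class-balanced metric is sensitive to the data balancing the abstract describes.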

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

DOI

10.21437/Interspeech.2018-2581

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2018

Volume

2018-September

Start / End Page

162 / 166

Citation

APA: Tang, D., Zeng, J., & Li, M. (2018). An end-to-end deep learning framework with speech emotion recognition of atypical individuals. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2018-September, pp. 162–166). https://doi.org/10.21437/Interspeech.2018-2581

Chicago: Tang, D., J. Zeng, and M. Li. “An end-to-end deep learning framework with speech emotion recognition of atypical individuals.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2018-September:162–66, 2018. https://doi.org/10.21437/Interspeech.2018-2581.

ICMJE: Tang D, Zeng J, Li M. An end-to-end deep learning framework with speech emotion recognition of atypical individuals. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2018. p. 162–6.

MLA: Tang, D., et al. “An end-to-end deep learning framework with speech emotion recognition of atypical individuals.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2018-September, 2018, pp. 162–66. Scopus, doi:10.21437/Interspeech.2018-2581.

NLM: Tang D, Zeng J, Li M. An end-to-end deep learning framework with speech emotion recognition of atypical individuals. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2018. p. 162–166.
