Intelligibility classification of pathological speech using fusion of multiple subsystems

Publication, Conference
Kim, J; Kumar, N; Tsiartas, A; Li, M; Narayanan, SS
Published in: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
December 1, 2012

Pathological speech usually refers to speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. Automatic evaluation of speech intelligibility and quality could assist in diagnosis and treatment design in these scenarios, but the many sources and types of variability often make it a very challenging computational processing problem. In this work we design multiple subsystems to address different aspects of pathological speech characteristics. These subsystems are then fused at the binary hard score level (intelligible or not intelligible) using Bayesian networks. Results show that subsystems such as the multiple language phoneme probability subsystem, the prosodic and intonational subsystem, and the voice quality and pronunciation subsystem have discriminating power for intelligibility (9.8%, 17.1%, and 14.6% above chance, respectively). Noisy-Majority based fusion achieves 66.4% accuracy, but fusion yields no performance improvement. In addition, voice-clustering-based joint classification is applied to minimize misclassification by the best subsystem, and it achieves the best classification accuracy (79.9% on the dev set, 76.8% on the test set).
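To illustrate what fusion at the binary hard score level means, the following is a minimal sketch of plain majority voting over per-subsystem decisions. It is not the Noisy-Majority Bayesian network described in the abstract; it only shows the general idea of combining hard labels. The function names and the example decisions are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def majority_vote(decisions: Sequence[int]) -> int:
    """Fuse binary hard decisions (1 = intelligible, 0 = not intelligible)."""
    if not decisions:
        raise ValueError("need at least one subsystem decision")
    ones = sum(decisions)
    # Assumption: ties (possible with an even number of subsystems) map to 0.
    return 1 if ones * 2 > len(decisions) else 0


def fuse_utterances(per_subsystem: Sequence[Sequence[int]]) -> List[int]:
    """per_subsystem[i][j] is subsystem i's hard decision for utterance j."""
    # zip(*...) regroups the decisions so each tuple covers one utterance.
    return [majority_vote(column) for column in zip(*per_subsystem)]


# Hypothetical hard decisions from three subsystems on four utterances.
phoneme = [1, 0, 1, 1]
prosody = [1, 1, 0, 1]
voice = [0, 0, 1, 1]

fused = fuse_utterances([phoneme, prosody, voice])
print(fused)
```

A Bayesian-network fusion would instead learn how reliable each subsystem is from held-out data, which a fixed vote like this cannot capture.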

Published In: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012

Publication Date: December 1, 2012

Volume: 1

Start / End Page: 534 / 537
 

Citation

APA: Kim, J., Kumar, N., Tsiartas, A., Li, M., & Narayanan, S. S. (2012). Intelligibility classification of pathological speech using fusion of multiple subsystems. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (Vol. 1, pp. 534–537).
Chicago: Kim, J., N. Kumar, A. Tsiartas, M. Li, and S. S. Narayanan. “Intelligibility classification of pathological speech using fusion of multiple subsystems.” In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, 1:534–37, 2012.
ICMJE: Kim J, Kumar N, Tsiartas A, Li M, Narayanan SS. Intelligibility classification of pathological speech using fusion of multiple subsystems. In: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012. 2012. p. 534–7.
MLA: Kim, J., et al. “Intelligibility classification of pathological speech using fusion of multiple subsystems.” 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, vol. 1, 2012, pp. 534–37.
NLM: Kim J, Kumar N, Tsiartas A, Li M, Narayanan SS. Intelligibility classification of pathological speech using fusion of multiple subsystems. 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012. 2012. p. 534–537.
