Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features

Publication, Conference
Cai, Z; Yang, Y; Zhang, C; Qin, X; Li, M
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2019

This paper describes a conditional neural network architecture for Mandarin Chinese polyphone disambiguation. The system consists of a bidirectional recurrent neural network that acts as a sentence encoder to capture contextual correlations, followed by a prediction network that maps the polyphonic character embeddings, together with the conditions, to the corresponding pronunciations. We obtain the word-level condition from a pre-trained word-to-vector lookup table. One goal of polyphone disambiguation is to address the homograph problem in the front-end processing of Mandarin Chinese text-to-speech systems. Our system achieves an accuracy of 94.69% on a publicly available polyphonic character dataset. To further validate our choice of conditional features, we also investigate polyphone disambiguation systems that use the conditions at each level separately. The experimental results show that both the sentence-level and the word-level conditional embedding features achieve good performance for Mandarin Chinese polyphone disambiguation.
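The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plain tanh recurrence, the layer sizes, the toy vocabulary, the `CANDIDATES` table, and the way the features are concatenated are all assumptions made for illustration, and the weights are random, so the output probabilities carry no meaning. A real system would train the weights and take the word-level condition from a pre-trained word-vector table.

```python
import math
import random

random.seed(0)
EMB, HID, COND = 8, 6, 4  # hypothetical sizes, not taken from the paper

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_pass(inputs, w_in, w_rec):
    """Simple tanh RNN (assumed cell type); returns the hidden state at each step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states

def encode_bidirectional(inputs, fwd, bwd):
    """Concatenate forward and backward states so each position sees both contexts."""
    forward = rnn_pass(inputs, *fwd)
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary and candidate pronunciations for one polyphone.
CHARS = {"我": 0, "很": 1, "快": 2, "乐": 3, "音": 4}
CANDIDATES = {"乐": ["le4", "yue4"]}

char_emb = rand_matrix(len(CHARS), EMB)
fwd = (rand_matrix(HID, EMB), rand_matrix(HID, HID))
bwd = (rand_matrix(HID, EMB), rand_matrix(HID, HID))
# One output layer per polyphonic character, restricted to its own candidates.
out_w = {ch: rand_matrix(len(prons), 2 * HID + EMB + COND)
         for ch, prons in CANDIDATES.items()}

def predict(sentence, position, word_condition):
    """Score the candidate pronunciations of sentence[position]."""
    embeddings = [char_emb[CHARS[c]] for c in sentence]
    context = encode_bidirectional(embeddings, fwd, bwd)
    target = sentence[position]
    # Assumed conditioning: sentence-level state + character embedding + word vector.
    features = context[position] + embeddings[position] + word_condition
    probs = softmax(matvec(out_w[target], features))
    return dict(zip(CANDIDATES[target], probs))

result = predict("我很快乐", 3, [0.0] * COND)
```

Here `result` maps each candidate pronunciation of 乐 to a probability. Restricting the softmax to the candidates of the target character keeps the model from ever predicting a pronunciation that the character cannot have.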

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

DOI

10.21437/Interspeech.2019-1235

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2019

Volume

2019-September

Start / End Page

2110 / 2114
 

Citation

APA: Cai, Z., Yang, Y., Zhang, C., Qin, X., & Li, M. (2019). Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2019-September, pp. 2110–2114). https://doi.org/10.21437/Interspeech.2019-1235

Chicago: Cai, Z., Y. Yang, C. Zhang, X. Qin, and M. Li. “Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019-September:2110–14, 2019. https://doi.org/10.21437/Interspeech.2019-1235.

ICMJE: Cai Z, Yang Y, Zhang C, Qin X, Li M. Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2019. p. 2110–4.

MLA: Cai, Z., et al. “Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-September, 2019, pp. 2110–14. Scopus, doi:10.21437/Interspeech.2019-1235.

NLM: Cai Z, Yang Y, Zhang C, Qin X, Li M. Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2019. p. 2110–2114.
