BiSinger: Bilingual Singing Voice Synthesis

Publication, Conference

Zhou, H; Lin, Y; Shi, Y; Sun, P; Li, M

Published in: 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023)

January 1, 2023

Although Singing Voice Synthesis (SVS) has made great strides with Text-to-Speech (TTS) techniques, multilingual singing voice modeling remains relatively unexplored. This paper presents BiSinger, a bilingual pop SVS system for English and Chinese Mandarin. Current systems require separate models per language and cannot accurately represent both Chinese and English, hindering code-switch SVS. To address this gap, we design a shared representation between Chinese and English singing voices, achieved by using the CMU dictionary with mapping rules. We fuse monolingual singing datasets with open-source singing voice conversion techniques to generate bilingual singing voices while also exploring the potential use of bilingual speech data. Experiments affirm that our language-independent representation and incorporation of related datasets enable a single model with enhanced performance in English and code-switch SVS while maintaining Chinese song performance. Audio samples are available at https://bisinger-svs.github.io.

Duke Scholars

Author: Yueqian Lin

Author: Ming Li (DKU Faculty)

Published In

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023)

DOI

10.1109/ASRU57964.2023.10389659

Publication Date

January 1, 2023

Citation

APA

Zhou, H., Lin, Y., Shi, Y., Sun, P., & Li, M. (2023). BiSinger: Bilingual Singing Voice Synthesis. In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023). https://doi.org/10.1109/ASRU57964.2023.10389659

Chicago

Zhou, H., Y. Lin, Y. Shi, P. Sun, and M. Li. “BiSinger: Bilingual Singing Voice Synthesis.” In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023), 2023. https://doi.org/10.1109/ASRU57964.2023.10389659.

ICMJE

Zhou H, Lin Y, Shi Y, Sun P, Li M. BiSinger: Bilingual Singing Voice Synthesis. In: 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023). 2023.

MLA

Zhou, H., et al. “BiSinger: Bilingual Singing Voice Synthesis.” 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023), 2023. Scopus, doi:10.1109/ASRU57964.2023.10389659.

NLM

Zhou H, Lin Y, Shi Y, Sun P, Li M. BiSinger: Bilingual Singing Voice Synthesis. 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023). 2023.
