Selected Publications


GenAI at the Edge: Comprehensive Survey on Empowering Edge Devices

Journal Article Qeios · June 18, 2025
Generative Artificial Intelligence (GenAI) applies models and algorithms such as Large Language Models (LLMs) and Foundation Models (FMs) to generate new data. GenAI, as a promising approach, enables advanced capabilities in various applications, i ...

TMCSpeech: A Chinese TV and Movie Speech Dataset with Character Descriptions and a Character-Based Voice Generation Model

Conference Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · January 1, 2025
Recent research on text-guided speech synthesis has sparked considerable interest. This study explores the potential of leveraging publicly available internet video data for speech synthesis and character-based new voice generation. We introduce a multi-mo ...

Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm

Conference MM 2024 Proceedings of the 32nd ACM International Conference on Multimedia · October 28, 2024
This research presents Muskits-ESPnet, a versatile toolkit that introduces new paradigms to Singing Voice Synthesis (SVS) through the application of pretrained audio models in both continuous and discrete approaches. Specifically, we explore discrete repre ...

Two-stage and Self-supervised Voice Conversion for Zero-Shot Dysarthric Speech Reconstruction

Conference Proceedings of 2024 International Conference on Asian Language Processing IALP 2024 · January 1, 2024
Dysarthria is a motor speech disorder commonly associated with conditions such as cerebral palsy, Parkinson's disease, amyotrophic lateral sclerosis, and stroke. Individuals with dysarthria typically exhibit significant speech difficulties, including impre ...

Bridging Facial Imagery and Vocal Reality: Stable Diffusion-Enhanced Voice Generation

Conference 2024 14th International Symposium on Chinese Spoken Language Processing ISCSLP 2024 · January 1, 2024
Generating novel voices in speech synthesis is a challenging task with potential for creating versatile voices that are needed in entertainment and research. One of the primary obstacles in this area is the lack of well-annotated voice descriptions for exp ...

BiSinger: Bilingual Singing Voice Synthesis

Conference 2023 IEEE Automatic Speech Recognition and Understanding Workshop ASRU 2023 · January 1, 2023
Although Singing Voice Synthesis (SVS) has made great strides with Text-to-Speech (TTS) techniques, multilingual singing voice modeling remains relatively unexplored. This paper presents BiSinger, a bilingual pop SVS system for English and Chinese Mandarin ...

In Situ Atomic Force Microscopy Tracking of Nanoparticle Migration in Semicrystalline Polymers

Journal Article ACS Macro Letters · June 2022
We present in situ tracking of silica nanoparticle (NP) migration from a poly(ethylene oxide) (PEO) melt into the interlamellar region using in situ atomic force microscopy (AFM). Our results confirm the previous hypothesis that NPs migrate into ...