Scholars@Duke publication: Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion

Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion

Publication, Conference

Hua, H; Chen, Z; Zhang, Y; Li, M; Zhang, P

Published in: DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia

October 10, 2022

Audio deep synthesis techniques can now generate high-quality speech whose authenticity is difficult for humans to judge. Meanwhile, many anti-spoofing systems have been developed to capture artifacts in synthesized speech that are imperceptible to human hearing, and so a continuously escalating race of 'attacking and defending' in voice deepfakes has begun. To further improve the probability of fooling anti-spoofing systems, we propose a fully end-to-end, any-to-many voice conversion method based on a non-autoregressive structure, with the addition of two lightweight but effective post-processing strategies, namely silence replacement and global noise perturbation. Experimental results show that the proposed method outperforms current baselines in fooling several state-of-the-art anti-spoofing systems. It also achieves better naturalness and speaker similarity, giving it high deception performance against human listeners as well.

Duke Scholars

Author: Ming Li (DKU Faculty)

Published In

DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia

DOI

10.1145/3552466.3556532

Publication Date

October 10, 2022

Start / End Page

93 / 100

Citation

APA: Hua, H., Chen, Z., Zhang, Y., Li, M., & Zhang, P. (2022). Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion. In DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia (pp. 93–100). https://doi.org/10.1145/3552466.3556532

Chicago: Hua, H., Z. Chen, Y. Zhang, M. Li, and P. Zhang. “Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion.” In DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 93–100, 2022. https://doi.org/10.1145/3552466.3556532.

ICMJE: Hua H, Chen Z, Zhang Y, Li M, Zhang P. Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion. In: DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia. 2022. p. 93–100.

MLA: Hua, H., et al. “Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion.” DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022, pp. 93–100. Scopus, doi:10.1145/3552466.3556532.

NLM: Hua H, Chen Z, Zhang Y, Li M, Zhang P. Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion. DDAM 2022 Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia. 2022. p. 93–100.
