
Speaker Anonymization Using Orthogonal Householder Neural Network

Publication, Journal Article
Miao, X; Wang, X; Cooper, E; Yamagishi, J; Tomashenko, N
Published in: IEEE ACM Transactions on Audio Speech and Language Processing
January 1, 2023

Speaker anonymization aims to conceal a speaker's identity while preserving content information in speech. Current mainstream neural-network speaker anonymization systems disentangle speech into prosody-related, content, and speaker representations. The speaker representation is then anonymized by a selection-based speaker anonymizer that uses a mean vector over a set of randomly selected speaker vectors from an external pool of English speakers. However, the resulting anonymized vectors are subject to severe privacy leakage against powerful attackers, reduction in speaker diversity, and language mismatch problems for unseen-language speaker anonymization. To generate diverse, language-neutral speaker vectors, this article proposes an anonymizer based on an orthogonal Householder neural network (OHNN). Specifically, the OHNN acts like a rotation to transform the original speaker vectors into anonymized speaker vectors, which are constrained to follow the distribution over the original speaker vector space. A basic classification loss is introduced to ensure that anonymized speaker vectors from different speakers have unique speaker identities. To further protect speaker identities, an improved classification loss and similarity loss are used to push original-anonymized sample pairs away from each other. Experiments on VoicePrivacy Challenge datasets in English and the AISHELL-3 dataset in Mandarin demonstrate the proposed anonymizer's effectiveness.
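The rotation-like behavior described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the reflection vectors below are random stand-ins for what would be learned parameters, and the names `householder_reflect`, `orthogonal_transform`, and the dimension 8 are assumptions made for illustration. It relies only on the standard fact that a Householder reflection H = I - 2vv^T/(v^T v) is orthogonal, so any product of such reflections preserves vector norms and inner products, and an even number of them has determinant +1.

```python
import math
import random


def householder_reflect(x, v):
    """Apply H = I - 2 v v^T / (v^T v) to x without forming the matrix H."""
    vv = sum(c * c for c in v)
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / vv
    return [a - scale * b for a, b in zip(x, v)]


def orthogonal_transform(x, reflectors):
    """Compose several Householder reflections; the product is orthogonal."""
    for v in reflectors:
        x = householder_reflect(x, v)
    return x


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def norm(x):
    return math.sqrt(dot(x, x))


rng = random.Random(0)
dim = 8  # hypothetical speaker-vector dimension, chosen small for readability

# Random stand-ins for learned reflection vectors (assumption, not from the paper).
# An even count gives determinant +1, i.e. a proper rotation.
reflectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]

original_a = [rng.gauss(0, 1) for _ in range(dim)]
original_b = [rng.gauss(0, 1) for _ in range(dim)]

anonymized_a = orthogonal_transform(original_a, reflectors)
anonymized_b = orthogonal_transform(original_b, reflectors)

# Norms and pairwise inner products are unchanged by the transform.
print(abs(norm(anonymized_a) - norm(original_a)) < 1e-9)
print(abs(dot(anonymized_a, anonymized_b) - dot(original_a, original_b)) < 1e-9)
```

Because the transform preserves lengths and angles, the geometry of the set of speaker vectors carries over to the transformed set, which is one way to read the abstract's constraint that anonymized vectors follow the distribution of the original space. How far each anonymized vector moves from its original depends on the reflection vectors, which in the paper are shaped by the classification and similarity losses rather than drawn at random.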


Published In

IEEE ACM Transactions on Audio Speech and Language Processing

DOI

10.1109/TASLP.2023.3313429

EISSN

2329-9304

ISSN

2329-9290

Publication Date

January 1, 2023

Volume

31

Start / End Page

3681 / 3695
 

Citation

APA
Miao, X., Wang, X., Cooper, E., Yamagishi, J., & Tomashenko, N. (2023). Speaker Anonymization Using Orthogonal Householder Neural Network. IEEE ACM Transactions on Audio Speech and Language Processing, 31, 3681–3695. https://doi.org/10.1109/TASLP.2023.3313429

Chicago
Miao, X., X. Wang, E. Cooper, J. Yamagishi, and N. Tomashenko. “Speaker Anonymization Using Orthogonal Householder Neural Network.” IEEE ACM Transactions on Audio Speech and Language Processing 31 (January 1, 2023): 3681–95. https://doi.org/10.1109/TASLP.2023.3313429.

ICMJE
Miao X, Wang X, Cooper E, Yamagishi J, Tomashenko N. Speaker Anonymization Using Orthogonal Householder Neural Network. IEEE ACM Transactions on Audio Speech and Language Processing. 2023 Jan 1;31:3681–95.

MLA
Miao, X., et al. “Speaker Anonymization Using Orthogonal Householder Neural Network.” IEEE ACM Transactions on Audio Speech and Language Processing, vol. 31, Jan. 2023, pp. 3681–95. Scopus, doi:10.1109/TASLP.2023.3313429.

NLM
Miao X, Wang X, Cooper E, Yamagishi J, Tomashenko N. Speaker Anonymization Using Orthogonal Householder Neural Network. IEEE ACM Transactions on Audio Speech and Language Processing. 2023 Jan 1;31:3681–3695.
