FAIRFIL: CONTRASTIVE NEURAL DEBIASING METHOD FOR PRETRAINED TEXT ENCODERS

Publication, Conference
Cheng, P; Hao, W; Yuan, S; Si, S; Carin, L
Published in: ICLR 2021 - 9th International Conference on Learning Representations
January 1, 2021

Pretrained text encoders, such as BERT, have been applied increasingly in various natural language processing (NLP) tasks, and have recently demonstrated significant performance gains. However, recent studies have demonstrated the existence of social bias in these pretrained NLP models. Although prior works have made progress on word-level debiasing, improving the sentence-level fairness of pretrained encoders remains underexplored. In this paper, we propose the first neural debiasing method for a pretrained sentence encoder, which transforms the pretrained encoder outputs into debiased representations via a fair filter (FairFil) network. To learn the FairFil, we introduce a contrastive learning framework that not only minimizes the correlation between filtered embeddings and bias words but also preserves the rich semantic information of the original sentences. On real-world datasets, our FairFil effectively reduces the degree of bias in pretrained text encoders while maintaining desirable performance on downstream tasks. Moreover, our post hoc method does not require any retraining of the text encoders, which broadens FairFil's range of applications.
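To make the "contrastive learning framework" in the abstract more concrete, here is a minimal, illustrative sketch of one common contrastive objective (InfoNCE) applied to filtered sentence embeddings. It is not the authors' implementation. The abstract does not name the exact loss, so the use of InfoNCE, the one-layer `linear_filter`, cosine similarity, the `temperature` value, and the toy one-hot embeddings are all assumptions made for illustration. The sketch covers only the semantic-preservation side (a filtered sentence should match the filtered version of its own augmentation) and does not attempt the term that reduces correlation with bias words.

```python
import math

def linear_filter(weights, embedding):
    """Hypothetical one-layer filter: returns W @ embedding (stand-in for a filter network)."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]

def cosine(u, v):
    """Cosine similarity between two vectors, with a small epsilon for stability."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i] rather than positives[j != i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        top = max(logits)  # subtract the max before exponentiating for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy data (assumed): three "sentence embeddings" and slightly perturbed "augmentations".
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
originals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
augmented = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]]

filtered_orig = [linear_filter(identity, e) for e in originals]
filtered_aug = [linear_filter(identity, e) for e in augmented]

# Correctly paired views should give a lower loss than deliberately mismatched pairs.
aligned_loss = info_nce(filtered_orig, filtered_aug)
mismatched_loss = info_nce(filtered_orig, filtered_aug[1:] + filtered_aug[:1])
print(aligned_loss < mismatched_loss)
```

In a real setup the filter weights would be trained by gradient descent on such a loss while the pretrained encoder stays frozen, which is consistent with the post hoc, no-retraining property described in the abstract.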

Citation

APA: Cheng, P., Hao, W., Yuan, S., Si, S., & Carin, L. (2021). FAIRFIL: CONTRASTIVE NEURAL DEBIASING METHOD FOR PRETRAINED TEXT ENCODERS. In ICLR 2021 - 9th International Conference on Learning Representations.
Chicago: Cheng, P., W. Hao, S. Yuan, S. Si, and L. Carin. “FAIRFIL: CONTRASTIVE NEURAL DEBIASING METHOD FOR PRETRAINED TEXT ENCODERS.” In ICLR 2021 - 9th International Conference on Learning Representations, 2021.
ICMJE: Cheng P, Hao W, Yuan S, Si S, Carin L. FAIRFIL: CONTRASTIVE NEURAL DEBIASING METHOD FOR PRETRAINED TEXT ENCODERS. In: ICLR 2021 - 9th International Conference on Learning Representations. 2021.
MLA: Cheng, P., et al. “FAIRFIL: CONTRASTIVE NEURAL DEBIASING METHOD FOR PRETRAINED TEXT ENCODERS.” ICLR 2021 - 9th International Conference on Learning Representations, 2021.
NLM: Cheng P, Hao W, Yuan S, Si S, Carin L. FAIRFIL: CONTRASTIVE NEURAL DEBIASING METHOD FOR PRETRAINED TEXT ENCODERS. ICLR 2021 - 9th International Conference on Learning Representations. 2021.
