Toward Unified Data and Algorithm Fairness via Adversarial Data Augmentation and Adaptive Model Fine-tuning
Algorithmic fairness on biased data has attracted growing research interest, and a variety of pre-, in-, and post-processing methods have been designed for this problem. However, these methods target either data unfairness or algorithmic unfairness exclusively. In this paper, we propose a novel intra-processing method that simultaneously addresses both bias sources, broadening the application scenarios of fairness methods. Since training modern deep models from scratch is expensive, owing to enormous training data and complicated architectures, we instead propose an augmentation-and-fine-tuning framework. First, we design an adversarial attack to generate weighted samples disentangled from the protected attribute. Next, we identify the fair sub-structure in the biased model and fine-tune the model via weight reactivation. Finally, we provide an optional joint training scheme for the augmentation and the fine-tuning. Our method can be combined with a variety of fairness measures. We benchmark our method against related baselines to demonstrate its advantages and scalability. Experimental results on several standard datasets show that our approach effectively learns fair augmentations and achieves superior results compared to state-of-the-art baselines. Our method also generalizes well to different types of data.
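To make the two-step pipeline summarized above concrete, below is a minimal PyTorch-style sketch of one plausible instantiation: a PGD-style adversarial perturbation that pushes samples away from being predictive of the protected attribute (yielding per-sample weights), followed by fine-tuning in which gradients are masked so that only a "reactivated" subset of the biased model's weights is updated. The function names, the PGD-style attack, the loss-based weighting, and the gradient-masking scheme are all illustrative assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def adversarial_augment(x, a, adversary, eps=0.1, alpha=0.02, steps=5):
    """Step 1 (sketch): perturb x within an eps-ball to MAXIMIZE the
    adversary's loss in predicting the protected attribute a, so the
    augmented samples are approximately disentangled from a."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(adversary(x_adv), a)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()        # ascend adversary loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back to eps-ball
    with torch.no_grad():
        # Weight each sample by how uninformative it now is about a:
        # higher adversary loss -> more disentangled -> larger weight.
        w = F.cross_entropy(adversary(x_adv), a, reduction="none")
        w = w / w.sum()
    return x_adv.detach(), w

def finetune_reactivated(model, masks, loader, lr=1e-4, epochs=1):
    """Step 2 (sketch): keep the identified fair sub-structure frozen and
    update only the parameters selected by the boolean reactivation masks."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x_aug, y, w in loader:  # loader yields augmented samples + weights
            loss = (w * F.cross_entropy(model(x_aug), y, reduction="none")).sum()
            opt.zero_grad()
            loss.backward()
            # Zero gradients outside the reactivation mask so that only
            # the masked weights move; the rest of the model stays fixed.
            for p, m in zip(model.parameters(), masks):
                if p.grad is not None:
                    p.grad.mul_(m)
            opt.step()
```

The optional joint scheme mentioned in the abstract would simply alternate these two steps each epoch, regenerating the augmented samples as the fine-tuned model (and hence the adversary's view of it) evolves.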