Conditional Transferring Features: Scaling GANs to Thousands of Classes with 30% Less High-Quality Training Data
Generative adversarial networks (GANs) can greatly improve the quality of unsupervised image generation. Previous GAN-based methods often require a large amount of high-quality training data. This work aims to reduce the amount of high-quality data needed for training while scaling GANs up to thousands of classes. We propose an image generation method based on conditional transferring features, which capture pixel-level semantic changes that occur when low-quality images are transformed into high-quality ones. Self-supervised learning is then integrated into our GAN architecture to provide additional label-free semantic supervision derived from the training data. As a result, training our GAN architecture requires far fewer high-quality images, supplemented by only a small number of additional low-quality images. Experiments show that even when 30% of the high-quality images are removed from the training set, our method still achieves better image synthesis quality on CIFAR-10, STL-10, ImageNet, and CASIA-HWDB1.0 than previous competitive methods. Experiments on ImageNet, with 1,000 classes of images, and CASIA-HWDB1.0, with 3,755 classes of handwritten Chinese characters, further validate the scalability of our method with respect to the number of object classes. Ablation studies confirm the contribution of both the conditional transferring features and the self-supervised learning to the quality of our synthesized images.
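The abstract does not specify which self-supervision task is integrated into the discriminator. As a purely illustrative sketch, the snippet below shows one common choice from the GAN literature, rotation prediction: an auxiliary head on the discriminator predicts the rotation applied to each input, yielding a label-free supervisory signal. The network sizes and the `Discriminator` class here are hypothetical placeholders, not the authors' architecture.

```python
# Minimal sketch: a GAN discriminator with an auxiliary self-supervision head.
# Rotation prediction is an assumed task for illustration only; the paper's
# actual self-supervision objective may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Shared convolutional feature extractor (illustrative sizes).
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, feat_dim, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.adv_head = nn.Linear(feat_dim, 1)   # real/fake logit
        self.rot_head = nn.Linear(feat_dim, 4)   # logits for 0/90/180/270 deg

    def forward(self, x: torch.Tensor):
        h = self.features(x)
        return self.adv_head(h), self.rot_head(h)

def self_supervised_loss(disc: Discriminator, x: torch.Tensor) -> torch.Tensor:
    """Rotate each image by a random multiple of 90 degrees and train the
    auxiliary head to predict the rotation; no human labels are required."""
    k = torch.randint(0, 4, (x.size(0),), device=x.device)
    rotated = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                           for img, r in zip(x, k)])
    _, rot_logits = disc(rotated)
    return F.cross_entropy(rot_logits, k)
```

In such a setup, the self-supervised loss would typically be added to the adversarial loss with a weighting coefficient, so the discriminator learns semantically meaningful features even from unlabeled (and lower-quality) images.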