Demystifying dropout
Dropout is a widely used technique for alleviating overfitting when training large-scale deep neural networks. Numerous works have tried to explain why it helps, each from a different perspective. In this paper, we take a new perspective: we disentangle the forward and backward passes of dropout and find that the two passes need different levels of noise to improve the generalization performance of deep neural networks. Based on this observation, we propose augmented dropout, which employs different dropping strategies in the forward and backward passes, to improve on standard dropout. Experimental results verify the effectiveness of the proposed method.
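
To make the idea of disentangling the two passes concrete, the following is a minimal sketch rather than the proposed algorithm itself. Standard dropout reuses the forward mask when propagating gradients; the sketch instead draws a separate mask for each pass, each with its own drop rate. The names `sample_mask`, `decoupled_dropout`, `p_forward`, and `p_backward` are hypothetical, and the choice of independent Bernoulli masks with inverted-dropout scaling is an assumption made only for illustration.

```python
import random


def sample_mask(n, drop_rate, rng):
    """Return an inverted-dropout mask of length n.

    Each entry is 0.0 with probability drop_rate and 1 / (1 - drop_rate)
    otherwise, so the mask has expectation 1 per element.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep = 1.0 - drop_rate
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(n)]


def decoupled_dropout(x, grad_out, p_forward, p_backward, seed=0):
    """Apply one noise level to activations and another to gradients.

    ASSUMPTION (illustrative only): the two masks are sampled independently.
    Standard dropout would instead reuse the forward mask for the gradient.
    """
    if len(x) != len(grad_out):
        raise ValueError("x and grad_out must have the same length")
    rng = random.Random(seed)
    fwd_mask = sample_mask(len(x), p_forward, rng)   # noise in the forward pass
    bwd_mask = sample_mask(len(x), p_backward, rng)  # noise in the backward pass
    y = [xi * m for xi, m in zip(x, fwd_mask)]
    grad_in = [g * m for g, m in zip(grad_out, bwd_mask)]
    return y, grad_in


if __name__ == "__main__":
    activations = [0.5, -1.0, 2.0, 0.25]
    upstream_grad = [1.0, 1.0, 1.0, 1.0]
    y, grad_in = decoupled_dropout(activations, upstream_grad,
                                   p_forward=0.5, p_backward=0.1)
    print("forward output:", y)
    print("input gradient:", grad_in)
```

Setting both rates to zero recovers the identity in both passes, which gives a simple sanity check. In an automatic-differentiation framework, the same separation would typically be expressed as a custom operation with its own backward rule.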