Attention Mixture Network for Crowd Counting via Binarization Transfer
Crowd counting aims to estimate the number of people in an image of a crowd. In recent years, the field has advanced steadily, driven in part by attention mechanisms. However, existing methods have focused on either binary or non-binary attention maps in isolation. A binary attention map improves model performance by separating the crowd from a complex background, whereas a non-binary attention map captures the density gradient within the crowd region. To exploit both attention maps together, we propose a novel Binarization Transfer Module (BTM), which handles the binarization process during network training, and Attention Mixture Net (AMNet), which is built on BTM. The distinguishing feature of AMNet is that it uses the binary and non-binary attention maps simultaneously and in a coordinated manner, while the BTM mitigates interference from cluttered backgrounds. We evaluate our method on four popular crowd-counting datasets, namely ShanghaiTech (PartA and PartB), UCF_CC_50, WorldExpo'10, and UCF-QNRF, on which AMNet achieves significant improvements in counting accuracy and outperforms state-of-the-art methods.
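To make the distinction between the two kinds of attention map concrete, the following is a minimal illustrative sketch. It is not the BTM or the AMNet architecture, neither of which is specified here. The toy density map, the threshold, the peak normalization, and the `mix` combination rule are all assumptions chosen for illustration only.

```python
def binary_attention(density, threshold=0.0):
    """Binary map: 1 where density exceeds the (assumed) threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in density]


def soft_attention(density):
    """Non-binary map: density scaled by its peak, so values lie in [0, 1]."""
    peak = max(max(row) for row in density)
    if peak == 0:
        return [[0.0 for _ in row] for row in density]
    return [[v / peak for v in row] for row in density]


def mix(features, binary_map, soft_map):
    """Hypothetical mixture rule (an assumption, not the paper's method):
    the binary map zeroes out background, the soft map boosts dense regions."""
    return [
        [f * b * (1.0 + s) for f, b, s in zip(f_row, b_row, s_row)]
        for f_row, b_row, s_row in zip(features, binary_map, soft_map)
    ]


# Toy 2x3 density map: zeros are background, larger values are denser crowd.
density = [[0.0, 0.0, 0.5],
           [0.0, 1.0, 2.0]]
features = [[1.0, 1.0, 1.0],
            [1.0, 1.0, 1.0]]

b = binary_attention(density)   # separates crowd from background
s = soft_attention(density)     # preserves the density gradient
m = mix(features, b, s)         # uses both maps together

print(b)
print(s)
print(m)
```

In this toy setting, the binary map only says where the crowd is, while the non-binary map also encodes how dense it is; combining them suppresses background responses without discarding the gradient inside the crowd region.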