MPSSD: Multi-Path Fusion Single Shot Detector
Prevalent one-stage detectors, such as the single shot detector (SSD) and RetinaNet, detect objects faster than two-stage detectors while maintaining comparable accuracy. To further improve accuracy, many studies focus on enhancing the multi-scale feature pyramid. Most of these proposals strengthen the features of a single pyramid and ignore the rich connections among features at different scales. In contrast, we propose a novel multi-path design that fully utilizes localization and semantic information. First, we take the original SSD multi-scale features as our base pyramid. Then, we fuse these features in different groups to generate multi-path feature pyramids. Finally, we combine these pyramids through a novel and effective aggregation module to obtain the final informative pyramid for detection. Comparative experiments on the PASCAL VOC and MS COCO benchmarks show that the proposed method outperforms many state-of-the-art detectors. For example, with a 512×512 input image, it achieves a mean Average Precision (mAP) of 81.8% on the VOC2007 test set and 33.1% mAP on COCO test-dev2015.
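The three steps above (base pyramid, grouped fusion into several paths, aggregation) can be illustrated with a minimal NumPy sketch. It is not the MPSSD implementation: the abstract does not specify how groups are formed, how features are fused, or how the aggregation module works. The sketch therefore assumes that a group is a window of neighbouring levels, that fusion is nearest-neighbour resizing followed by averaging, that aggregation is a plain per-level average, and that all levels share one channel count. All function names and the `radii` parameter are hypothetical.

```python
import numpy as np


def resize_nearest(feat, size):
    """Nearest-neighbour resize of a (C, H, W) map to (C, size, size)."""
    _, h, w = feat.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return feat[:, rows[:, None], cols[None, :]]


def fuse_group(pyramid, level, group):
    """Fuse the levels in `group` at the resolution of `level`.

    Assumption: fusion is a simple average of the resized maps.
    """
    size = pyramid[level].shape[-1]
    return np.mean([resize_nearest(pyramid[i], size) for i in group], axis=0)


def build_path(pyramid, radius):
    """Build one path: each level is fused with neighbours within `radius`.

    Assumption: a "group" is a window of adjacent pyramid levels.
    """
    n = len(pyramid)
    return [
        fuse_group(pyramid, lvl, range(max(0, lvl - radius), min(n, lvl + radius + 1)))
        for lvl in range(n)
    ]


def aggregate(paths):
    """Combine all paths level by level.

    Assumption: a plain average stands in for the aggregation module.
    """
    return [np.mean(level_maps, axis=0) for level_maps in zip(*paths)]


def multi_path_pyramid(pyramid, radii=(1, 2)):
    """Base pyramid plus one fused path per radius, then aggregation."""
    paths = [pyramid] + [build_path(pyramid, r) for r in radii]
    return aggregate(paths)


# Toy base pyramid: 4 channels, spatial sizes 8, 4, 2, 1 (illustrative only).
rng = np.random.default_rng(0)
base = [rng.standard_normal((4, s, s)) for s in (8, 4, 2, 1)]
final = multi_path_pyramid(base)
print([f.shape for f in final])
```

The final pyramid keeps the shape of the base pyramid at every level, so in this sketch it could be passed to the same per-level detection heads as the original SSD features. In a trained detector the averaging steps would be replaced by learned layers.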