Are deep learning models robust to partial object occlusion in visual recognition tasks?
Image classification models, including convolutional neural networks (CNNs), perform well on a variety of classification tasks but struggle when relevant objects are partially occluded. Methods for improving performance under occlusion, including data augmentation, part-based clustering, and more inherently robust architectures such as Vision Transformer (ViT) models, have to some extent been evaluated on their ability to classify objects under partial occlusion. However, evaluations of these methods have largely relied on images containing artificial occlusion, since such images are inexpensive to generate and label. Additionally, these methods are compared to early, now-outdated models and rarely to each other. We contribute the Image Recognition Under Occlusion (IRUO) dataset, based on the OVIS dataset in [1]. IRUO uses real-world and artificially occluded images to test and benchmark leading methods' robustness to partial occlusion in visual recognition tasks. In addition, we contribute the design and results of a human study that uses images from IRUO to evaluate human classification performance at multiple levels and types of occlusion. We find that ViT-based models show higher recognition accuracy than modern CNN-based models, which in turn are more accurate than earlier CNN-based models, but that ViT models are still modestly below human accuracy. We also find that diffuse occlusion, in which relevant objects are seen through "holes" in occluders such as fences and leaves, can greatly reduce the accuracy of deep recognition models, especially CNNs, as compared to humans.
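To make the distinction between a single solid occluder and diffuse occlusion concrete, the following is a minimal sketch of how artificially occluded images might be generated with NumPy. It is not the procedure used to build IRUO; the function names `occlude_patch` and `occlude_diffuse`, the black fill value, and the grid-shaped "fence" are all assumptions made for illustration.

```python
import numpy as np


def occlude_patch(image, fraction, rng):
    """Cover roughly `fraction` of the image area with one solid rectangle.

    Hypothetical helper: returns the occluded copy and a boolean mask
    that is True wherever pixels were hidden.
    """
    h, w = image.shape[:2]
    # Scale both sides by sqrt(fraction) so the covered area is ~fraction.
    side_h = int(round(h * np.sqrt(fraction)))
    side_w = int(round(w * np.sqrt(fraction)))
    top = int(rng.integers(0, h - side_h + 1))
    left = int(rng.integers(0, w - side_w + 1))
    mask = np.zeros((h, w), dtype=bool)
    mask[top:top + side_h, left:left + side_w] = True
    out = image.copy()
    out[mask] = 0  # assumed fill: black occluder
    return out, mask


def occlude_diffuse(image, bar_width, gap):
    """Cover the image with a fence-like grid, leaving square holes.

    Hypothetical helper: bars of `bar_width` pixels repeat every
    `bar_width + gap` pixels along both axes.
    """
    h, w = image.shape[:2]
    period = bar_width + gap
    rows = (np.arange(h) % period) < bar_width
    cols = (np.arange(w) % period) < bar_width
    mask = rows[:, None] | cols[None, :]
    out = image.copy()
    out[mask] = 0  # assumed fill: black occluder
    return out, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = np.ones((64, 64, 3), dtype=np.float32)
    _, patch_mask = occlude_patch(image, 0.25, rng)
    _, fence_mask = occlude_diffuse(image, bar_width=2, gap=6)
    print("patch coverage:", patch_mask.mean())
    print("fence coverage:", fence_mask.mean())
```

Both helpers return the mask alongside the image so that the occluded fraction can be measured and accuracy can be grouped by occlusion level. Note that the two masks can hide a similar share of pixels while distributing it very differently across the object.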
Duke Scholars
Related Subject Headings
- Artificial Intelligence & Image Processing
- 4611 Machine learning
- 4605 Data management and data science
- 4603 Computer vision and multimedia computation
- 0906 Electrical and Electronic Engineering
- 0806 Information Systems
- 0801 Artificial Intelligence and Image Processing