Scholars@Duke publication: Global Vision Transformer Pruning with Hessian-Aware Saliency

Global Vision Transformer Pruning with Hessian-Aware Saliency

Publication, Conference

Yang, H; Yin, H; Shen, M; Molchanov, P; Li, H; Kautz, J

Published in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

January 1, 2023

Transformers yield state-of-the-art results across many tasks. However, their heuristically designed architectures impose huge computational costs during inference. This work challenges the common design philosophy of the Vision Transformer (ViT) model, in which all the stacked blocks in a model stage share a uniform dimension. We redistribute the parameters both across transformer blocks and between different structures within a block, via the first systematic attempt at global structural pruning. To handle the diverse ViT structural components, we derive a novel Hessian-based structural pruning criterion that is comparable across all layers and structures, with latency-aware regularization for direct latency reduction. Performing iterative pruning on the DeiT-Base model leads to a new architecture family called NViT (Novel ViT), with a novel parameter redistribution that utilizes parameters more efficiently. On ImageNet-1K, NViT-Base achieves a 2.6× FLOPs reduction, 5.1× parameter reduction, and 1.9× run-time speedup over the DeiT-Base model in a near-lossless manner. Smaller NViT variants achieve more than 1% accuracy gain at the same throughput as the DeiT Small/Tiny variants, as well as a lossless 3.3× parameter reduction over the SWIN-Small model. These results outperform prior art by a large margin. Further analysis of the parameter redistribution in NViT shows the high prunability of ViT models, distinct sensitivity within a ViT block, and a unique parameter distribution trend across stacked ViT blocks. These insights support a simple yet effective parameter redistribution rule toward more efficient ViTs, for an off-the-shelf performance boost.
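The idea of ranking every prunable structure on a single global scale, with a latency term in the score, can be illustrated with a minimal sketch. This is not the paper's criterion: it assumes a commonly used stand-in in which the Hessian is approximated by the outer product of gradients, so that the second-order term w^T H w becomes (g^T w)^2. The names Group, saliency, global_prune, latency_cost, and latency_weight are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Group:
    """One prunable structural group, e.g. an attention head or an MLP channel."""
    name: str
    weights: List[float]   # parameters belonging to this group
    grads: List[float]     # loss gradients for those parameters
    latency_cost: float    # hypothetical latency estimate, arbitrary units


def saliency(group: Group) -> float:
    # Assumed proxy, not the paper's derivation: with H ~ g g^T,
    # the second-order term w^T H w reduces to (g . w)^2.
    dot = sum(g * w for g, w in zip(group.grads, group.weights))
    return dot ** 2


def global_prune(groups: List[Group], num_to_prune: int,
                 latency_weight: float = 0.0) -> List[str]:
    # All groups compete in one ranking, regardless of layer or structure type.
    # Subtracting a latency term makes costly groups more likely to be pruned.
    ranked = sorted(
        groups,
        key=lambda grp: saliency(grp) - latency_weight * grp.latency_cost,
    )
    return [grp.name for grp in ranked[:num_to_prune]]


# Toy data, invented for illustration only.
example_groups = [
    Group("head_0", [1.0, 2.0], [0.1, 0.1], latency_cost=1.0),
    Group("mlp_ch_0", [0.5, 0.5], [0.2, 0.2], latency_cost=0.1),
    Group("embed_0", [1.0, 1.0], [1.0, 1.0], latency_cost=0.5),
]
pruned_plain = global_prune(example_groups, num_to_prune=1)
pruned_latency = global_prune(example_groups, num_to_prune=1, latency_weight=0.1)
```

In this toy setup the plain ranking removes the group with the smallest saliency, while a nonzero latency_weight shifts the choice toward a group that costs more latency. An iterative scheme would repeat this ranking after each round of fine-tuning.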

Duke Scholars

Author: Hai "Helen" Li, Electrical and Computer Engineering

Published In

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

DOI

10.1109/CVPR52729.2023.01779

ISSN

1063-6919

Publication Date

January 1, 2023

Volume

2023-June

Start / End Page

18547 / 18557

Citation

Yang, H., Yin, H., Shen, M., Molchanov, P., Li, H., & Kautz, J. (2023). Global Vision Transformer Pruning with Hessian-Aware Saliency. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Vol. 2023-June, pp. 18547–18557). https://doi.org/10.1109/CVPR52729.2023.01779
