A high utilization FPGA-based accelerator for variable-scale convolutional neural network

Conference Paper

Convolutional neural networks (CNNs) play an essential role in computer vision applications because of their high classification accuracy and robust generalization capability. In recent years, various GPU-based and application-specific hardware approaches have been proposed to accelerate CNN computations. However, for variable-scale CNNs, on-chip DSP utilization remains low because of image boundary effects. In this paper, we propose an optimization framework that solves the boundary problem, and we connect our accelerator to ARM processors and DDR4 memory through a dual Advanced eXtensible Interface (AXI) bus. Each port is capable of a peak throughput of 1.6 GB/s in full duplex. The accelerator can perform 160 G-op/s at peak and achieves 96% computing resource utilization.
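The boundary problem mentioned in the abstract can be illustrated with a small sketch. It is not the paper's framework: it assumes a hypothetical accelerator that covers each feature map with fixed-size tiles of compute units, so any tile that hangs over the image edge leaves some units idle. The function name, the 8 x 8 tile size, and the feature-map sizes are all illustrative assumptions.

```python
import math


def tile_utilization(height, width, tile_h, tile_w):
    """Fraction of allocated compute slots doing useful work when a
    height x width feature map is covered by fixed tile_h x tile_w tiles.

    Hypothetical model for illustration only, not the paper's method.
    """
    tiles_h = math.ceil(height / tile_h)  # tiles needed along the rows
    tiles_w = math.ceil(width / tile_w)   # tiles needed along the columns
    useful = height * width               # output pixels that actually exist
    allocated = tiles_h * tile_h * tiles_w * tile_w  # slots reserved, padding included
    return useful / allocated


# Feature-map sizes typical of a CNN whose layers shrink the image (assumed values).
for size in (224, 56, 28, 14, 7):
    print(size, round(tile_utilization(size, size, 8, 8), 3))
```

Under this model, layers whose dimensions are multiples of the tile size keep every unit busy, while the others waste the slots that fall beyond the image edge. This is the kind of loss a boundary-aware scheme would aim to recover.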

Cited Authors

  • Li, X; Cai, Y; Han, J; Zeng, X

Published Date

  • July 1, 2017

Volume / Issue

  • 2017-October /

Start / End Page

  • 944 - 947

Electronic International Standard Serial Number (EISSN)

  • 2162-755X

International Standard Serial Number (ISSN)

  • 2162-7541

International Standard Book Number 13 (ISBN-13)

  • 9781509066247

Digital Object Identifier (DOI)

  • 10.1109/ASICON.2017.8252633

Citation Source

  • Scopus