How to obtain and run light and efficient deep learning networks

Published

Conference Paper

© 2019 IEEE. As the model size of deep neural networks (DNNs) grows in pursuit of better performance, the associated increase in training and inference cost makes it extremely difficult to deploy DNNs on end/edge devices with limited resources while still meeting response-time requirements. To address this challenge, model compression, which shrinks model size and thereby reduces computation cost, has been widely adopted in the deep learning community. However, these algorithm-level solutions often ignore the practical impacts of hardware design, such as the increase in random accesses to the memory hierarchy and the constraints of memory capacity. On the other hand, limited understanding of algorithm-level computational needs may lead to unrealistic assumptions during hardware design. In this work, we discuss this mismatch and show how our approach addresses it through an interactive design practice spanning both the software and hardware levels.
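
The abstract names model compression as the standard algorithm-level remedy. As a hedged illustration of the general idea (and not the specific technique proposed in this paper), the sketch below applies unstructured magnitude pruning to a toy PyTorch model; `magnitude_prune` and its `sparsity` parameter are illustrative names introduced here, not from the source.

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> nn.Module:
    """Zero out the smallest-magnitude weights in Linear/Conv layers.

    Illustrative sketch of generic unstructured pruning, not the
    paper's method. `sparsity` is the fraction of weights removed.
    """
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            w = module.weight.data
            k = int(sparsity * w.numel())
            if k == 0:
                continue
            # Threshold = magnitude of the k-th smallest weight.
            threshold = w.abs().flatten().kthvalue(k).values
            # Keep only weights strictly above the threshold.
            mask = (w.abs() > threshold).to(w.dtype)
            module.weight.data.mul_(mask)
    return model

# Usage: prune a toy MLP and check the resulting weight sparsity.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
magnitude_prune(model, sparsity=0.9)
total = sum(p.numel() for p in model.parameters() if p.dim() > 1)
zeros = sum((p == 0).sum().item() for p in model.parameters() if p.dim() > 1)
print(f"weight sparsity: {zeros / total:.2%}")
```

Note that the zeros this produces are scattered irregularly through the weight tensors, which is precisely the kind of random memory-access pattern the abstract flags as a hardware-level cost that algorithm-level compression tends to overlook.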

Cited Authors

  • Chen, F; Wen, W; Song, L; Zhang, J; Li, HH; Chen, Y

Published Date

  • November 1, 2019

Published In

  • 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

Volume / Issue

  • 2019-November

International Standard Serial Number (ISSN)

  • 1092-3152

International Standard Book Number 13 (ISBN-13)

  • 9781728123509

Digital Object Identifier (DOI)

  • 10.1109/ICCAD45719.2019.8942106

Citation Source

  • Scopus