3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks

Publication: Journal Article
Li, B; Doppa, JR; Pande, PP; Chakrabarty, K; Qiu, JX; Li, HH
Published in: ACM Journal on Emerging Technologies in Computing Systems
January 29, 2020

Deep neural network (DNN) models are being applied to an ever-broader range of applications, and the computational capability of traditional hardware platforms cannot keep pace with the growth in model complexity. Among recent technologies for accelerating DNNs, resistive random-access memory (ReRAM)-based processing-in-memory (PIM) has emerged as a promising solution for DNN inference due to its high efficiency for matrix-based computation. Extending ReRAM-based accelerators to training poses two major technical challenges: (1) full-precision data is essential in back-propagation, and (2) supporting both feed-forward and back-propagation aggravates the data-movement burden. We propose a heterogeneous architecture named 3D-ReG, which leverages a full-precision GPU to ensure training accuracy and low-overhead 3D integration to provide low-cost data movement. Moreover, we introduce conservative and aggressive task-mapping schemes, which partition the computation phases in different ways to balance execution efficiency and training accuracy. We evaluate 3D-ReG implemented with two 3D integration technologies, through-silicon vias (TSVs) and monolithic inter-tier vias (MIVs), and compare them with GPU-only and PIM-only counterparts. We also evaluate GPU-only platforms using two main-memory technologies (DRAM, ReRAM) and three interconnect technologies (2D, TSV, MIV). Experimental results show that 3D-ReG achieves, on average, 5.64× training speedup and 3.56× higher energy efficiency than a GPU with DRAM as main memory, at the cost of a 0.05%-3.39% accuracy drop. We define a new metric, the gain-loss ratio (GLR), which quantitatively evaluates DNN training hardware in terms of model accuracy and hardware efficiency. Our comparison shows that the aggressive task-mapping scheme on MIV-based 3D-ReG outperforms the other approaches.
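
The abstract introduces the gain-loss ratio (GLR) by name only; this page does not reproduce its formula. The sketch below is a rough illustration, not the paper's definition: it assumes GLR grows with the combined hardware gains (here, speedup multiplied by the energy-efficiency gain) and shrinks with the accuracy loss. The function name gain_loss_ratio and the exact combination of terms are assumptions made for this example.

    def gain_loss_ratio(speedup, energy_gain, accuracy_drop_pct):
        """Hypothetical GLR sketch: hardware gain per percentage point of
        accuracy lost. The paper defines the actual metric; this merely
        illustrates the accuracy-vs-efficiency trade-off the abstract
        describes."""
        hardware_gain = speedup * energy_gain  # assumed combination of the two gains
        return hardware_gain / max(accuracy_drop_pct, 1e-9)  # guard against zero loss

    # Using the figures reported in the abstract: 5.64x speedup, 3.56x higher
    # energy efficiency, and the worst-case 3.39% accuracy drop.
    print(gain_loss_ratio(5.64, 3.56, 3.39))  # ~5.92 under these assumptions

Under such a metric, a scheme that trades a small accuracy loss for large speed and energy gains scores highly, which matches the abstract's conclusion that the aggressive mapping on MIV-based 3D-ReG comes out ahead.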


Published In

ACM Journal on Emerging Technologies in Computing Systems

DOI

10.1145/3375699

EISSN

1550-4840

ISSN

1550-4832

Publication Date

January 29, 2020

Volume

16

Issue

2

Related Subject Headings

  • Computer Hardware & Architecture
  • 4606 Distributed computing and systems software
  • 1007 Nanotechnology
  • 1006 Computer Hardware
  • 0906 Electrical and Electronic Engineering
 

Citation

APA: Li, B., Doppa, J. R., Pande, P. P., Chakrabarty, K., Qiu, J. X., & Li, H. H. (2020). 3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks. ACM Journal on Emerging Technologies in Computing Systems, 16(2). https://doi.org/10.1145/3375699

Chicago: Li, B., J. R. Doppa, P. P. Pande, K. Chakrabarty, J. X. Qiu, and H. H. Li. “3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks.” ACM Journal on Emerging Technologies in Computing Systems 16, no. 2 (January 29, 2020). https://doi.org/10.1145/3375699.

ICMJE / NLM: Li B, Doppa JR, Pande PP, Chakrabarty K, Qiu JX, Li HH. 3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks. ACM Journal on Emerging Technologies in Computing Systems. 2020 Jan 29;16(2).

MLA: Li, B., et al. “3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks.” ACM Journal on Emerging Technologies in Computing Systems, vol. 16, no. 2, Jan. 2020. Scopus, doi:10.1145/3375699.
