3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks

Publication, Journal Article

Li, B; Doppa, JR; Pande, PP; Chakrabarty, K; Qiu, JX; Li, HH

Published in: ACM Journal on Emerging Technologies in Computing Systems

January 29, 2020

Deep neural network (DNN) models are being applied to an increasingly broad range of applications, and the computational capability of traditional hardware platforms cannot keep pace with the growth in model complexity. Among recent technologies for accelerating DNNs, resistive memory (ReRAM)-based processing-in-memory (PIM) has emerged as a promising solution for DNN inference because of its high efficiency in matrix-based computation. Extending ReRAM-based accelerators to training poses two major technical challenges: (1) full-precision data is essential in back-propagation; (2) the need to support both feed-forward and back-propagation aggravates the data-movement burden. We propose a heterogeneous architecture named 3D-ReG, which leverages a full-precision GPU to ensure training accuracy and low-overhead 3D integration to provide low-cost data movement. Moreover, we introduce conservative and aggressive task-mapping schemes, which partition the computation phases in different ways to balance execution efficiency and training accuracy. We evaluate 3D-ReG implemented with two 3D integration technologies, through-silicon vias (TSVs) and monolithic inter-tier vias (MIVs), and compare them with GPU-only and PIM-only counterparts. Various GPU-only platforms using two main-memory technologies (DRAM, ReRAM) and three interconnect technologies (2D, TSV, MIV) are evaluated as well. Experimental results show that 3D-ReG achieves an average 5.64× training speedup and 3.56× higher energy efficiency compared with a GPU using DRAM as main memory, at the cost of a 0.05%-3.39% accuracy drop. We define a new metric, gain-loss ratio (GLR), which quantitatively evaluates the capability of DNN training hardware in terms of model accuracy and hardware efficiency. The results of our comparison show that the aggressive task-mapping scheme on MIV-based 3D-ReG outperforms the other methods.

Duke Scholars

Author Hai "Helen" Li Electrical and Computer Engineering

Published In

ACM Journal on Emerging Technologies in Computing Systems

DOI

10.1145/3375699

EISSN

1550-4840

ISSN

1550-4832

Publication Date

January 29, 2020

Volume

16

Issue

2

Related Subject Headings

Computer Hardware & Architecture
4606 Distributed computing and systems software
1007 Nanotechnology
1006 Computer Hardware
0906 Electrical and Electronic Engineering

Citation

APA: Li, B., Doppa, J. R., Pande, P. P., Chakrabarty, K., Qiu, J. X., & Li, H. H. (2020). 3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks. ACM Journal on Emerging Technologies in Computing Systems, 16(2). https://doi.org/10.1145/3375699

Chicago: Li, B., J. R. Doppa, P. P. Pande, K. Chakrabarty, J. X. Qiu, and H. H. Li. “3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks.” ACM Journal on Emerging Technologies in Computing Systems 16, no. 2 (January 29, 2020). https://doi.org/10.1145/3375699.

ICMJE: Li B, Doppa JR, Pande PP, Chakrabarty K, Qiu JX, Li HH. 3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks. ACM Journal on Emerging Technologies in Computing Systems. 2020 Jan 29;16(2).

MLA: Li, B., et al. “3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks.” ACM Journal on Emerging Technologies in Computing Systems, vol. 16, no. 2, Jan. 2020. Scopus, doi:10.1145/3375699.
