ESSENCE: Exploiting Structured Stochastic Gradient Pruning for Endurance-Aware ReRAM-Based In-Memory Training Systems
Processing-in-memory (PIM) enables energy-efficient deployment of convolutional neural networks (CNNs) from edge to cloud. Resistive random-access memory (ReRAM) is one of the most commonly used technologies for PIM architectures. A primary limitation of ReRAM-based PIM for neural network training is the device's limited write endurance, which frequent weight updates quickly exhaust. To make ReRAM-based architectures viable for CNN training, this endurance issue must be addressed. This work aims to reduce the number of weight reprogramming operations without compromising final model accuracy. We propose the ESSENCE framework with an endurance-aware structured stochastic gradient pruning method, which dynamically adjusts the probability of each gradient update based on the current update counts. Experimental results with multiple CNNs and datasets demonstrate that the proposed method extends ReRAM lifetime during training. For instance, with the ResNet20 network and the CIFAR-10 dataset, ESSENCE reduces the mean update count by up to 10.29× compared to the stochastic gradient descent method and effectively reduces the maximum update count compared with the No Endurance method. Furthermore, an aggressive tuning method based on ESSENCE can boost the mean update count savings to up to 14.41×.
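To make the idea concrete, the following is a minimal sketch of endurance-aware structured stochastic gradient pruning. It assumes per-row update groups, a count-inverse probability rule, and 1/p gradient rescaling; these specific choices (function name, scaling formula, group granularity) are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def endurance_aware_update(W, grad, counts, lr=0.01, base_p=1.0):
    """Apply a structured stochastic gradient update to weight matrix W.

    Each row of W stands for one group of ReRAM cells; `counts` tracks how
    many times each row has been reprogrammed. Rows with higher update
    counts are updated with lower probability, spreading wear across the
    array. The probability rule below is an illustrative assumption.
    """
    # Probability of updating row i shrinks as its count exceeds the mean.
    mean_count = counts.mean() + 1e-8
    p = np.clip(base_p * mean_count / (counts + 1e-8), 0.0, 1.0)
    mask = rng.random(len(p)) < p          # Bernoulli draw per row
    # Rescale surviving gradients by 1/p so the update stays unbiased.
    scale = np.where(mask, 1.0 / np.maximum(p, 1e-8), 0.0)
    W -= lr * grad * scale[:, None]
    counts += mask.astype(counts.dtype)    # one reprogram per updated row
    return W, counts

# Toy usage: a 4x3 weight matrix trained for a few steps with random gradients.
W = rng.standard_normal((4, 3))
counts = np.zeros(4)
for _ in range(100):
    grad = rng.standard_normal((4, 3))
    W, counts = endurance_aware_update(W, grad, counts)
```

Rows that fall behind in update count see their probability rise above the mean-scaled baseline, so wear is balanced rather than merely reduced, which is what distinguishes an endurance-aware rule from plain random gradient dropping.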
Related Subject Headings
- Computer Hardware & Architecture
- 4607 Graphics, augmented reality and games
- 4009 Electronics, sensors and digital hardware
- 1006 Computer Hardware
- 0906 Electrical and Electronic Engineering