
FP-SMR: A Fully Digital Floating-Point Processing-in-SAS-MRAM for Session-based Recommender System

Publication, Conference
Ali, AH; Sridharan, A; Guo, C; Hwang, W; Tsai, W; Zhang, J; Chen, Y; Wang, SX; Fan, D
Published in: Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI)
June 29, 2025

With the rapid advancement of deep neural networks (DNNs), numerous processing-in-memory (PIM) architectures based on various memory technologies, both non-volatile memory (NVM) and volatile memory, have been developed to accelerate AI workloads. Among NVMs, magnetic random access memory (MRAM) is highly promising due to its zero standby leakage, fast read/write speeds, CMOS compatibility, and high memory density. However, existing MRAM technologies, such as spin-transfer torque MRAM (STT-MRAM) and spin-orbit torque MRAM (SOT-MRAM), have inherent limitations: STT-MRAM requires a high write current, while SOT-MRAM incurs significant area overhead due to additional access transistors. The new STT-assisted-SOT (SAS) MRAM provides an area-efficient alternative by sharing one write access transistor among multiple magnetic tunnel junctions (MTJs). This work presents the first fully digital processing-in-SAS-MRAM system to enable 8-bit floating-point (FP8) neural network inference, with an application to an on-device session-based recommender system. A SAS-MRAM device prototype is fabricated with 4 MTJs sharing the same SOT metal line. The proposed SAS-MRAM-based PIM macro is designed in TSMC 28 nm technology. It achieves 15.31 TOPS/W energy efficiency and 269 GOPS performance for FP8 operations at 700 MHz. Compared to state-of-the-art recommender systems on the same popular YooChoose dataset, it demonstrates 86×, 1.8×, and 1.12× higher energy efficiency than GPU, SRAM-PIM, and ReRAM-PIM, respectively.
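The headline figures in the abstract can be related to one another with simple arithmetic. The sketch below is illustrative only: it assumes the 15.31 TOPS/W and 269 GOPS figures describe the same operating point at 700 MHz, which the abstract does not state explicitly, and the function names are hypothetical rather than taken from the paper.

```python
# Figures quoted in the abstract.
EFFICIENCY_TOPS_PER_W = 15.31  # FP8 energy efficiency
THROUGHPUT_GOPS = 269.0        # FP8 throughput
CLOCK_MHZ = 700.0              # operating frequency


def implied_power_mw(throughput_gops: float, efficiency_tops_per_w: float) -> float:
    """Power in mW implied by throughput / efficiency.

    Assumption: both figures were measured at the same operating point.
    """
    ops_per_second = throughput_gops * 1e9
    ops_per_joule = efficiency_tops_per_w * 1e12
    return ops_per_second / ops_per_joule * 1e3


def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Average number of operations completed per clock cycle."""
    return (throughput_gops * 1e9) / (clock_mhz * 1e6)


def energy_per_op_fj(efficiency_tops_per_w: float) -> float:
    """Energy per operation in femtojoules (reciprocal of ops per joule)."""
    return 1e15 / (efficiency_tops_per_w * 1e12)


if __name__ == "__main__":
    print(f"implied power: {implied_power_mw(THROUGHPUT_GOPS, EFFICIENCY_TOPS_PER_W):.2f} mW")
    print(f"ops per cycle: {ops_per_cycle(THROUGHPUT_GOPS, CLOCK_MHZ):.1f}")
    print(f"energy per op: {energy_per_op_fj(EFFICIENCY_TOPS_PER_W):.1f} fJ")
```

Under that assumption, the quoted numbers correspond to roughly 17.6 mW of power, about 384 operations per cycle, and about 65 fJ per FP8 operation. These are derived values, not results reported by the authors.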


Published In: Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI)
DOI: 10.1145/3716368.3735206
Publication Date: June 29, 2025
Start / End Page: 341 / 347

Citation

APA: Ali, A. H., Sridharan, A., Guo, C., Hwang, W., Tsai, W., Zhang, J., … Fan, D. (2025). FP-SMR: A Fully Digital Floating-Point Processing-in-SAS-MRAM for Session-based Recommender System. In Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI) (pp. 341–347). https://doi.org/10.1145/3716368.3735206
Chicago: Ali, A. H., A. Sridharan, C. Guo, W. Hwang, W. Tsai, J. Zhang, Y. Chen, S. X. Wang, and D. Fan. “FP-SMR: A Fully Digital Floating-Point Processing-in-SAS-MRAM for Session-based Recommender System.” In Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI), 341–47, 2025. https://doi.org/10.1145/3716368.3735206.
ICMJE: Ali AH, Sridharan A, Guo C, Hwang W, Tsai W, Zhang J, et al. FP-SMR: A Fully Digital Floating-Point Processing-in-SAS-MRAM for Session-based Recommender System. In: Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI). 2025. p. 341–7.
MLA: Ali, A. H., et al. “FP-SMR: A Fully Digital Floating-Point Processing-in-SAS-MRAM for Session-based Recommender System.” Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI), 2025, pp. 341–47. Scopus, doi:10.1145/3716368.3735206.
NLM: Ali AH, Sridharan A, Guo C, Hwang W, Tsai W, Zhang J, Chen Y, Wang SX, Fan D. FP-SMR: A Fully Digital Floating-Point Processing-in-SAS-MRAM for Session-based Recommender System. Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI). 2025. p. 341–347.
