EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference

Publication, Journal Article
Wang, Y; Li, S; Zheng, Q; Chang, A; Li, H; Chen, Y
Published in: ACM Transactions on Embedded Computing Systems
September 9, 2023

Recommendation systems have been widely embedded into many Internet services. For example, Meta's deep learning recommendation model (DLRM) shows high predictive accuracy of click-through rate in processing large-scale embedding tables. The SparseLengthSum (SLS) kernel dominates the inference time of the DLRM due to intensive irregular memory accesses to the embedding vectors. Some prior works directly adopt near data processing (NDP) solutions to obtain higher memory bandwidth to accelerate SLS. However, their inferior memory hierarchy yields a low performance-cost ratio and fails to fully exploit the data locality. Although some software-managed cache policies were proposed to improve the cache hit rate, the incurred cache miss penalty is unacceptable considering the high overheads of executing the corresponding programs and of the communication between the host and the accelerator. To address these issues, we propose EMS-i, an efficient memory system design that integrates a Solid State Drive (SSD) into the memory hierarchy using Compute Express Link (CXL) for recommendation system inference. We specialize the caching mechanism according to the characteristics of various DLRM workloads and propose a novel prefetching mechanism to further improve the performance. In addition, we carefully design the inference kernel and develop a customized mapping scheme for the SLS operation, considering the multi-level parallelism in SLS and the data locality within a batch of queries. Compared to the state-of-the-art NDP solutions, EMS-i achieves up to 10.9× speedup over RecSSD and performance comparable to RecNMP with 72% energy savings. EMS-i also saves up to 8.7× and 6.6× memory cost w.r.t. RecSSD and RecNMP, respectively.
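For readers unfamiliar with the SLS operation named in the abstract, the following is a minimal illustrative sketch in plain Python, not the authors' kernel. It assumes the common formulation in which a flat list of row indices is split into segments by a list of lengths, and the embedding rows in each segment are summed element-wise. The function name `sparse_length_sum` and the toy table are hypothetical.

```python
def sparse_length_sum(table, indices, lengths):
    """Sum embedding rows per segment (illustrative sketch, not the paper's kernel).

    table   -- list of embedding rows, each a list of floats of equal length
    indices -- flat list of row ids to gather
    lengths -- number of consecutive entries of `indices` pooled into each output
    """
    if sum(lengths) != len(indices):
        raise ValueError("lengths must sum to len(indices)")
    dim = len(table[0]) if table else 0
    pooled = []
    pos = 0
    for n in lengths:
        acc = [0.0] * dim
        for row_id in indices[pos:pos + n]:
            row = table[row_id]  # data-dependent lookup: the irregular access
            for j in range(dim):
                acc[j] += row[j]
        pooled.append(acc)
        pos += n
    return pooled


# Hypothetical toy table: 4 rows, embedding dimension 2.
table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
pooled = sparse_length_sum(table, indices=[0, 2, 3, 1], lengths=[3, 1])
print(pooled)  # [[13.0, 16.0], [3.0, 4.0]]
```

Each output vector needs only a handful of rows, but which rows are needed depends on the query, so the lookups scatter across the table. That is the irregular access pattern the abstract identifies as the inference bottleneck.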

Published In

ACM Transactions on Embedded Computing Systems

DOI

10.1145/3609384

EISSN

1558-3465

ISSN

1539-9087

Publication Date

September 9, 2023

Volume

22

Issue

5 s

Related Subject Headings

  • Computer Hardware & Architecture
  • 4606 Distributed computing and systems software
  • 4006 Communications engineering
  • 1006 Computer Hardware
  • 0805 Distributed Computing
  • 0803 Computer Software
 

Citation

APA: Wang, Y., Li, S., Zheng, Q., Chang, A., Li, H., & Chen, Y. (2023). EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference. ACM Transactions on Embedded Computing Systems, 22(5 s). https://doi.org/10.1145/3609384
Chicago: Wang, Y., S. Li, Q. Zheng, A. Chang, H. Li, and Y. Chen. “EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference.” ACM Transactions on Embedded Computing Systems 22, no. 5 s (September 9, 2023). https://doi.org/10.1145/3609384.
ICMJE: Wang Y, Li S, Zheng Q, Chang A, Li H, Chen Y. EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference. ACM Transactions on Embedded Computing Systems. 2023 Sep 9;22(5 s).
MLA: Wang, Y., et al. “EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference.” ACM Transactions on Embedded Computing Systems, vol. 22, no. 5 s, Sept. 2023. Scopus, doi:10.1145/3609384.
NLM: Wang Y, Li S, Zheng Q, Chang A, Li H, Chen Y. EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference. ACM Transactions on Embedded Computing Systems. 2023 Sep 9;22(5 s).
