NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models

Publication, Journal Article
Li, S; Wang, Y; Hanson, E; Chang, A; Seok Ki, Y; Li, H; Chen, Y
Published in: IEEE Transactions on Computers
May 1, 2024

Recent advances in deep neural networks (DNNs) have enabled highly effective recommendation models for diverse web services. In such DNN-based recommendation models, the embedding layer holds the majority of the model parameters. As these models scale rapidly, the embedding layer's memory capacity and bandwidth requirements threaten to exceed the limits of current computing architectures. We observe that the embedding layer's computational demands grow much more slowly than its storage needs, which suggests an opportunity to offload embeddings to storage hardware. In this work, we present NDRec, a near-data processing system for training large-scale recommendation models. NDRec offloads both the parameters and the computation of the embedding layer to computational storage devices (CSDs), using coherence interconnects (CXLs) for communication between GPUs and CSDs. By leveraging the statistical properties of embedding access patterns, we develop an optimized CSD memory hierarchy and caching strategy. A lookahead embedding scheme enables concurrent execution of embedding operations and other operations, hiding latency and reducing memory bandwidth requirements. We evaluate NDRec using real-world and synthetic benchmarks. The results show that NDRec achieves up to 4.33× and 3.97× speedups over heterogeneous CPU-GPU platforms and GPU caching, respectively. NDRec also reduces per-iteration energy consumption by up to 54.9%.
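
The abstract's point that embedding access statistics can guide a caching strategy can be illustrated with a small simulation. The sketch below is not NDRec's memory hierarchy or its caching policy: it assumes, purely for illustration, that row popularity follows a Zipf-like distribution and that the cache uses plain LRU replacement. The function names (zipf_ids, lru_hit_rate), the table size, and the cache size are all hypothetical.

```python
import random
from collections import OrderedDict


def zipf_ids(num_rows, num_lookups, skew, seed=0):
    """Draw embedding-row IDs with probability proportional to 1 / rank**skew.

    skew=0.0 gives uniform access; larger values concentrate lookups
    on a small set of "hot" rows. (Assumed access model, not from the paper.)
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** skew) for rank in range(1, num_rows + 1)]
    return rng.choices(range(num_rows), weights=weights, k=num_lookups)


def lru_hit_rate(ids, cache_rows):
    """Replay the lookups against an LRU cache that holds `cache_rows` rows."""
    cache = OrderedDict()
    hits = 0
    for row in ids:
        if row in cache:
            hits += 1
            cache.move_to_end(row)         # mark row as most recently used
        else:
            cache[row] = True              # fetch the row into the cache
            if len(cache) > cache_rows:
                cache.popitem(last=False)  # evict the least recently used row
    return hits / len(ids)


if __name__ == "__main__":
    # Hypothetical sizes: 10,000-row table, cache holding 5% of the rows.
    for skew in (0.0, 1.0):
        ids = zipf_ids(num_rows=10_000, num_lookups=50_000, skew=skew)
        print(f"skew={skew}: hit rate={lru_hit_rate(ids, cache_rows=500):.3f}")
```

Under these assumptions, the skewed trace hits the small cache far more often than the uniform trace, which is the general intuition behind placing frequently accessed embedding rows in faster memory.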

Published In

IEEE Transactions on Computers

DOI

10.1109/TC.2024.3365939

EISSN

1557-9956

ISSN

0018-9340

Publication Date

May 1, 2024

Volume

73

Issue

5

Start / End Page

1248 / 1261

Related Subject Headings

  • Computer Hardware & Architecture
  • 4606 Distributed computing and systems software
  • 4009 Electronics, sensors and digital hardware
  • 1006 Computer Hardware
  • 0805 Distributed Computing
  • 0803 Computer Software
 

Citation

APA
Li, S., Wang, Y., Hanson, E., Chang, A., Seok Ki, Y., Li, H., & Chen, Y. (2024). NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models. IEEE Transactions on Computers, 73(5), 1248–1261. https://doi.org/10.1109/TC.2024.3365939

Chicago
Li, S., Y. Wang, E. Hanson, A. Chang, Y. Seok Ki, H. Li, and Y. Chen. “NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models.” IEEE Transactions on Computers 73, no. 5 (May 1, 2024): 1248–61. https://doi.org/10.1109/TC.2024.3365939.

ICMJE
Li S, Wang Y, Hanson E, Chang A, Seok Ki Y, Li H, et al. NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models. IEEE Transactions on Computers. 2024 May 1;73(5):1248–61.

MLA
Li, S., et al. “NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models.” IEEE Transactions on Computers, vol. 73, no. 5, May 2024, pp. 1248–61. Scopus, doi:10.1109/TC.2024.3365939.

NLM
Li S, Wang Y, Hanson E, Chang A, Seok Ki Y, Li H, Chen Y. NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models. IEEE Transactions on Computers. 2024 May 1;73(5):1248–1261.
