Tolerating Memory Latency through Push Prefetching for Pointer-Intensive Applications

Publication, Journal Article

Yang, CL; Tseng, HW; Lee, CH; Lebeck, AR

Published in: ACM Transactions on Architecture and Code Optimization

January 1, 2004

Prefetching is often used to overlap memory latency with computation for array-based applications. However, prefetching for pointer-intensive applications remains a challenge because of irregular memory access patterns and the pointer-chasing problem. In this paper, we propose a cooperative hardware/software prefetching framework, the push architecture, which is designed specifically for linked data structures (LDS). The push architecture exploits program structure for future address generation instead of relying on past address history. It identifies the load instructions that traverse an LDS and uses a prefetch engine to execute them ahead of the CPU execution. This allows the prefetch engine to successfully generate future addresses. To overcome the serial nature of LDS address generation, the push architecture employs a novel data movement model. It attaches the prefetch engine to each level of the memory hierarchy and pushes, rather than pulls, data to the CPU. This push model decouples the pointer dereference from the transfer of the current node up to the processor. Thus a series of pointer dereferences becomes a pipelined process rather than a serial process. Simulation results show that the push architecture can reduce up to 100% of memory stall time on a suite of pointer-intensive applications, reducing overall execution time by an average of 15%. © 2004, ACM. All rights reserved.
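The contrast the abstract draws between pulling and pushing nodes can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not the paper's simulator or its prefetch engine design: the function names, the three latency parameters, and the cycle counts are hypothetical, every node is assumed to miss to the same memory level, and CPU compute time, bandwidth limits, and cache hits are ignored.

```python
def pull_arrival_times(n_nodes, t_request, t_access, t_transfer):
    """Pull model: the CPU learns the address of node i+1 only after node i arrives,
    so every node pays a full round trip (request down, access, transfer up)."""
    times = []
    now = 0
    for _ in range(n_nodes):
        now += t_request + t_access + t_transfer
        times.append(now)
    return times


def push_arrival_times(n_nodes, t_request, t_access, t_transfer):
    """Push model (assumed simplification): an engine at the memory level follows
    the next pointers locally and pushes each node up while it starts the next access."""
    times = []
    engine_time = t_request  # a single initial request starts the engine
    for _ in range(n_nodes):
        engine_time += t_access                 # engine reads the node and its next pointer
        times.append(engine_time + t_transfer)  # transfer overlaps with the next access
    return times


if __name__ == "__main__":
    # Hypothetical latencies in cycles: 10 to request, 50 to access, 10 to transfer.
    print(pull_arrival_times(4, 10, 50, 10))  # [70, 140, 210, 280]
    print(push_arrival_times(4, 10, 50, 10))  # [70, 120, 170, 220]
```

In this model the first node arrives at the same time in both cases, but each later node costs only the access latency under the push model, which is the sense in which a chain of dereferences becomes pipelined rather than serial.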

Duke Scholars

Author: Alvin R. Lebeck, Computer Science

Published In

ACM Transactions on Architecture and Code Optimization

DOI

10.1145/1044823.1044827

EISSN

1544-3973

ISSN

1544-3566

Publication Date

January 1, 2004

Volume

1

Issue

4

Start / End Page

445 / 475

Related Subject Headings

4606 Distributed computing and systems software
4009 Electronics, sensors and digital hardware
0906 Electrical and Electronic Engineering
0803 Computer Software

Citation

Yang, C. L., Tseng, H. W., Lee, C. H., & Lebeck, A. R. (2004). Tolerating Memory Latency through Push Prefetching for Pointer-Intensive Applications. ACM Transactions on Architecture and Code Optimization, 1(4), 445–475. https://doi.org/10.1145/1044823.1044827

Yang, C. L., H. W. Tseng, C. H. Lee, and A. R. Lebeck. “Tolerating Memory Latency through Push Prefetching for Pointer-Intensive Applications.” ACM Transactions on Architecture and Code Optimization 1, no. 4 (January 1, 2004): 445–75. https://doi.org/10.1145/1044823.1044827.

Yang CL, Tseng HW, Lee CH, Lebeck AR. Tolerating Memory Latency through Push Prefetching for Pointer-Intensive Applications. ACM Transactions on Architecture and Code Optimization. 2004 Jan 1;1(4):445–75.

Yang, C. L., et al. “Tolerating Memory Latency through Push Prefetching for Pointer-Intensive Applications.” ACM Transactions on Architecture and Code Optimization, vol. 1, no. 4, Jan. 2004, pp. 445–75. Scopus, doi:10.1145/1044823.1044827.
