VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications

Publication, Conference
Mao, M; Hu, J; Chen, Y; Li, H
Published in: Proceedings - Design Automation Conference
July 24, 2015

The massive multithreading of GPGPUs demands efficient use of caches with limited capacity. In this work, we propose a versatile warp scheduler (VWS) to reduce the cache miss rate in GPGPUs. VWS retains intra-warp cache locality using an efficient per-warp working set estimator, and it enhances intra- and inter-cooperative thread array (CTA) cache locality by imposing a CTA-aware scheduling policy and a new CTA dispatching mechanism. The significantly improved hit rate of the cache hierarchy enables VWS to achieve, on average, 38.4% and 9.3% IPC improvements across diverse GPGPU applications compared to a widely used warp scheduler and a state-of-the-art warp scheduler, respectively.
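
The abstract's motivation, that too many concurrently scheduled warps can overflow a small cache, can be illustrated with a toy simulation. The sketch below is not the VWS algorithm (this page does not describe the estimator or the CTA dispatching mechanism); it is a minimal LRU model under assumed parameters. The function names `simulate`, `round_robin`, and `throttled`, the per-warp working set size, and the cache size are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


def simulate(schedule, cache_lines):
    """Return the miss rate of an LRU cache over (warp, line) accesses."""
    cache = OrderedDict()
    misses = 0
    total = 0
    for access in schedule:
        total += 1
        if access in cache:
            cache.move_to_end(access)      # mark as most recently used
        else:
            misses += 1
            cache[access] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / total


def round_robin(num_warps, working_set, passes):
    """All warps interleave; each warp re-reads its private lines every pass."""
    for _ in range(passes):
        for line in range(working_set):
            for warp in range(num_warps):
                yield (warp, line)


def throttled(num_warps, working_set, passes, cache_lines):
    """Hypothetical throttling: run only as many warps as fit in the cache."""
    group = max(1, cache_lines // working_set)
    for start in range(0, num_warps, group):
        warps = range(start, min(start + group, num_warps))
        for _ in range(passes):
            for line in range(working_set):
                for warp in warps:
                    yield (warp, line)


if __name__ == "__main__":
    # Assumed toy parameters: 8 warps, 4 lines per warp, 16-line cache.
    print("round-robin miss rate:", simulate(round_robin(8, 4, 10), 16))
    print("throttled miss rate:  ", simulate(throttled(8, 4, 10, 16), 16))
```

With these assumed parameters, the combined footprint of all eight warps (32 lines) is twice the cache size, so the interleaved schedule misses on every access, while the throttled schedule only pays the cold misses for each group of four warps. A real scheduler must estimate each warp's working set at run time, which is the role the abstract assigns to the per-warp working set estimator.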

Published In

Proceedings - Design Automation Conference

DOI

10.1145/2744769.2744931

ISSN

0738-100X

Publication Date

July 24, 2015

Volume

2015-July
 

Citation

APA: Mao, M., Hu, J., Chen, Y., & Li, H. (2015). VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications. In Proceedings - Design Automation Conference (Vol. 2015-July). https://doi.org/10.1145/2744769.2744931

Chicago: Mao, M., J. Hu, Y. Chen, and H. Li. “VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications.” In Proceedings - Design Automation Conference, Vol. 2015-July, 2015. https://doi.org/10.1145/2744769.2744931.

ICMJE: Mao M, Hu J, Chen Y, Li H. VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications. In: Proceedings - Design Automation Conference. 2015.

MLA: Mao, M., et al. “VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications.” Proceedings - Design Automation Conference, vol. 2015-July, 2015. Scopus, doi:10.1145/2744769.2744931.

NLM: Mao M, Hu J, Chen Y, Li H. VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications. Proceedings - Design Automation Conference. 2015.
