TEMP: Thread batch enabled memory partitioning for GPU

Publication, Conference
Mao, M; Wen, W; Liu, X; Hu, J; Wang, D; Chen, Y; Li, H
Published in: Proceedings - Design Automation Conference
June 5, 2016

Because massive multithreading in GPUs puts tremendous pressure on the memory subsystem, efficient bandwidth utilization becomes a key factor in GPU throughput. In this work, we propose thread batch enabled memory partitioning (TEMP), which improves GPU performance by improving memory bandwidth utilization. In particular, TEMP clusters multiple thread blocks that share the same set of pages into a thread batch and dispatches the entire thread batch to a streaming multiprocessor. TEMP separates the memory access streams of different thread batches through OS memory management, preserving the intrinsic locality of each thread batch and increasing memory access parallelism. Experimental results show that TEMP achieves up to 10.3% performance improvement and 14.6% DRAM energy reduction compared with a state-of-the-art scheduler without any memory-side optimizations.
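The two ideas in the abstract, batching thread blocks by shared pages and keeping the memory streams of different batches apart, can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the clustering or page-mapping algorithm, so the function names (`pages_touched`, `form_thread_batches`, `partition`), the 4 KiB page size, the exact-match grouping rule, and the round-robin assignment of batches to streaming multiprocessors and DRAM banks are all assumptions made for illustration.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes; not stated in the abstract


def pages_touched(addresses, page_size=PAGE_SIZE):
    """Return the set of page numbers covered by a list of byte addresses."""
    return frozenset(addr // page_size for addr in addresses)


def form_thread_batches(block_addresses, page_size=PAGE_SIZE):
    """Group thread blocks that touch exactly the same set of pages.

    block_addresses maps a thread block id to the addresses it accesses.
    Returns a list of (sorted_pages, sorted_block_ids) tuples, ordered by
    the smallest block id in each batch. Exact-match grouping is an
    assumption; a real scheduler could use a looser similarity rule.
    """
    groups = defaultdict(list)
    for block_id, addrs in block_addresses.items():
        groups[pages_touched(addrs, page_size)].append(block_id)
    ordered = sorted(groups.items(), key=lambda kv: min(kv[1]))
    return [(sorted(pages), sorted(blocks)) for pages, blocks in ordered]


def partition(batches, num_sms, num_banks):
    """Assign each batch to one SM and its pages to one DRAM bank.

    Round-robin assignment is a hypothetical policy standing in for the
    OS memory management the abstract mentions. If two batches overlap
    on a page, the page keeps the bank of the first batch that claimed it.
    """
    sm_of_block = {}
    bank_of_page = {}
    for index, (pages, blocks) in enumerate(batches):
        for block_id in blocks:
            sm_of_block[block_id] = index % num_sms
        for page in pages:
            bank_of_page.setdefault(page, index % num_banks)
    return sm_of_block, bank_of_page


# Hypothetical access trace: blocks 0 and 1 share pages 0 and 1,
# blocks 2 and 3 share page 2.
example_trace = {
    0: [0, 64, 4096],
    1: [128, 4100],
    2: [8192, 8200],
    3: [8256],
}
example_batches = form_thread_batches(example_trace)
example_sms, example_banks = partition(example_batches, num_sms=2, num_banks=2)
```

In this toy trace the two batches land on different streaming multiprocessors and their pages on different banks, so each batch's accesses stay local to one bank while the two banks can be serviced in parallel.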

Published In

Proceedings - Design Automation Conference

DOI

10.1145/2897937.2898103

ISSN

0738-100X

Publication Date

June 5, 2016

Volume

05-09-June-2016
 

Citation

APA: Mao, M., Wen, W., Liu, X., Hu, J., Wang, D., Chen, Y., & Li, H. (2016). TEMP: Thread batch enabled memory partitioning for GPU. In Proceedings - Design Automation Conference (Vol. 05-09-June-2016). https://doi.org/10.1145/2897937.2898103
Chicago: Mao, M., W. Wen, X. Liu, J. Hu, D. Wang, Y. Chen, and H. Li. “TEMP: Thread batch enabled memory partitioning for GPU.” In Proceedings - Design Automation Conference, Vol. 05-09-June-2016, 2016. https://doi.org/10.1145/2897937.2898103.
ICMJE: Mao M, Wen W, Liu X, Hu J, Wang D, Chen Y, et al. TEMP: Thread batch enabled memory partitioning for GPU. In: Proceedings - Design Automation Conference. 2016.
MLA: Mao, M., et al. “TEMP: Thread batch enabled memory partitioning for GPU.” Proceedings - Design Automation Conference, vol. 05-09-June-2016, 2016. Scopus, doi:10.1145/2897937.2898103.
NLM: Mao M, Wen W, Liu X, Hu J, Wang D, Chen Y, Li H. TEMP: Thread batch enabled memory partitioning for GPU. Proceedings - Design Automation Conference. 2016.
