CPU-GPU system designs for high performance cloud computing
Improving parallel computing capability greatly increases the efficiency of high-performance cloud computing. By combining the powerful scalar processing of CPUs with the efficient parallel processing of GPUs, CPU-GPU systems provide a hybrid computing environment that can be dynamically optimized for cloud computing applications. One of the critical issues in CPU-GPU system design is the so-called memory wall, which encompasses the design complexity of memory coherence, bandwidth, capacity, and power budget. Optimizing the memory design can not only improve run-time performance but also enhance the reliability of the CPU-GPU system. In this chapter, we introduce the mainstream and emerging memory hierarchy designs in CPU-GPU systems, discuss techniques that optimize data allocation and migration between the CPU and GPU to improve performance and power efficiency, and present the challenges and opportunities of CPU-GPU systems.
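As a concrete illustration of data allocation and migration between the CPU and GPU, the following minimal sketch uses CUDA unified (managed) memory with explicit prefetching; the kernel, array size, and launch configuration are illustrative assumptions, not a design taken from this chapter.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: scale a vector in place on the GPU.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;          // hypothetical problem size
    float *data = nullptr;

    // Unified memory: one pointer visible to both CPU and GPU;
    // the runtime migrates pages between the two memories on demand.
    cudaMallocManaged(&data, n * sizeof(float));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;   // initialized on the CPU

    int device = 0;
    cudaGetDevice(&device);
    // Explicitly migrate the data to GPU memory before the kernel launches,
    // avoiding on-demand page faults during execution.
    cudaMemPrefetchAsync(data, n * sizeof(float), device);

    scale<<<(n + 255) / 256, 256>>>(data, 2.0f, n);

    // Migrate the results back to host memory for CPU-side use.
    cudaMemPrefetchAsync(data, n * sizeof(float), cudaCpuDeviceId);
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);
    cudaFree(data);
    return 0;
}
```

In this sketch the prefetch calls stand in for the allocation and migration decisions discussed later in the chapter: choosing where data resides and when it moves directly affects both run-time performance and the power spent on memory traffic.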