Rhythm: Harnessing data parallel hardware for server workloads
Rising web traffic demands higher server throughput while preserving energy efficiency and total cost of ownership. Current work on data center efficiency focuses primarily on the data center as a whole, using off-the-shelf hardware for individual servers. Server capacity is typically increased by adding more machines, which is cheap but inefficient in the long run in terms of energy and area. Our work builds on the observation that server workload execution patterns are not completely unique across multiple requests. We present Rhythm, a framework for high-throughput servers that exploits similarity across requests to improve server performance and power/energy efficiency by launching data-parallel executions for request cohorts. An implementation of the SPECWeb Banking workload using Rhythm on NVIDIA GPUs provides a basis for evaluating both software and hardware for future cohort-based servers. Our evaluation of Rhythm on future server platforms shows that it achieves 4× the throughput (reqs/sec) of a Core i7 at efficiencies (reqs/Joule) comparable to a dual-core ARM Cortex A9. A Rhythm implementation that generates transposed responses achieves 8× the i7 throughput while processing 2.5× more requests/Joule than the A9. Copyright © 2014 ACM.
Published In
DOI
Publication Date
Start / End Page
Related Subject Headings
- Software Engineering