Energy analysis of hardware and software range partitioning
Data partitioning is a critical operation for manipulating large datasets because it subdivides tasks into pieces that are more amenable to efficient processing. It is often the limiting factor in database performance and represents a significant fraction of the overall runtime of large data queries. This article measures the performance and energy of state-of-the-art software partitioners, and describes and evaluates a hardware range partitioner that further improves efficiency. The software implementation is broken into two phases, allowing separate analysis of the partition function computation and data shuffling costs. Although range partitioning is commonly thought to be more expensive than simpler strategies such as hash partitioning, our measurements indicate that careful data movement and optimization of the partition function can allow it to approach the throughput and energy consumption of hash or radix partitioning. For further acceleration, we describe a hardware range partitioner, or HARP, a streaming framework that offers a seamless execution environment for this and other streaming accelerators, and a detailed analysis of a 32nm physical design that matches the throughput of four to eight software threads while consuming just 6.9% of the area and 4.3% of the power of a Xeon core in the same technology generation.
Duke Scholars
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Related Subject Headings
- Networking & Telecommunications
- 4606 Distributed computing and systems software
- 1006 Computer Hardware
- 0806 Information Systems
- 0803 Computer Software
Citation
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Related Subject Headings
- Networking & Telecommunications
- 4606 Distributed computing and systems software
- 1006 Computer Hardware
- 0806 Information Systems
- 0803 Computer Software