Scholars@Duke publication: Starfish: A self-tuning system for big data analytics

Publication, Journal Article

Herodotou, H; Lim, H; Luo, G; Borisov, N; Dong, L; Cetin, FB; Babu, S

Published in: CIDR 2011 - 5th Biennial Conference on Innovative Data Systems Research, Conference Proceedings

October 11, 2011

Timely and cost-effective analytics over "Big Data" is now a key ingredient for success in many businesses, scientific and engineering disciplines, and government endeavors. The Hadoop software stack, which consists of an extensible MapReduce execution engine, pluggable distributed storage engines, and a range of procedural to declarative interfaces, is a popular choice for big data analytics. Most practitioners of big data analytics, such as computational scientists, systems researchers, and business analysts, lack the expertise to tune the system to get good performance. Unfortunately, Hadoop's performance out of the box leaves much to be desired, leading to suboptimal use of resources, time, and money (in pay-as-you-go clouds). We introduce Starfish, a self-tuning system for big data analytics. Starfish builds on Hadoop while adapting to user needs and system workloads to provide good performance automatically, without any need for users to understand and manipulate the many tuning knobs in Hadoop. While Starfish's system architecture is guided by work on self-tuning database systems, we discuss how new analysis practices over big data pose new challenges, leading us to different design choices in Starfish.

Duke Scholars

Author: Shivnath Babu, Computer Science

Start / End Page

261 / 272

Citation

APA

Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F. B., & Babu, S. (2011). Starfish: A self-tuning system for big data analytics. CIDR 2011 - 5th Biennial Conference on Innovative Data Systems Research, Conference Proceedings, 261–272.

Chicago

Herodotou, H., H. Lim, G. Luo, N. Borisov, L. Dong, F. B. Cetin, and S. Babu. “Starfish: A self-tuning system for big data analytics.” CIDR 2011 - 5th Biennial Conference on Innovative Data Systems Research, Conference Proceedings, October 11, 2011, 261–72.

ICMJE

Herodotou H, Lim H, Luo G, Borisov N, Dong L, Cetin FB, et al. Starfish: A self-tuning system for big data analytics. CIDR 2011 - 5th Biennial Conference on Innovative Data Systems Research, Conference Proceedings. 2011 Oct 11;261–72.

MLA

Herodotou, H., et al. “Starfish: A self-tuning system for big data analytics.” CIDR 2011 - 5th Biennial Conference on Innovative Data Systems Research, Conference Proceedings, Oct. 2011, pp. 261–72.

NLM

Herodotou H, Lim H, Luo G, Borisov N, Dong L, Cetin FB, Babu S. Starfish: A self-tuning system for big data analytics. CIDR 2011 - 5th Biennial Conference on Innovative Data Systems Research, Conference Proceedings. 2011 Oct 11;261–272.
