Cumulon: Optimizing statistical data analysis in the cloud

Publication, Journal Article
Huang, B; Babu, S; Yang, J
Published in: Proceedings of the ACM SIGMOD International Conference on Management of Data
July 29, 2013

We present Cumulon, a system designed to help users rapidly develop and intelligently deploy matrix-based big-data analysis programs in the cloud. Cumulon features a flexible execution model and new operators especially suited for such workloads. We show how to implement Cumulon on top of Hadoop/HDFS while avoiding limitations of MapReduce, and demonstrate Cumulon's performance advantages over existing Hadoop-based systems for statistical data analysis. To support intelligent deployment in the cloud according to time/budget constraints, Cumulon goes beyond database-style optimization to make choices automatically on not only physical operators and their parameters, but also hardware provisioning and configuration settings. We apply a suite of benchmarking, simulation, modeling, and search techniques to support effective cost-based optimization over this rich space of deployment plans. Copyright © 2013 ACM.

Published In

Proceedings of the ACM SIGMOD International Conference on Management of Data

DOI

10.1145/2463676.2465273

ISSN

0730-8078

Publication Date

July 29, 2013

Start / End Page

1 / 12
 

Citation

APA: Huang, B., Babu, S., & Yang, J. (2013). Cumulon: Optimizing statistical data analysis in the cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1–12. https://doi.org/10.1145/2463676.2465273
Chicago: Huang, B., S. Babu, and J. Yang. “Cumulon: Optimizing statistical data analysis in the cloud.” Proceedings of the ACM SIGMOD International Conference on Management of Data, July 29, 2013, 1–12. https://doi.org/10.1145/2463676.2465273.
ICMJE: Huang B, Babu S, Yang J. Cumulon: Optimizing statistical data analysis in the cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013 Jul 29;1–12.
MLA: Huang, B., et al. “Cumulon: Optimizing statistical data analysis in the cloud.” Proceedings of the ACM SIGMOD International Conference on Management of Data, July 2013, pp. 1–12. Scopus, doi:10.1145/2463676.2465273.
NLM: Huang B, Babu S, Yang J. Cumulon: Optimizing statistical data analysis in the cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013 Jul 29;1–12.
