An efficient strategy for the collection and storage of large volumes of data for computation

Publication, Journal Article
Suthakar, U; Magnoni, L; Smith, DR; Khan, A; Andreeva, J
Published in: Journal of Big Data
December 1, 2016

In recent years, the amount of data being produced and stored has grown steadily, a phenomenon known as Big Data. Social networks, the Internet of Things, scientific experiments and commercial services all play a significant role in generating these vast amounts of data. Three main factors are important in Big Data: Volume, Velocity and Variety. All three need to be considered when designing a platform to support Big Data. The Large Hadron Collider (LHC) particle accelerator at CERN hosts a number of data-intensive experiments, which are estimated to produce about 30 PB of data annually. These data are also propagated at very high velocity. Traditional methods of collecting, storing and analysing data have become insufficient for managing this rapidly growing volume. It is therefore essential to have an efficient strategy for capturing these data as they are produced. In this paper, a number of models are explored to understand which approach is best for collecting and storing Big Data for analytics. The performance of full execution cycles of these approaches is evaluated on the monitoring of the Worldwide LHC Computing Grid (WLCG) infrastructure, covering the collection, storage and analysis of data. Moreover, the models discussed are applied to a community-driven software solution, Apache Flume, to show how they can be integrated seamlessly.

Published In

Journal of Big Data

DOI

10.1186/s40537-016-0056-1

EISSN

2196-1115

Publication Date

December 1, 2016

Volume

3

Issue

1

Related Subject Headings

  • 08 Information and Computing Sciences
 

Citation

APA
Suthakar, U., Magnoni, L., Smith, D. R., Khan, A., & Andreeva, J. (2016). An efficient strategy for the collection and storage of large volumes of data for computation. Journal of Big Data, 3(1). https://doi.org/10.1186/s40537-016-0056-1

Chicago
Suthakar, U., L. Magnoni, D. R. Smith, A. Khan, and J. Andreeva. “An efficient strategy for the collection and storage of large volumes of data for computation.” Journal of Big Data 3, no. 1 (December 1, 2016). https://doi.org/10.1186/s40537-016-0056-1.

ICMJE
Suthakar U, Magnoni L, Smith DR, Khan A, Andreeva J. An efficient strategy for the collection and storage of large volumes of data for computation. Journal of Big Data. 2016 Dec 1;3(1).

MLA
Suthakar, U., et al. “An efficient strategy for the collection and storage of large volumes of data for computation.” Journal of Big Data, vol. 3, no. 1, Dec. 2016. Scopus, doi:10.1186/s40537-016-0056-1.

NLM
Suthakar U, Magnoni L, Smith DR, Khan A, Andreeva J. An efficient strategy for the collection and storage of large volumes of data for computation. Journal of Big Data. 2016 Dec 1;3(1).