Predicting output performance of a petascale supercomputer

Published

Conference Paper

© 2017 Association for Computing Machinery. In this paper, we develop a model for predicting the output performance of supercomputer file systems under production load. Our target environment is Titan, the 3rd fastest supercomputer in the world, and its Lustre-based multi-stage write path. We observe from Titan that although output performance is highly variable at small time scales, the mean performance is stable and consistent over typical application run times. Moreover, we find that output performance is non-linearly related to its correlated parameters due to interference and saturation on individual stages of the path. These observations enable us to build a predictive model of expected write times for output patterns and I/O configurations, using feature transformations to capture non-linear relationships. We identify the candidate features based on the structure of the Lustre/Titan write path, and use feature transformation functions to produce a model space with 135,000 candidate models. By searching for the minimal mean square error in this space, we identify a good model and show that it is effective.
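
The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the transformation set, the two features (output size and writer count), and the synthetic data are all hypothetical, and the abstract does not say which functions or features the paper uses. The sketch assigns one transformation to each feature, fits a linear model by ordinary least squares for every assignment, and keeps the assignment with the lowest mean square error.

```python
import itertools
import math

# Hypothetical transformation set; the abstract does not list the paper's functions.
TRANSFORMS = {
    "identity": lambda v: v,
    "log": lambda v: math.log(v),
    "sqrt": lambda v: math.sqrt(v),
    "square": lambda v: v * v,
}


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mse(rows, y):
    """Ordinary least squares with an intercept; returns (coefficients, MSE)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    errors = [sum(b * v for b, v in zip(beta, x)) - t for x, t in zip(X, y)]
    return beta, sum(e * e for e in errors) / len(y)


def search(samples, y):
    """Try every assignment of one transform per feature; keep the lowest-MSE model."""
    best = None
    for names in itertools.product(TRANSFORMS, repeat=len(samples[0])):
        rows = [[TRANSFORMS[n](v) for n, v in zip(names, s)] for s in samples]
        beta, mse = fit_mse(rows, y)
        if best is None or mse < best[2]:
            best = (names, beta, mse)
    return best


# Synthetic data, not measurements from Titan: write time grows with
# sqrt(output size) and log(writer count).
samples = [(s, w) for s in (1, 4, 16, 64, 256) for w in (1, 2, 8, 32)]
y = [2.0 + 3.0 * math.sqrt(s) + 0.5 * math.log(w) for s, w in samples]
names, beta, mse = search(samples, y)
```

With four transformations and two features this sketch enumerates only 16 candidate models, whereas the paper reports a space of 135,000. It also scores each model on the data used to fit it; a practical search would more likely evaluate the error on held-out measurements.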

Cited Authors

  • Xie, B; Huang, Y; Chase, JS; Choi, JY; Klasky, S; Lofstead, J; Oral, S

Published Date

  • June 26, 2017

Published In

  • HPDC 2017 Proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing

Start / End Page

  • 181 - 192

International Standard Book Number 13 (ISBN-13)

  • 9781450346993

Digital Object Identifier (DOI)

  • 10.1145/3078597.3078614

Citation Source

  • Scopus