Weatherman: Automated, online, and predictive thermal mapping and management for data centers

Published

Journal Article

Recent advances have demonstrated the potential benefits of coordinated management of thermal load in data centers, including reduced cooling costs and improved resistance to cooling system failures. A key unresolved obstacle to the practical implementation of thermal load management is the ability to predict the effects of workload distribution and cooling configurations on temperatures within a data center enclosure. The interactions between workload, cooling, and temperature depend on complex factors that are unique to each data center, including physical room layout, hardware power consumption, and cooling capacity; this dictates an approach that formulates management policies for each data center based on these properties. We propose and evaluate a simple, flexible method to infer a detailed model of thermal behavior within a data center from a stream of instrumentation data. This data, taken during normal data center operation, includes continuous readings from external temperature sensors, server instrumentation, and computer room air conditioning units. Experimental results from a representative data center show that automatic thermal mapping can accurately predict the heat distribution resulting from a given workload distribution and cooling configuration, thereby removing the need for static or manual configuration of thermal load management systems. We also demonstrate how our approach adapts to preserve accuracy across changes to cluster attributes that affect thermal behavior, such as cooling settings, workload distribution, and power consumption. © 2006 IEEE.
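
As a rough illustration of inferring a thermal model from instrumentation readings, the sketch below fits one sensor's temperature as a linear function of server power and cooling supply temperature. The abstract does not state which learning method the authors use, so ordinary least squares is an assumption made here for brevity. The function names, feature choice, and sample values are all hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations cover both sides.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: readings do not vary enough")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_sensor_model(features: Sequence[Sequence[float]],
                     temps: Sequence[float]) -> List[float]:
    """Least-squares fit of temp as a linear function of features plus a bias."""
    rows = [list(f) + [1.0] for f in features]  # trailing 1.0 is the bias term
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, temps)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature: Sequence[float]) -> float:
    """Predict the sensor temperature for one (power, supply temp) setting."""
    return sum(w * x for w, x in zip(weights, list(feature) + [1.0]))


# Made-up readings: (server power in W, cooling supply temp in C).
samples = [(100.0, 15.0), (200.0, 15.0), (100.0, 18.0),
           (250.0, 17.0), (150.0, 16.0)]
# Synthetic inlet temperatures generated from a known linear rule.
inlet = [0.02 * p + 0.9 * s + 3.0 for p, s in samples]

weights = fit_sensor_model(samples, inlet)
predicted = predict(weights, (180.0, 16.0))
```

In this toy setup, refitting on a sliding window of recent readings would be one way to track changes in cooling settings or power consumption. A real data center is unlikely to be well described by a single linear model per sensor.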

Cited Authors

  • Moore, J; Chase, JS; Ranganathan, P

Published Date

  • December 1, 2006

Published In

  • Proceedings 3rd International Conference on Autonomic Computing, ICAC 2006

Volume / Issue

  • 2006 /

Start / End Page

  • 155 - 164

Citation Source

  • Scopus