The proliferation of large High-Performance Computing (HPC) clusters executing computation-intensive jobs on large data sets has made cluster power proportionality a pressing concern. Publicly available traces show that many clusters have low average utilization, yet existing power-proportionality techniques have seen little adoption. A major reason is that these techniques require modifications to the existing cluster software and network stack and do not address the reliability concerns that arise during server power-cycling.
We present Hypnos, a defensive power-proportionality system that is unobtrusive and extensible, and that gracefully handles the server software and hardware failures that may occur during power-cycling. We deployed Hypnos on a 57-server production cluster. Over a 21-day run, it achieved a 36% energy saving despite multiple server and network failures.
Hypnos: Unobtrusive Power Proportionality for HPC frameworks