The first presentation at the recent Critical Facilities Roundtable was by Phil Reese of Stanford University, who discussed their efforts to measure PUE and improve their data center energy efficiency.
Phil Reese and his presentation
PUE has been discussed in many places, so I was not sure another discussion would tell me anything new. As it turned out, I enjoyed Phil’s presentation. Unlike the giant data centers that Google, Amazon, Microsoft, and others build from scratch with the most modern technologies, his data centers are far more familiar to the many people in the audience who need to improve energy efficiency while their data centers stay in operation.
As you know, PUE is used to track a data center’s progress in energy efficiency and to estimate its total power cost by factoring in cooling and other facilities-related overhead.
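To make the ratio concrete, here is a minimal Python sketch. It assumes the usual definition of PUE as total facility power divided by IT equipment power. The function names and the sample figures are illustrative only and do not come from Phil’s presentation.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw


def estimated_total_cost(it_kwh: float, pue_value: float, price_per_kwh: float) -> float:
    """Estimate the whole-facility energy cost from the IT energy alone."""
    return it_kwh * pue_value * price_per_kwh


# Hypothetical numbers: 500 kW of IT load in a facility drawing 725 kW overall.
example_pue = pue(725.0, 500.0)
print(f"PUE = {example_pue:.2f}")
```

A PUE of 1.0 would mean every watt goes to IT equipment; anything above that is cooling, power distribution losses, lighting, and other overhead.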
Phil showed us interesting sets of data used to compute PUE at one of Stanford’s data centers (Tier II), located in Forsythe Hall. Stanford’s IT services site has more information on its services, including the data centers. The data center was built in the late 1970s to house mainframe computers. Those were replaced a few years ago by rows of racks of computers: 15 to 20 rows with 30 racks each. Initially, the per-rack power requirement was about 2 kW, there was no hot-aisle/cold-aisle layout to control air flow, and the backs of the computers in one row directly faced the air intakes of the adjacent row. This was corrected later by implementing hot aisle/cold aisle, but it was hard to rearrange the racks while the servers were operating.
Guidance for Calculation of Efficiency (PUE) in Data Centers
Recently, they got help from PG&E and consultants to improve their energy efficiency. One recommendation was to measure the data relevant to improvement. Their existing measurements were not automated, and neither the collection process nor the quality of the collected data was good enough. They installed wireless sensors from SynapSense and started measuring. As the graph below shows, PUE was between 1.4 and 1.5. Since a PUE of about 2.0 is commonly cited as the industry average, that was not too bad.
When Phil summarized his power use during the month of December, he realized the following:
- IT power consumption changed little from day to day. (See the top line in the graph below.)
- IT power consumption did not change when the school shut down on December 17 and most people left for vacation.
If the data center in question were like the one at Symantec (one of their data centers is accessed by Symantec divisions worldwide), a constant IT load throughout the year, day and night, would be expected. Here it was a mystery, and Phil asked many people for their opinion. Most could not give him any clue, but Jon Koomey did. It turned out that the servers were running only slightly above idle. As is well known, an idle server can consume 80–85% of the energy of a fully loaded one. For that reason, power consumption barely changed whether or not the servers had any work to do.
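The arithmetic behind that observation can be sketched in a few lines of Python. This is only an illustration: it assumes a simple linear power model (real servers are not exactly linear), a hypothetical 300 W peak draw, and an idle fraction of 80%, the low end of the range quoted above.

```python
def server_power_w(utilization: float, peak_w: float, idle_fraction: float) -> float:
    """Linear power model: idle draw plus a load-proportional share of the rest."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    idle_w = idle_fraction * peak_w
    return idle_w + (peak_w - idle_w) * utilization


PEAK_W = 300.0        # hypothetical peak draw of one server
IDLE_FRACTION = 0.80  # idle draw as a share of peak (assumed)

for utilization in (0.0, 0.05, 0.10, 1.0):
    watts = server_power_w(utilization, PEAK_W, IDLE_FRACTION)
    print(f"utilization {utilization:4.0%}: {watts:6.1f} W")
```

Under these assumptions, a server dropping from 5% utilization to fully idle sheds only 3 W out of 243 W, which is why a campus emptying out for the holidays would leave almost no mark on the IT power line.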
If this were a data center in a nonacademic environment, heavy virtualization would be introduced and server consolidation encouraged to increase the load factor of each server. However, this is an academic institution, where each server may be owned by a separate research team. Consolidating or refreshing servers may not be possible because each team may require a specific configuration.
In any event, Phil concluded as follows.
After his presentation, lively discussions took place, with these highlights:
- Server energy-saving mode: Until several years ago, most servers had few power-saving modes and did not go into sleep mode even when the IT load was low. This could be resolved by refreshing servers and other IT equipment. That is easier in an enterprise data center than in an academic setting, where each research team may have financial and research constraints, or in a colocation environment, where different tenants have different situations.
- Virtualization: Virtualization was not used at all, and server utilization was very low. With active virtualization, workloads could be consolidated onto fewer machines, and the remaining servers could be turned off or removed altogether. Server shutdown faces a lot of opposition from IT people because servers may not reboot after being shut down. A new problem that comes with virtualization is that servers with high utilization give off more heat and make cooling more difficult on the data center floor. Companies like Power Assure tackle this problem.
- IT-focused energy efficiency metrics: Facility metrics are easier to measure because the data can be collected without much analysis. The power consumed by cooling, for example, is exactly the figure we need. IT energy efficiency is much harder to measure. Simply measuring the power consumed by IT equipment such as servers does not give an accurate view of IT energy efficiency, because we can run a bunch of idle servers without producing any useful work. The Green Grid’s DCeP is the right metric to account for IT productivity, but it is very hard to measure. I was contacted by a Japanese researcher who is pushing DPPE for adoption, and he asked me why IT did not figure more prominently in the definitions of efficiency metrics. My answer was that it is hard to measure IT energy efficiency objectively.
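The gap between a facility metric and a productivity metric can be shown with a small Python sketch. It assumes DCeP is computed as useful work produced divided by total data center energy consumed over the same period. The "useful work" unit (completed jobs here), the class name, and all figures are hypothetical, and for simplicity both periods are given identical energy use.

```python
from dataclasses import dataclass


@dataclass
class Period:
    it_kwh: float        # energy consumed by IT equipment
    facility_kwh: float  # energy consumed by the whole facility
    useful_work: float   # hypothetical unit, e.g. completed jobs

    @property
    def pue(self) -> float:
        return self.facility_kwh / self.it_kwh

    @property
    def dcep(self) -> float:
        return self.useful_work / self.facility_kwh


# Same energy use in both periods; only the amount of useful work differs.
busy = Period(it_kwh=1000.0, facility_kwh=1450.0, useful_work=50000.0)
idle = Period(it_kwh=1000.0, facility_kwh=1450.0, useful_work=500.0)

for name, period in (("busy", busy), ("idle", idle)):
    print(f"{name}: PUE = {period.pue:.2f}, DCeP = {period.dcep:.2f} jobs/kWh")
```

PUE is identical for the two periods, while the productivity ratio differs by a factor of 100. The hard part in practice is not the division but agreeing on what counts as useful work and measuring it.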