A casual tweet about my power savings through the use of VMware Distributed Power Management (DPM) found its way to VMware Senior Product Manager for DPM, Ulana Legedza, and Andrei Dorofeev. Ulana was interested in learning more about my situation. I explained how VMware DPM had evaluated workloads between two clustered vSphere hosts in my home lab, and proceeded to shut down one of the hosts for most of the month of October, saving me more than $50 on my energy bill.
Ulana and Andrei took the conversation to the next level and asked me if I was using vSphere’s Advanced CPU Power Management feature (see vSphere Resource Management Guide page 22). I was not; in fact, I was unaware of its existence. Power Management is a new feature in ESX(i) 4, available on processors supporting Enhanced Intel SpeedStep or Enhanced AMD PowerNow! power management technologies. To quote the .PDF article:
“To improve CPU power efficiency, you can configure your ESX/ESXi hosts to dynamically switch CPU frequencies based on workload demands. This type of power management is called Dynamic Voltage and Frequency Scaling (DVFS). It uses processor performance states (P-states) made available to the VMkernel through an ACPI interface.”
A quick look at the Quad Core AMD Opteron 2356 processors in my HP DL385 G2 showed they support Enhanced AMD PowerNow! Power Management Technology:
There are two steps to enabling this power management feature. The first step is to ensure it is enabled in the server BIOS. On an HP DL385 G2, CPU power management is enabled by default. In this particular server model, it is configured via the BIOS by hitting <F9> at the end of the POST (which obviously requires a reboot).
A slightly easier method might be to verify and/or configure the policy through HP’s out-of-band (OOB) iLO 2. Note, however, that the iLO 2 will request a reboot for a policy change to take effect. On an HP server, configure for OS Control mode. Again, this appears to be the default for the HP DL385 G2, so hopefully no reboot is required for you to implement this power saving measure in your environment:
After enabling power management in the BIOS, the second step is to modify the Power Management Policy on each ESX(i) host from the default of static to dynamic. The definitions of these two settings can be found in the .PDF linked above and are as follows:
static – The default. The VMkernel can detect power management features available on the host but does not actively use them unless requested by the BIOS for power capping or thermal events.
dynamic – The VMkernel optimizes each CPU’s frequency to match demand in order to improve power efficiency but not affect performance. When CPU demand increases, this policy setting ensures that CPU frequencies also increase.
You might be asking yourself by this point, “Ok, this is nice, but what’s the trade-off?” Note the wording in the dynamic definition above: “to improve power efficiency but not affect performance”. This is a win/win configuration change!
This step can be performed one of a few ways on each host (again, no reboot required for this change):
- Using the vSphere Client, change the Advanced host setting Power.CpuPolicy from static to dynamic
- Scriptable: Via the ESX service console, PuTTY, or script, issue the command esxcfg-advcfg -s dynamic /Power/CpuPolicy
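If you have more than a couple of hosts, the scriptable option could be wrapped in a small helper. The following is only a sketch under some assumptions that are not part of my setup above: it assumes SSH access to each host's service console as root, and the host names are hypothetical placeholders. It only builds and prints the command lines; nothing is executed until you hand them to something like subprocess.run() yourself.

```python
# Sketch: build the per-host command line for switching the CPU power policy.
# Assumptions (not from the post): SSH access as root to each host, and the
# host names used below are hypothetical placeholders.

# The command itself is the one given in the post.
POLICY_COMMAND = ["esxcfg-advcfg", "-s", "dynamic", "/Power/CpuPolicy"]


def build_ssh_command(host, user="root"):
    """Return the argv list that would run the policy change on one host."""
    return ["ssh", f"{user}@{host}"] + POLICY_COMMAND


def build_all(hosts):
    """Return one argv list per host, in the order given."""
    return [build_ssh_command(h) for h in hosts]


if __name__ == "__main__":
    for argv in build_all(["esx01.example.local", "esx02.example.local"]):
        # Printed rather than executed; pass argv to subprocess.run() to apply.
        print(" ".join(argv))
```

Printing first makes it easy to review exactly what would run on each host before committing to the change.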
The impact on my home lab was quite visible. After 12 hours, the blue area in the following 24 hour graph shows that average electrical consumption dropped from 337 Watts to 292 Watts. All things being equal and CPU loads balanced by DRS, that’s a reduction in energy consumption of over 13% per host:
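For anyone who wants to check the arithmetic, here is a minimal sketch using the two averages from the graph. The 30-day projection is my own assumption (it presumes the 45 Watt saving holds around the clock), not something I measured.

```python
# Sketch: percentage reduction from the two average readings in the post.

def percent_reduction(before, after):
    """Percentage drop going from `before` to `after`."""
    return (before - after) / before * 100.0


watts_before = 337.0  # average draw with the static policy
watts_after = 292.0   # average draw with the dynamic policy

saved_watts = watts_before - watts_after
reduction = percent_reduction(watts_before, watts_after)

# Assumption: the saving is constant for a 30-day month (720 hours).
saved_kwh_per_month = saved_watts / 1000.0 * 24 * 30

print(f"Saved {saved_watts:.0f} W per host ({reduction:.1f}% reduction)")
print(f"Projected saving: {saved_kwh_per_month:.1f} kWh per host per 30 days")
```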
An alternate graph shows Btu output dropped from 1,135 Btu to about 1,000 Btu. All things being equal, a reduction of about 135 Btu per host:
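As a rough cross-check, electrical power can be converted to heat output with the standard factor of about 3.412 Btu per hour per Watt. The sketch below applies that factor to the two Watt averages from the earlier graph. The results land near, but not exactly on, the values read off the Btu graph, which may simply reflect how the two graphs were averaged.

```python
# Sketch: convert average power draw (Watts) to heat output (Btu per hour).
# The conversion factor is the standard 1 W = 3.412142 Btu/h.

BTU_PER_HOUR_PER_WATT = 3.412142


def watts_to_btu_per_hour(watts):
    """Heat output in Btu/h for a given electrical draw in Watts."""
    return watts * BTU_PER_HOUR_PER_WATT


for label, watts in (("static", 337.0), ("dynamic", 292.0)):
    print(f"{label}: {watts:.0f} W is about {watts_to_btu_per_hour(watts):.0f} Btu/h")
```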
A Btu is heat – explained more at wiseGEEK’s What is a Btu? Heat is a byproduct of technology in the datacenter and in most cases is viewed as overhead expense because it requires cooling (additional costs) to maintain optimal operating conditions for the equipment running in the environment. If we can eliminate heat, we eliminate the associated cost of removing the heat. This is known as cost avoidance.
Eliminating heat is of as much interest to me as reducing my energy bill. The excess heat generated in the basement eventually finds its way upstairs, making the rest of the house a little uncomfortable. The air conditioner in my home wasn’t manufactured to handle that much extra heat. Now, I live in the Midwest where we have some frigid winters. Heat in the home is welcomed during the winter months. I could turn off CPU Power Management, raising the Btu output as well as my energy bill, in favor of reducing my natural gas heating bill. I don’t know which is more expensive. This could be a great experiment for the January/February time frame.
In summary, we can attack operating costs from two sides by using VMware CPU Power Management:
- Reduction in excess electricity used by idle CPU cycles
- Reduction in cooling costs by reducing Btu output
I’m excited to see what next month’s energy bill looks like.
Update 11-17-09: I was just made aware that Simon Seagrave wrote an earlier article on CPU power management here. Sorry Simon, I was unaware of your article and I did not intentionally copy your topic. Your article covered the topic well. I hope we’re still friends 🙂
Great post!!
It would be nice to see a post about how DPM fits in with this.
Thank you, Tom
Also a processor compatibility list would be nice.
If it is such a big power consumption improvement with no downside, I’m wondering why it isn’t the default configuration, or for that matter the only configuration.
I am somewhat skeptical. If this is a win/win why is it not enabled by default? What’s the justification for hiding this ‘feature’ in the advanced host settings? Good post all around. It sounds like a great feature to further reduce costs.
DPM is complementary to the host power optimizations described in this post. To maximize power savings, use both together. Think of it this way: If the load on a host is low enough such that the CPU frequency can be reduced without impacting VM performance, host power management will do that. And if the aggregate load across the cluster is so low that an ENTIRE host can be shut down without impacting VM performance, DPM will do that. So in a 3-host cluster, for example, you could have a situation where one host is completely powered off, and the other two hosts have reduced CPU frequency. So the result of using both features is more power savings than using each feature on its own.
@Alan: this feature is not enabled by default because we couldn’t run every possible workload out there to really prove that there are no regressions. We did test it on many different workloads with positive results. The reason why this is “hidden” is also because there wasn’t enough time to get all the required vSphere Client changes made before the release.
We are looking into making it much easier to change host power management policies in the future, so stay tuned 🙂
@Tom: Any Intel Core or later family processor which supports Enhanced SpeedStep, or any AMD processor (Barcelona, Greyhound and beyond) with Enhanced PowerNow! should be supported.
Intel Pentium 4 or AMD Opteron (K9) are not supported.
Jason –
I just tried this with a cluster of BL490c G6 Blade Servers. I was offered an additional (default) choice in iLO called “Dynamic Power Savings Mode” which is referenced here > http://h18000.www1.hp.com/products/servers/management/ilo/power-regulator.html
Also, when I changed from the default Dynamic Mode to OS Controlled, it informed me I need a reboot for the settings to take effect. Maybe you don’t need this with the AMD-based servers?
I am going to try this mode for a while and see what happens. I am thinking it is better to let the hypervisor control this instead of the hardware guessing what to do…
Dave
@Dave: Depending on the processor type, you may not get power savings as large as Jason did on his system. Many new Intel processors have a feature called C1E, in which the processor automatically drops its frequency to the lowest setting when all processor cores go idle. The latest processors from AMD (Shanghai or later) have a similar feature, so your results may vary. I agree that letting the hypervisor control power management is a better option. For one, the hypervisor can also take advantage of processor C-states, which the firmware can’t really do.
Nice Post! I must have one of those newfangled processors that Andrei mentions above. My whitebox ESX4i host only draws 100 watts at idle no matter if I select dynamic or static in the VI Client. I have an Intel Quad Core Q9300 processor installed.
Dave, you are correct. Changing the CPU power policy via the iLO does indeed prompt for a reboot. I have updated the blog post, thanks to your feedback.
Jason –
No, thank YOU for the post! I jumped on it as soon as I saw the Tweet. Like I said, I am going to use a few tidbits from here for my discussion on GreenIT at the PAVMUG meeting on Thursday.
Dave
Great post Jason. Did you know that not all hypervisors are made equal when it comes to reducing power consumption? Servers running Windows Server 2008 R2 with Hyper-V may encounter a blue screen, aka BSOD, after enabling Hyper-V (KB974598). The installation of Hyper-V and the loading of C-states during initialization of Hyper-V cause a conflict, resulting in a blue screen. The system uses a C-state that is not supported by Hyper-V.
More at http://deinoscloud.wordpress.com/2009/10/22/disable-c-state-why-that/
Cheers,
Have you measured the performance before and after enabling this? I have a customer who claims that their performance got really bad after enabling this.
Anyone have any numbers as to which is better, HP Dynamic Power Saving Mode or ESX controlled?
@larstr: Would your customer be interested in providing VMware with more details about their particular case? I’m curious what type of workloads they were running that would be so sensitive to our DVFS algorithm. Please feel free to contact me by email at andrei at vmware dot com.
@larstr What is the server brand and model and which power regulator mode is used at your customer site? Thx.
really nice feature indeed and a little less invasive (if it in fact doesn’t reduce performance) on the cluster as a whole than DPM. but despite it being mentioned in the resource PDF, is it fully supported by vmware on HCL hardware?
I will visit this customer and consult them within a few weeks. Will upgrade to U1 etc, and do some performance tests.
I just tried this with a cluster of BL460c G1 Blade Servers. These servers were configured with “HP Dynamic Power Savings Mode”. Letting ESXi control the power management doesn’t change the energy consumption on the BL460c G1 blade.
@Dave Convery: Did you see a change in the energy consumption on the BL490c blade?
@Ted Steenvoorden –
Unfortunately, I did NOT do a baseline against the default. I am just a little more comfortable having the hypervisor making the power decisions.