vSphere 5.1 Update 1 Update Sequence

May 6th, 2013 by jason 1 comment »

Not so long ago, VMware product releases were staggered.  Major versions of vSphere would launch at or shortly after VMworld in the fall, and all other products such as SRM, View, vCloud Director, etc. would rev on some other random schedule.  This was extremely frustrating for a vEvangelist because we wanted to be on the latest and greatest platform but lack of compatibility with the remaining bolt-on products held us back.

While this was a wet blanket for eager lab rats, it was a major complexity for production environments.  VMware understood this issue and at or around the vSphere 5.0 launch (someone correct me if I’m wrong here), all the development teams in Palo Alto synchronized their watches and rev’d their products at essentially the same time.  This was great and it added much needed flexibility for production environment migrations.  However, in a way it obscured something that staggered releases had made obvious: a clear and understandable order of product upgrades.  That is why in March of 2012, I looked at all the product compatibility matrices and came up with my own “cheat sheet” of product compatibility which would lend itself to an easy to follow upgrade path, at least for the components I had in my lab environment.

vSphere 5.1 Update 1 launched on 4/25/13 and along with it a number of other products were rev’d for compatibility.  To guide us on the strategic planning and tactical deployment of the new software bundles, VMware issued KB Article 2037630, Update sequence for vSphere 5.1 Update 1 and its compatible VMware products.

Not only does VMware provide the update sequencing information, but there is also a complete set of links to specific product upgrade procedures and release notes which can be extremely useful for planning and troubleshooting.

The vCloud Suite continues to evolve providing agile and elastic infrastructure services for businesses around the globe in a way which makes IT easier and more practical for consumers but quite a bit more complex on the back end for those who must design, implement, and support it.  Visit the KB Article and give it 5 stars.  Let VMware know this is an extremely helpful type of collateral for those in the trenches.

QuickPrep and Sysprep

May 2nd, 2013 by jason 1 comment »

Those who manage VMware View currently or have used it in the past may be familiar with desktop customization which is required to provide a unique identity on the network for each View Composer VDI session in a pool.  If you’ve got a pretty good Microsoft background, you’re probably already familiar with Sysprep – Microsoft’s tool for customizing Windows server and desktop OS deployments.  VMware View Administrators have an alternative tool which can be used for desktop customization called QuickPrep.  For all intents and purposes, QuickPrep was designed to accomplish many of the same tasks Sysprep did, but the obvious advantage QuickPrep has is that the code and development belongs to VMware and as a result can be tightly integrated with products in VMware’s portfolio.

I was on a call this morning with VMware Senior Technical Trainer Linus Bourque (Twitter: @LinusBourque Blog: http://communities.vmware.com/blogs/lbourque Cigars: yes) when he pulled up a table slide which was the result of VMware KB Article 2003797 Differences between QuickPrep and Sysprep.  For those who are curious about the similarities and differences between the two (like me), look no further.

From the KB Article:

QuickPrep is a VMware system tool executed by View Composer during a linked-clone desktop deployment. QuickPrep personalizes each desktop created from the Master Image. Microsoft Sysprep is a tool to deploy the configured operating system installation from a base image. The desktop can then be customized based on an answer script. Sysprep can modify a larger number of configurable parameters than QuickPrep.
During the initial startup of each new desktop, QuickPrep:
  • Creates a new computer account in Active Directory for each desktop.
  • Gives the linked-clone desktop a new name.
  • Joins the desktop to the appropriate domain.
  • Optionally, mounts a new volume that contains the user profile information.
This table lists the main differences between QuickPrep and Sysprep:
Function                                                  | QuickPrep | Sysprep
Removing local accounts                                   | No        | Yes
Changing Security Identifiers (SID)                       | No        | Yes
Removing parent from domain                               | No        | Yes
Changing computer name                                    | Yes       | Yes
Joining the new instance to the domain                    | Yes       | Yes
Generating new SID                                        | No        | Yes
Language, regional settings, date, and time customization | No        | Yes
Number of reboots                                         | 0         | 1 (seal & mini-setup)
Requires configuration file and Sysprep                   | No        | Yes
Note: A Guest Customization script is required in vCenter Server to use Sysprep. Sysprep is bundled in with Windows 7. For Windows XP, an appropriate Sysprep program needs to be installed on the vCenter Server.
For information on installing Sysprep tools, see Sysprep file locations and versions (1005593).
For more information on the use of Sysprep and the Guest Customisation wizard, see the Customizing Guest Operating Systems and Installing the Microsoft Sysprep Tools sections of the vSphere Virtual Machine Administration Guide.
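
For anyone who wants to turn the KB table into a quick decision aid, here is a minimal Python sketch.  The Yes/No values are transcribed from the table above; the dictionary layout and the function name tools_supporting are my own assumptions for illustration and are not part of any VMware tooling.  The two non-capability rows (number of reboots, configuration file requirement) are left out on purpose.

```python
# Capability rows transcribed from the KB table (True means "Yes").
# The structure and names below are illustrative assumptions only.
CAPABILITIES = {
    "Removing local accounts": {"QuickPrep": False, "Sysprep": True},
    "Changing Security Identifiers (SID)": {"QuickPrep": False, "Sysprep": True},
    "Removing parent from domain": {"QuickPrep": False, "Sysprep": True},
    "Changing computer name": {"QuickPrep": True, "Sysprep": True},
    "Joining the new instance to the domain": {"QuickPrep": True, "Sysprep": True},
    "Generating new SID": {"QuickPrep": False, "Sysprep": True},
    "Language, regional settings, date, and time customization": {
        "QuickPrep": False,
        "Sysprep": True,
    },
}

TOOLS = ["QuickPrep", "Sysprep"]


def tools_supporting(required_functions):
    """Return the tools that support every function named in required_functions."""
    return [
        tool
        for tool in TOOLS
        if all(CAPABILITIES[function][tool] for function in required_functions)
    ]


if __name__ == "__main__":
    # Renaming and joining the domain is covered by both tools.
    print(tools_supporting(["Changing computer name",
                            "Joining the new instance to the domain"]))
    # Needing a new SID narrows the choice to Sysprep.
    print(tools_supporting(["Generating new SID"]))
```

If every function you need is a "Yes" in the QuickPrep column, QuickPrep and its zero reboots may be all you need; otherwise the table points you to Sysprep.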

vMA 5.1 Patch 1 Released

April 5th, 2013 by jason 1 comment »

Expendable news item here only worthy of a Friday post.  For those who may have missed it, VMware has released an update to the vSphere Management Assistant (vMA) 5.1 appliance formally referred to as Patch 1.  This release is documented in VMware KB 2044135 and the updated appliance bits can be downloaded here.  Log in, choose the VMware vSphere link, then the Drivers & Tools tab.

Patch 1 bundles with it the following enhancements:

  • The base operating system is updated to SUSE Linux Enterprise Server 11 SP2 (12-Jan-2013).
  • JRE is updated to JRE 1.6.0_41, which includes several critical fixes.
  • VMware Tools is updated to 8.3.17 (build 870839).
  • A resxtop connection failure issue has been fixed.
    In vMA 5.1, resxtop SSL verification checks have been enabled. This might cause resxtop to fail when connecting to hosts and display an exception message similar to the following:
    HTTPS_CA_FILE or HTTPS_CA_DIR not set.
    This issue is fixed through this patch.

Redefining Disk.MaxLUN

March 27th, 2013 by jason No comments »

Regardless of what the vSphere host Advanced Setting Disk.MaxLUN has stated as its definition for years, “Maximum number of LUNs per target scanned for” is technically not correct.  In fact, it’s quite misleading.

[Screenshot: Disk.MaxLUN advanced setting]

The true definition reads similarly in English but carries quite a different meaning.  It can be found in my SnagIt hack above or within VMware KB 1998, Definition of Disk.MaxLUN on ESX Server Systems and Clarification of 128 Limit.

The Disk.MaxLUN attribute specifies the maximum LUN number up to which the ESX Server system scans on each SCSI target as it is discovering LUNs. If you have a LUN 131 on a disk that you want to access, for example, then Disk.MaxLUN must be at least 132. Don’t make this value higher than you need to, though, because higher values can significantly slow VMkernel bootup.

The 128 LUN limit refers only to the total number of LUNs that the ESX Server system is able to discover. The system intentionally stops discovering LUNs after it finds 128 because of various service console and management interface limits. Depending on your setup, you can easily have a situation in which Disk.MaxLUN is high (255) but you see few LUNs, or a situation in which Disk.MaxLUN is low (16) but you reach the 128 LUN limit because you have many targets.

For more information about limiting the number of LUNs visible to the server, see http://kb.vmware.com/kb/1467.

Note the last sentence in the first paragraph above in the KB article.  Keep the value as small as possible for your environment when using block storage.  vSphere ships with this value configured for maximum compatibility out of the box which is the max value of 256.  Assuming you don’t assign LUN numbers up to 256 in your environment, this value can be immediately ratcheted down in your build documentation or automated deployment scripts.  Doing so will decrease the elapsed time spent rescanning the fabric for block devices/VMFS datastores.  This tweak may be of particular interest at DR sites when using Site Recovery Manager to carry out a Recovery Plan test, a Planned Migration, or an actual DR execution.  It will allow for a more efficient use of RTO (Recovery Time Objective) time especially where multiple recovery plans are run consecutively.
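
To make the two numbers in the KB excerpt concrete, here is a small Python sketch.  It is a toy model, not VMware code: the function names, the way targets are represented as lists of LUN numbers, and the scan order are all assumptions of mine.  The only facts taken from the KB are that a LUN numbered N needs Disk.MaxLUN of at least N + 1 and that discovery stops at 128 LUNs in total.

```python
HARD_MAX = 256        # default and maximum Disk.MaxLUN value mentioned in this post
DISCOVERY_CAP = 128   # total-LUN discovery limit described in the KB excerpt


def minimum_max_lun(lun_numbers):
    """Smallest Disk.MaxLUN that still scans the highest LUN number in use."""
    if not lun_numbers:
        raise ValueError("no LUN numbers supplied")
    highest = max(lun_numbers)
    if min(lun_numbers) < 0 or highest >= HARD_MAX:
        raise ValueError("LUN numbers must be between 0 and 255")
    return highest + 1


def discovered_luns(targets, max_lun, cap=DISCOVERY_CAP):
    """Toy scan: count LUNs numbered below max_lun on each target, stopping at cap.

    targets is a list of lists; each inner list holds the LUN numbers
    presented by one SCSI target (an assumed representation).
    """
    found = 0
    for lun_numbers in targets:
        for lun in sorted(lun_numbers):
            if lun < max_lun:
                found += 1
                if found == cap:
                    return found
    return found


if __name__ == "__main__":
    # The KB's own example: LUN 131 requires Disk.MaxLUN of at least 132.
    print(minimum_max_lun([0, 5, 131]))
    # High Disk.MaxLUN, one target, few LUNs.
    print(discovered_luns([[0, 1, 2]], max_lun=255))
    # Low Disk.MaxLUN, ten targets with sixteen LUNs each: the cap is reached.
    print(discovered_luns([list(range(16)) for _ in range(10)], max_lun=16))
```

The first function is the value I would put into build documentation or a deployment script; the second only illustrates why Disk.MaxLUN and the 128 limit are independent of each other.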

vExpert 2013 Applications Available

March 22nd, 2013 by jason 3 comments »

John Troyer (you know him as @jtroyer on Twitter or the guy with the disco ball jacket at social events) has made the announcement that vExpert 2013 applications are now available. Simply put, a vExpert is the formal recognition, by VMware, of being a virtualization rock star.  I haven’t read the latest charter but technically speaking I don’t think one even needs to specifically be a VMware virtualization rock star (hey, we’re all in this virtualization space together for the greater good right?) but it certainly helps.

There are three separate but interrelated tracks to being recognized as a vExpert:

  • Evangelist – You’re a blogger, regular speaker, VMTN contributor, etc. who shares the passion with the rest of the community.  You might be employed, but not by VMware or a partner. Nobody really knows.
  • Customer – You’re a customer internal facing proxy evangelist if that makes any sense whatsoever.  You get it.  You make sure your internal organization gets it.
  • VPN (VMware Partner Network) – You work for VMware or a partner and you’re either a rock star by choice or by force.  Either way, you know your stuff and you’re good at sharing with your customers.

The paths are separate but they all converge on fundamental traits within the virtualization community:  Passion. Enthusiasm. Leadership. Knowledge. Outreach.

If you’ve made contributions in any of the areas listed above, consider filling out an application for yourself.  Now is not the time to be modest or bashful.  It is the time to be showered with gifts of VMware licensing and the type of real world respect that is recognized in every corner of the globe.

My application is submitted and I’ve got my fingers crossed.  If I make vExpert 2013, I’ll be in the exclusive Five Timers club (vExpert 2009-2013 inclusive).  Why, I remember so long ago receiving the news of my first vExpert award… I was at VMworld Europe in Cannes…

Seriously, here are the important links you need from VMware:

Recommend that someone apply for vExpert 2013: http://bit.ly/vExpert2013recommend

Apply for vExpert 2013: http://bit.ly/vExpert2013application The deadline for applications is April 15, 2013 at midnight PDT.

The existing VMware vExpert 2012 directory is at http://communities.vmware.com/vexpert.jspa.

For questions about the application process or the vExpert Program, use the comments below or email vexpert@vmware.com.

VMware vSphere Design 2nd Edition Now Available

March 20th, 2013 by jason No comments »

Publication Date: March 25, 2013 | ISBN-10: 1118407911 | ISBN-13: 978-1118407912 | Edition: 2

The big splash was officially made yesterday but I’m following up with my announcement a day later to help spread the message to anyone who may have been heads down and missed it.  Forbes Guthrie, Scott Lowe, and Kendrick Coleman have teamed up to produce VMware vSphere Design 2nd Edition (a followup refresh of the popular 1st Edition).

As Technical Editor, I’m one of the few fortunate individuals who have already had the pleasure of reading the book.  I can tell you that it is jam-packed with the deep technical detail, design perspective, and methodology you’d expect from these seasoned and well-respected industry experts.

The book is 528 pages in length (compared to 384 pages in the 1st edition).  New in this version is coverage of vSphere 5.1, emerging infrastructure technologies and trends, as well as a section on vCloud Director design, a worthy topic which should be weighing heavily on the minds of many by now and which in the future will likely spawn dedicated coverage in texts by Sybex and/or other publishers.

The publisher has made the introduction section of the book freely available.  You can take a look at that by clicking this link which is hosted at Forbes’s vReference blog.  As with the previous edition, this book is available in both paperback and Kindle editions.  Support these authors and pick up your copy today.  Tell them Jason sent you and nothing special will likely take place.

Large Memory Pages and Shrinking Consolidation Ratios

March 19th, 2013 by jason 10 comments »

Here’s a discussion that has somewhat come full circle for me and could prove handy for those with lab or production environments alike.

A little over a week ago I was having lunch with a former colleague and naturally a TPS discussion broke out.  We talked about how it worked and how effective it was with small memory pages (4KB in size) as well as large memory pages (2MB in size).  The topic was brought up with a purpose in mind.

Many moons ago, VMware virtualized datacenters consisted mainly of Windows 2000 Server and Windows Server 2003 virtual machines which natively leverage small memory pages, an attribute built into the guest operating system itself.  Later, Windows Vista as well as 2008 and its successors came onto the scene allocating large memory pages by default (again, at the guest OS layer) to boost performance for certain workload types.  To maintain flexibility and feature support, VMware ESX and ESXi hosts have supported large pages by default provided the guest operating system requested them.  For those operating systems that still used the smaller memory pages, those were supported by the hypervisor as well.  This support and configuration remains the default today in vSphere 5.1 in an advanced host-wide setting called Mem.AllocGuestLargePage (1, the default, enables support for both large and small pages; 0 disables large pages and forces small pages).  VMware released a small whitepaper covering this subject several years ago titled Large Page Performance which summarizes lab test results and provides the steps required to toggle large pages in the hypervisor as well as within Windows Server 2003.

As legacy Windows platforms were slowly but surely replaced by their Windows Server 2008, R2, and now 2012 successors, something began to happen.  Consolidation ratios gated by memory (a very typical mainstream constraint in most environments I’ve managed and shared stories about) started to slip.  Part of this can be attributed to the larger memory footprints assigned to the newer operating systems.  That makes sense, but this only explains a portion of the story.  The balance of memory has evaporated as a result of modern guest operating systems using large 2MB memory pages which will not be consolidated by the TPS mechanism (until a severe memory pressure threshold is crossed but that’s another story discussed here and here).
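
To get a feel for why page size matters so much to page sharing, here is a toy Python model.  To be clear, this is not how ESXi implements TPS; it is a sketch under my own assumptions (an invented 16MB memory image that is mostly zero-filled with one unique 4KB page inside every 2MB region, and SHA-1 used as the duplicate detector).  It only shows that identical content is far easier to find at 4KB granularity than at 2MB granularity.

```python
import hashlib

SMALL_PAGE = 4 * 1024          # 4KB small page
LARGE_PAGE = 2 * 1024 * 1024   # 2MB large page


def shareable_fraction(memory, page_size):
    """Fraction of pages whose content duplicates an earlier page of the same size."""
    seen = set()
    duplicates = 0
    total = len(memory) // page_size
    for index in range(total):
        page = memory[index * page_size:(index + 1) * page_size]
        digest = hashlib.sha1(page).digest()
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
    return duplicates / total


def build_toy_memory(large_pages=8):
    """Zero-filled memory with one unique 4KB page at the start of each 2MB region."""
    memory = bytearray(large_pages * LARGE_PAGE)
    for region in range(large_pages):
        start = region * LARGE_PAGE
        memory[start:start + SMALL_PAGE] = bytes([region + 1]) * SMALL_PAGE
    return bytes(memory)


memory = build_toy_memory()
small_fraction = shareable_fraction(memory, SMALL_PAGE)
large_fraction = shareable_fraction(memory, LARGE_PAGE)

if __name__ == "__main__":
    print(f"duplicate 4KB pages: {small_fraction:.1%}")
    print(f"duplicate 2MB pages: {large_fraction:.1%}")
```

In this contrived image nearly every 4KB page is a duplicate, yet no two 2MB pages match because a single differing 4KB chunk is enough to make the whole large page unique.  Real guests are nowhere near this tidy, so treat the numbers as an illustration of the principle only.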

For some environments, many I imagine, this is becoming a problem which manifests itself as an infrastructure capacity growth requirement as guest operating systems are upgraded.  Those with chargeback models where the customer or business unit paid up front at the door for their VM or vApp shells are now getting pinched because compute infrastructure doesn’t spread as thin as it once did.  This will be most pronounced in the largest of environments.  A pod or block architecture that once supplied infrastructure for 500 or 1,000 VMs now fills up with significantly fewer.

So when I said this discussion has come full circle, I meant it.  A few years ago Duncan Epping wrote an article called KB Article 1020524 (TPS and Nehalem) and a portion of this blog post more or less took place in the comments section.  Buried in there was a comment I had made while being involved in the discussion (although I don’t remember it).  So I was a bit surprised when a Google search dug that up.  It wasn’t the first time that has happened and I’m sure it won’t be the last.

Back to reality.  After my lunchtime discussion with Jim, I decided to head to my lab which, from a guest OS perspective, was all Windows Server 2008 R2 or better, plus a bit of Linux for the appliances.  Knowing that the majority of my guests were consuming large memory pages, how much more TPS savings would result if I forced small memory pages on the host?  So I evacuated a vSphere host using maintenance mode, configured Mem.AllocGuestLargePage to a value of 0, then placed all the VMs back onto the host.  Shown below are the before and after results.

A decrease in physical memory utilization of nearly 20% per host – TPS is alive again:

[Screenshots: before and after]

124% increase in shared memory in Tier1 virtual machines:

[Screenshots: before and after]

90% increase in shared memory in Tier3 virtual machines:

[Screenshots: before and after]
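
If you want to produce the same kind of percentages from your own before and after counters, the arithmetic is a plain relative change.  The Python sketch below uses made-up megabyte values chosen only so the results line up with the 124% and 90% figures quoted above; they are not the numbers from my lab, and the function name is my own.

```python
def percent_change(before, after):
    """Relative change from before to after, expressed as a percentage."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) * 100 / before


# Hypothetical shared-memory readings in MB, not actual lab data.
tier1_change = percent_change(1000, 2240)
tier3_change = percent_change(2000, 3800)

if __name__ == "__main__":
    print(f"Tier1 shared memory change: {tier1_change:.0f}%")
    print(f"Tier3 shared memory change: {tier3_change:.0f}%")
```

A negative result means a decrease, which is how a drop in host physical memory utilization would show up.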

 

Perhaps what was most interesting was the manner in which TPS consolidated pages once small pages were enabled.  The impact was not realized right away nor was it a gradual gain in memory efficiency as vSphere scanned for duplicate pages.  Rather, it seemed to happen in a batch, almost all at once, 12 hours after large pages had been disabled and VMs had been moved back onto the host:

[Screenshot]

So for those of you who may be scratching your heads wondering what is happening to your consolidation ratios lately, perhaps this has something or everything to do with it.  Is there an action item to be carried out here? That depends on what your top priority is when weighing infrastructure performance on one hand against maximized consolidation on the other.

Those who are on a lean infrastructure budget (a home lab would be an ideal fit here) should consider forcing small pages to greatly enhance TPS opportunities and stretch the lab dollar, which has been getting consumed by modern operating systems and an increasing number of VMware and 3rd party appliances.

Can you safely disable large pages in production clusters? It’s a performance question that I can’t answer globally.  You may or may not see a performance hit to your virtual machines based on their workloads.  Remember that the use of small memory pages and AMD Rapid Virtualization Indexing (RVI) and Intel Extended Page Tables (EPT) is mutually exclusive.  Due diligence testing is required for each environment.  As it is a per host setting, testing with the use of vMotion really couldn’t be easier.  Simply disable large pages on one host in a cluster, migrate the virtual machines in question to that host, and let them simmer.  Compare performance metrics before and after.  Query your users for performance feedback (phrase the question in a way that implies you added horsepower instead of asking the opposite, “did the application seem slower?”).

That said, I’d be curious to hear if anyone in the community disables large pages in their environments as a regular habit or documented build procedure and what the impact has been if any on both the memory utilization as well as performance.

Last but not least, Duncan has another good blog post titled How many pages can be shared if Large Pages are broken up?  Take a look at that for some tips on using ESXTOP to monitor TPS activity.

Update 3/21/13:  I didn’t realize Gabrie had written about this topic back in January 2011.  Be sure to check out his post Large Pages, Transparent Page Sharing and how they influence the consolidation ratio.  Sorry Gabrie, hopefully you understand I wasn’t trying to steal your hard work and originality :)