Improving VM performance

The virtual model is all about sharing resources. It is important to maximize the performance of every guest VM, because you want applications to run at their best without wasting resources that could otherwise go to another guest VM. This goes a long way towards getting higher consolidation ratios, and with them a higher return on investment. (And, as I'm sure your finance department would be happy to tell you, higher ROI is a Very Good Thing.)

Because of the enormous consolidation ratios modern RAM/CPU densities offer, the strategy is to eke out as much performance as possible from every VM. Since VM performance ultimately hinges on the resources available on the host, even very small gains add up when they are multiplied by the number of VMs running in an environment. Below is a quick look at some ideas to consider.

Size VMs appropriately

Rightsizing VMs is no small task, and getting it wrong is one of the largest causes of virtual headaches. Undersize a machine, and your applications will be starved of resources and run poorly. Oversize a machine, and your whole environment can be affected. Oversized VMs are much more common, and can cause performance problems such as co-stop as the environment gets closer to an oversubscribed state. It is best to make every effort to keep your VMs as small as the workload allows.

One thing that people often don't consider is that as you add vCPUs, the overhead required to schedule and manage them goes up sharply as well. Your environment is, in effect, being hit twice by the expanded CPU count: once for the vCPUs themselves, and again for the work of scheduling them. This problem has been lessened in newer versions of ESXi, but it is still not zero.
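As a rough way to see how far a host is oversubscribed, the vCPU-to-physical-core ratio can be computed from an inventory export. The sketch below is only an illustration: the VM names, the 16-core host, and the 4-vCPU review threshold are made-up assumptions, not VMware guidance.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vcpus: int


def vcpu_to_pcpu_ratio(vms, physical_cores):
    """Return total allocated vCPUs divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vm.vcpus for vm in vms) / physical_cores


def widest_vms(vms, max_vcpus):
    """Return VMs with more vCPUs than max_vcpus, widest first."""
    return sorted((vm for vm in vms if vm.vcpus > max_vcpus),
                  key=lambda vm: vm.vcpus, reverse=True)


# Hypothetical inventory for a host with 16 physical cores.
inventory = [VM("web01", 2), VM("db01", 8), VM("app01", 4), VM("batch01", 16)]
ratio = vcpu_to_pcpu_ratio(inventory, physical_cores=16)
print(f"vCPU:pCPU ratio = {ratio:.2f}")
for vm in widest_vms(inventory, max_vcpus=4):
    print(f"review sizing: {vm.name} ({vm.vcpus} vCPUs)")
```

The widest VMs are listed first because they are usually the best candidates for a sizing review.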

Remove unused/unneeded devices

A virtual machine is created with a number of devices that can impede performance and take up OS resources. Devices such as CD-ROM and floppy drives, and COM and LPT ports, should be removed unless they are needed. Templates used in an environment should have these devices removed, saving an administrator the effort of getting rid of them later.
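A template audit along these lines can be sketched as a simple filter over a VM's device labels. The labels and the `keep` list here are assumptions for illustration. A real inventory would come from whatever export or API you already use.

```python
# Device kinds the text suggests removing unless needed (labels are assumptions).
REMOVABLE_KINDS = ("cd/dvd", "floppy", "serial port", "parallel port")


def removable_devices(device_labels, keep=()):
    """Return labels that match a removable kind and are not in `keep`."""
    keep_lower = {label.lower() for label in keep}
    return [
        label for label in device_labels
        if label.lower() not in keep_lower
        and any(label.lower().startswith(kind) for kind in REMOVABLE_KINDS)
    ]


# Hypothetical device list for a template that still needs its CD/DVD drive.
template_devices = [
    "Hard disk 1", "Network adapter 1", "CD/DVD drive 1",
    "Floppy drive 1", "Serial port 1", "Parallel port 1",
]
candidates = removable_devices(template_devices, keep=["CD/DVD drive 1"])
print(candidates)
```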

OS Settings to maximize performance

There are many ways to optimize an OS to run in a virtual environment. Things like disabling X in Linux, or running Windows Server 2012 in Server Core mode, obviously help, but they depend on the environment and may not be practical as global best practices. However, things like disabling screen savers and power-saving controls, and enabling maximum performance mode, are going to help universally.

People running in a virtual desktop environment may be familiar with the VMware View Optimization Guide, which contains many VM performance best practices as well as Windows operating system tweaks. The entire document is worth reading even if you are not a VDI administrator.

Keep snapshots under control

As long as a snapshot is present, it has an effect on performance. This is invisible for the most part, but if snapshots get large, the VM will begin to show measurable I/O lag. This is because reads and writes to the VM have to interact with both a delta file and the underlying disk. The snapshot process also creates additional overhead as the ESXi host works to keep track of the snapshot. Start creating a lot of snapshots and forgetting about them, and you're going to have a problem. This is why VMware's best practice for snapshots has long been a maximum age of 72 hours.

While there are many ways to look at snapshot behavior with third-party tools, there is actually a pre-built alarm within vSphere that can be leveraged. It is based on snapshot size rather than age, but enabling it and setting it at a reasonable size (2-4 GB is a good starting point) will at least generate alerts that are visible in the client.
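If you already have snapshot data exported from your tooling, the age and size checks described above can be combined in a few lines. This is a minimal sketch, not the vSphere alarm itself: the `Snapshot` record, the sample data, and the 4 GB limit (the upper end of the starting range above) are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=72)   # age guideline quoted in the text
MAX_SIZE_GB = 4.0               # assumed limit: upper end of the 2-4 GB range


@dataclass
class Snapshot:
    vm: str
    name: str
    created: datetime
    size_gb: float


def flag_snapshots(snapshots, now, max_age=MAX_AGE, max_size_gb=MAX_SIZE_GB):
    """Return (snapshot, reasons) pairs for snapshots over either limit."""
    flagged = []
    for snap in snapshots:
        reasons = []
        if now - snap.created > max_age:
            reasons.append("age")
        if snap.size_gb > max_size_gb:
            reasons.append("size")
        if reasons:
            flagged.append((snap, reasons))
    return flagged


# Hypothetical snapshot export; a fixed "now" keeps the example repeatable.
now = datetime(2024, 1, 10, 12, 0)
snaps = [
    Snapshot("web01", "pre-patch", now - timedelta(hours=5), 0.5),
    Snapshot("db01", "pre-upgrade", now - timedelta(days=9), 12.0),
    Snapshot("app01", "test", now - timedelta(hours=80), 1.0),
]
for snap, reasons in flag_snapshots(snaps, now):
    print(f"{snap.vm}/{snap.name}: over limit on {', '.join(reasons)}")
```

Checking age as well as size catches the small but long-forgotten snapshots that a size-only alarm would miss.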

SSD support: Swap to local SSD

This one is more infrastructure-driven, as it requires an investment into physical disk, but it's worth mentioning. From vSphere 5 and up, ESXi will automatically detect a local solid-state drive (SSD) and allow for part of that disk to be set up as a local cache for VM swap files. In some cases, this VMware SSD configuration can greatly improve performance.

If a host can't get enough RAM to a VM, it is forced to swap to disk. With the Swap to Host Cache option enabled, this swap will hit local SSD first, before going to shared disk. Because low-latency local SSD is faster, the performance hit should be much smaller if a VM ever does need to swap.
