Analyze Performance Metrics to Calculate Host Failure Requirements and Determine Optimum Cluster Size

by admin

My previous article covering the VCAP-DCA objectives looked at analysing a vSphere environment to determine which admission control policy to use on an HA cluster. To make that decision, however, you need to know what your host failure requirements are: how many hosts you need to accommodate the virtual machine workloads, and how much resource you need to keep spare in case of failover. Working this out means analysing the performance data of the virtual machines in your cluster. It is important to review your host failure requirements whenever the cluster changes (e.g. hosts are added or removed).

There are a few ways to analyse your virtual machines’ performance. You can use the performance data in the vSphere client to review memory and CPU utilisation (amongst other things).


You can also use ESXTOP in batch mode to capture performance data over time. It’s important to take a reasonably long view of this to account for infrequent resource usage spikes (such as those seen with month-end reporting). It’s also useful to know the function of the virtual machines in the cluster, and how they are used.
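To illustrate why the long view matters, here is a minimal Python sketch that summarises a series of utilisation samples for one virtual machine. The function name `summarise` and the sample values are hypothetical, and the sketch assumes the data has already been exported from your capture into a plain list of numbers.

```python
def summarise(samples):
    """Return the average, 95th percentile (nearest-rank) and peak of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Nearest-rank percentile: ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * n + 99) // 100
    return {
        "average": sum(ordered) / n,
        "p95": ordered[rank - 1],
        "peak": ordered[-1],
    }


# Hypothetical CPU demand in MHz: mostly quiet, with one month-end spike.
cpu_mhz = [400] * 18 + [450, 2600]
stats = summarise(cpu_mhz)
print(stats)
```

In this made-up series the average and 95th percentile both sit far below the peak, so a sizing exercise based only on averages, or on a capture window that misses the spike, would underestimate what the virtual machine needs.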

It’s important that the virtual machines in the cluster are sized correctly. Virtual machine reservations are an important concept to be aware of here. If reservations are used, it’s important that they are set to appropriate values. For example, if a virtual machine is over-allocated memory and a reservation is in place, then resource is wasted twice: firstly, that memory will be used on the host where the virtual machine is running and, secondly, it will also be reserved in the cluster in case of failover. Be aware that if you are using the slot-based admission control policy (number of host failures the cluster can tolerate), then large reservations will result in much lower virtual machine consolidation ratios, unless a custom slot size is applied.
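The effect of a large reservation on slot-based admission control can be sketched in a few lines of Python. This is a simplified model, not VMware's implementation: it assumes the CPU slot is the largest CPU reservation (falling back to a default when no reservation is set), the memory slot is the largest memory reservation plus memory overhead, and a host holds as many slots as its scarcer resource allows. The default of 32 MHz, the host capacity and the virtual machine figures are all assumptions for illustration; check the defaults for your vSphere version.

```python
def slot_size(vms, default_cpu_mhz=32):
    """Return (cpu_slot_mhz, mem_slot_mb) for a list of VM dicts (simplified model)."""
    largest_cpu = max(vm["cpu_res_mhz"] for vm in vms)
    # Assumed fallback: use the default CPU slot when no VM has a reservation.
    cpu_slot = largest_cpu if largest_cpu > 0 else default_cpu_mhz
    mem_slot = max(vm["mem_res_mb"] + vm["mem_overhead_mb"] for vm in vms)
    return cpu_slot, mem_slot


def slots_per_host(host_cpu_mhz, host_mem_mb, vms):
    """Number of whole slots one host can hold, limited by the scarcer resource."""
    cpu_slot, mem_slot = slot_size(vms)
    return min(host_cpu_mhz // cpu_slot, host_mem_mb // mem_slot)


# Hypothetical workloads: ten VMs with no reservations...
small_vms = [{"cpu_res_mhz": 0, "mem_res_mb": 0, "mem_overhead_mb": 100}] * 10
# ...and one VM with an 8 GB memory reservation.
big_vm = {"cpu_res_mhz": 0, "mem_res_mb": 8192, "mem_overhead_mb": 200}

without_big = slots_per_host(20000, 65536, small_vms)
with_big = slots_per_host(20000, 65536, small_vms + [big_vm])
print(without_big, with_big)
```

Under these assumptions a single 8 GB reservation sets the memory slot for every virtual machine in the cluster, and the number of slots per host drops from hundreds to single figures. This is the consolidation penalty that a custom slot size is meant to avoid.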

With a good understanding of how much resource your virtual machines need, and how much they will consume, you can then determine the optimum cluster size to accommodate those workloads, whilst providing enough spare resource to handle the failure of one or more hosts.
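As a rough worked example of that last step, the sketch below estimates a host count from aggregate demand. It assumes identical hosts, takes the peak CPU and memory demand figures as given, and simply adds the number of host failures to tolerate on top of the hosts needed to carry the workload. The function name `hosts_required` and all of the figures are hypothetical.

```python
import math


def hosts_required(total_cpu_mhz, total_mem_mb, host_cpu_mhz, host_mem_mb,
                   host_failures=1):
    """Hosts needed to carry the workload, plus spare hosts for failover."""
    for_cpu = math.ceil(total_cpu_mhz / host_cpu_mhz)
    for_mem = math.ceil(total_mem_mb / host_mem_mb)
    # The scarcer resource dictates how many hosts the workload itself needs.
    return max(for_cpu, for_mem) + host_failures


# Hypothetical cluster: 70 GHz and 300,000 MB of peak demand,
# on hosts with 20 GHz of CPU and 64 GB of memory each.
n_hosts = hosts_required(70000, 300000, 20000, 65536, host_failures=1)
spare_fraction = 1 / n_hosts
print(n_hosts, round(spare_fraction * 100, 1))
```

Here memory is the constraint: five hosts carry the workload and a sixth covers a single host failure, which corresponds to keeping roughly one sixth of cluster resources spare. A real design would also leave headroom for growth and for hypervisor overhead, which this sketch ignores.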
