This article draws heavily on the "... HA deepdive ..." article on yellowbricks, preparation for VCDX certification, and years of experience implementing virtual environments (since 2004). It is a work in progress...
VMware clusters rely heavily on Red Hat clustering/disk sharing and the Legato clustering technology.
HA for VMware purposes is very similar to an MSCS (Microsoft) cluster. HA clusters have at least 2 nodes, and are primarily for fail-over of a VM after its host fails. It should be noted that the VM will momentarily shut down and restart when an HA fail-over occurs.
It should also be noted that, depending on the host isolation settings (or not... :) ), a restart of the host console/management services (the mgmt-vmware service) could trigger a restart/shutdown of the VMs configured in the cluster and/or in the VM startup options.
Basically, the goal of HA is to determine whether an ESX host has failed or is failing, and to restart its virtual machines on another node in the cluster. A host is determined to be isolated by verifying the heartbeat between itself and the other primary nodes of the cluster. So... if there are only 2 nodes in the cluster, almost any connectivity problem will cause the nodes to think they are isolated. To keep a cluster stable, use at least 3 nodes, and ensure that gateway access will not produce false positives of cluster failures.
Keep in mind that the nodes use heartbeats from both the ESX kernels and the console to determine whether the hosts are isolated.
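The isolation reasoning above can be sketched in a few lines of Python. This is only a simplified model, not VMware's implementation: the names `HostView`, `peers_heard_after_link_loss` and `is_isolated` are hypothetical, and the rule "no peer heartbeats and no gateway response means isolated" is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HostView:
    """What one host currently observes (hypothetical model, not a VMware API)."""
    peers_heard: int         # other nodes whose heartbeat still arrives
    gateway_reachable: bool  # whether the gateway still answers


def peers_heard_after_link_loss(cluster_size: int, lost_links: int) -> int:
    """Peers still heard by one host after losing contact with `lost_links` peers."""
    return max(0, (cluster_size - 1) - lost_links)


def is_isolated(view: HostView) -> bool:
    # Assumed rule: a host treats itself as isolated only when it hears
    # no peer at all AND cannot reach the gateway either.
    return view.peers_heard == 0 and not view.gateway_reachable
```

With 2 nodes, losing the single peer link leaves `peers_heard` at 0, so the gateway check alone decides the outcome. With 3 nodes, the same single failure still leaves one peer heard, which is why the larger cluster is more stable in this model.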
Also, note that since the HA function is a cold fail-over of a VM, differences in processor types are not critical for cluster configuration (but are critical for DRS, VMware's load balancing). EVC can help for CPUs that are close (all Intel 55xx or 54xx), and will work even across processor architectures (Intel & AMD) provided none of the guest OSes are x64. The number of processors and the amount of memory are critical in all cases, as they are required to allow a VM to start.
That being said, remember that the VMware ESX cluster also has a DRS function, which is much more restrictive on processor class because DRS can move VMs using VMotion to load-balance a cluster.
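The resource requirement mentioned above (enough processors and memory for a VM to start) can be illustrated with a small Python sketch. It is a deliberately naive model: `Host`, `VM`, `can_start` and `pick_failover_host` are hypothetical names, and real clusters also account for reservations, overcommit and admission control, which are ignored here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    cpus: int            # logical processors available on the host
    free_memory_mb: int  # memory not yet claimed by running VMs


@dataclass
class VM:
    name: str
    vcpus: int
    memory_mb: int


def can_start(vm: VM, host: Host) -> bool:
    # Assumption: the VM needs at least as many host CPUs as it has vCPUs
    # and its full configured memory must fit in the host's free memory.
    return vm.vcpus <= host.cpus and vm.memory_mb <= host.free_memory_mb


def pick_failover_host(vm: VM, survivors: List[Host]) -> Optional[Host]:
    """Return the first surviving host able to start the VM, or None."""
    for host in survivors:
        if can_start(vm, host):
            return host
    return None
```

Note that processor type does not appear anywhere in the check, which matches the point that a cold restart cares about capacity rather than CPU class.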
Under the hood, hypervisors use the VM swap file (not to be confused with the guest OS swap file) to store the crash state of the VM. If an HA situation occurs and the VM does not actually shut down/fail because the isolated host is still running, fail-over will not occur, and in some cases the VM could still be functioning even though the host is not manageable. If the VM is not functional, physical intervention on the isolated host will be required to free the VM so that another node in the cluster can host it!
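The three outcomes described above can be written down as a small decision function. This is only a restatement of the paragraph in Python, with a hypothetical function name and parameters, not a description of how the HA agent is actually coded.

```python
def failover_outcome(vm_held_by_isolated_host: bool, vm_functional: bool) -> str:
    """Summarise what happens to a VM whose host has become isolated."""
    if not vm_held_by_isolated_host:
        # The VM really went down, so another node is free to restart it.
        return "restart on another node"
    if vm_functional:
        # The isolated host is still running the VM; nothing fails over.
        return "no fail-over, VM keeps running on the isolated host"
    # The VM is stuck on a host that cannot be managed remotely.
    return "no fail-over, physical intervention on the isolated host required"
```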
more to come.....