However, recent versions of HA add support for datastore and virtual machine failures. The expanded feature set adds some complexity to VMware HA configuration. Host failure response is the most well-known HA feature. When a host fails, what do you want vSphere to do? Usually, you want the virtual machines on the failed host to be powered back on by other functioning hosts within the same cluster. Sometimes a host can become isolated from the others in the cluster.
VMware calls this a host isolation event. If a host becomes isolated, the cause could be as simple as an issue with the management network. We must decide whether we want the virtual machines to be gracefully shut down and restarted on other hosts in the cluster, or forcefully powered off and restarted.
The main difference is whether the virtual machines are shut down gracefully. If they hang while shutting down, they might not be brought back online on other hosts. However, forcefully powering off a virtual machine could corrupt its data. Instead of defining an action, you can keep the default option and do nothing when a host isolation event occurs. In that case, you can manually restart virtual machines should this become an issue in your environment.

In the context of datastores, a permanent device loss (PDL) means that the host can no longer communicate with the datastore and considers it unrecoverable.
A PDL state occurs after multiple datastore connection attempts fail. When an all paths down (APD) event occurs, every path to a given datastore is offline. While the paths could come back online, vSphere HA lets us choose what to do under this condition. You can do nothing, which leaves your virtual machines in a failed state. You can also choose to power off and restart the affected virtual machines on other hosts, provided those other hosts still have access to the datastore.
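The choices described above for host isolation, PDL, and APD events can be modeled as a small policy lookup. The following Python sketch is only an illustration of the decision logic, not vSphere code: the `Event`, `Response`, and `plan_action` names are hypothetical, and real clusters are configured through the vSphere Client or its APIs.

```python
from enum import Enum


class Event(Enum):
    HOST_ISOLATION = "host isolation"
    PDL = "permanent device loss"
    APD = "all paths down"


class Response(Enum):
    DO_NOTHING = "do nothing"
    SHUTDOWN_AND_RESTART = "graceful shutdown, then restart elsewhere"
    POWER_OFF_AND_RESTART = "forced power off, then restart elsewhere"


def plan_action(event, policy, other_hosts_see_datastore=True):
    """Return a human-readable action for one event under a configured policy.

    `policy` maps each Event to the Response chosen for the cluster.
    Events without an entry fall back to DO_NOTHING (an assumption of this sketch).
    """
    response = policy.get(event, Response.DO_NOTHING)
    if response is Response.DO_NOTHING:
        return "leave VMs as they are; manual intervention may be needed"
    # Restarting elsewhere only helps if another host can still reach the VM files.
    if event in (Event.PDL, Event.APD) and not other_hosts_see_datastore:
        return "cannot restart: no other host has access to the datastore"
    return response.value


policy = {
    Event.HOST_ISOLATION: Response.SHUTDOWN_AND_RESTART,
    Event.APD: Response.POWER_OFF_AND_RESTART,
}
print(plan_action(Event.HOST_ISOLATION, policy))
print(plan_action(Event.APD, policy, other_hosts_see_datastore=False))
```

Writing the policy out this way makes the trade-off explicit: every event needs a deliberate answer, and "do nothing" is itself a choice that leaves recovery to the administrator.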
Aside from hosts and datastore failure responses, vSphere HA also has an interesting way to monitor and respond to virtual machine failure.
If vSphere does not receive a heartbeat from a virtual machine within the configured time, it determines that the virtual machine has failed and resets it. As always, weigh the risk against the reward and configure accordingly for the virtual machines in the cluster. To help prevent false positives, you can use the sensitivity options and the option that caps the number of resets within a time window.
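To make the interplay between the heartbeat timeout and the reset cap concrete, here is a toy Python model. It is a sketch under stated assumptions, not how vSphere implements VM monitoring: the class name, the parameter names (`failure_interval`, `max_resets`, `reset_window`), and the sliding-window bookkeeping are all illustrative.

```python
from collections import deque


class VmMonitor:
    """Toy model of heartbeat-based VM monitoring (names are illustrative)."""

    def __init__(self, failure_interval, max_resets, reset_window):
        self.failure_interval = failure_interval  # seconds without a heartbeat before a reset
        self.max_resets = max_resets              # resets allowed inside reset_window
        self.reset_window = reset_window          # length of the sliding window, in seconds
        self.last_heartbeat = 0.0
        self.resets = deque()                     # timestamps of recent resets

    def heartbeat(self, now):
        """Record a heartbeat received at time `now`."""
        self.last_heartbeat = now

    def check(self, now):
        """Return 'ok', 'reset', or 'limit reached' for the current time."""
        if now - self.last_heartbeat < self.failure_interval:
            return "ok"
        # Forget resets that have aged out of the window.
        while self.resets and now - self.resets[0] >= self.reset_window:
            self.resets.popleft()
        if len(self.resets) >= self.max_resets:
            return "limit reached"
        self.resets.append(now)
        self.last_heartbeat = now  # give the VM a fresh interval after the reset
        return "reset"


monitor = VmMonitor(failure_interval=30, max_resets=2, reset_window=3600)
monitor.heartbeat(0)
for t in (10, 30, 60, 90):
    print(t, monitor.check(t))
```

A shorter `failure_interval` reacts faster but raises the chance of resetting a VM that was merely busy, while the reset cap stops a VM that fails repeatedly from being reset in an endless loop.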
All hosts in an HA-enabled cluster report their status to each other during an election process. This process occurs during initial HA configuration or when the last elected primary host fails. The primary host is responsible for updating vCenter Server with its own status and the status of the other hosts within the cluster. Should one of the non-primary hosts fail, the cluster can react and restart its VMs on other hosts.
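The election idea can be sketched in a few lines of Python. The selection rule used here (the live host that reaches the most datastores wins, with ties broken by name) is an assumption made for illustration, and the `Host` type and `elect_primary` function are hypothetical rather than part of any VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    datastores: int      # number of datastores the host can reach
    alive: bool = True


def elect_primary(hosts):
    """Pick a primary among live hosts; return None if no host is alive."""
    candidates = [h for h in hosts if h.alive]
    if not candidates:
        return None
    # Assumed rule: most datastores wins, ties broken by the highest name.
    return max(candidates, key=lambda h: (h.datastores, h.name))


cluster = [Host("esx01", 4), Host("esx02", 6), Host("esx03", 6)]
print(elect_primary(cluster).name)

# If the elected primary fails, running the same election over the
# survivors yields a new primary.
after_failure = [Host("esx01", 4), Host("esx02", 6), Host("esx03", 6, alive=False)]
print(elect_primary(after_failure).name)
```

The point of the sketch is that the election is deterministic: every surviving host that applies the same rule to the same status reports arrives at the same primary.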
If the primary host fails, the cluster elects a new one, and the failover process commences. The most common misunderstanding about vSphere HA is that it uses vMotion to move virtual machines from one host to another.
This assumption is incorrect. When a host fails, the VM is already in a powered-off, crashed state. HA simply powers the virtual machines back on using another healthy host within the cluster. However, it does need shared storage so that other hosts can access the virtual machine files and power the VM back on after a host failure.

(Figure: A diagram of VMware HA.)

Provided that you abide by the vSphere HA prerequisites, you should be ready to use HA, but what are the best practices?
Be sure to leverage Distributed Resource Scheduler (DRS) features within your clusters. By using DRS, you ensure that workloads are balanced across all hosts in your cluster. Imagine a scenario where a host running most of your VMs fails.

When the HA cluster is enabled, it takes a few minutes before automatic failover protection is available again, since the initial replication may still be in progress.
As a backup strategy, back up only the Active node, skipping the Passive and Witness nodes. If you need to restore the Active node, you must remove the HA cluster configuration, restore the node, and re-create the HA cluster. Flag the Force the failover option. You can view the failover progress from the browser by using the same DNS name or IP address. VMware reports an RTO of 5 minutes, which is perhaps a little too high but will surely be improved in future releases.
Configuring vCenter Server HA ensures a higher level of protection for your virtual environment with no extra licenses required. Testing failover is a good way to make sure that everything works as expected.
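When testing a failover, it can help to time how long the service is unreachable and compare that with the 5-minute RTO mentioned above. The Python sketch below is one possible way to do that. `measure_downtime` and its parameters are hypothetical, and `probe` stands for any check you supply yourself, such as an HTTPS request against the vCenter DNS name that returns True once the login page answers again.

```python
import time


def measure_downtime(probe, timeout=600.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Return seconds until `probe()` first returns True, or None on timeout.

    `probe` is a caller-supplied function returning True once the service
    answers again. `clock` and `sleep` are injectable so the function can
    be exercised without waiting in real time.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


# Simulated run: a fake clock stands in for real time, and the service
# "comes back" 120 seconds after the failover was triggered.
now = [0.0]
downtime = measure_downtime(
    probe=lambda: now[0] >= 120,
    interval=5,
    clock=lambda: now[0],
    sleep=lambda seconds: now.__setitem__(0, now[0] + seconds),
)
print(f"service unavailable for {downtime:.0f} s; within RTO: {downtime <= 5 * 60}")
```

In a real test you would start the timer at the moment you trigger the failover and pass a probe that actually contacts the appliance.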
HA cluster maintenance

The HA cluster is enabled by default and performs an automated failover when a failure occurs. When the HA cluster is disabled, the current Status is displayed in the table.
vCenter HA is available again and automatic failover protection is enabled.

Backup and Restore

As a backup strategy, back up only the Active node, skipping the Passive and Witness nodes.