Mastering Windows Server 2016 Hyper-V

network switch to reset. This is configured at the cluster level by using
(Get-Cluster).ResiliencyDefaultPeriod = <value>. Additionally, cluster resources can
have their own value set that overrides the cluster default value:
(Get-ClusterGroup "VM Name").ResiliencyPeriod = <value>. If you set an override
value for a cluster resource and wish to revert to using the cluster default, simply
set the cluster resource’s ResiliencyPeriod to -1.
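
As a rough sketch of the settings just described, the following assumes it is run on a cluster node with the FailoverClusters module available. The group name "VM1" and the values 120 and 60 are hypothetical, chosen only for illustration; the periods are assumed to be expressed in seconds, which is consistent with the 4-minute (240-second) default.

```powershell
# Load the failover clustering cmdlets (assumes the module is installed).
Import-Module FailoverClusters

# Show the current cluster-wide resiliency period (assumed to be in seconds).
(Get-Cluster).ResiliencyDefaultPeriod

# Set the cluster-wide default to an illustrative 120 seconds.
(Get-Cluster).ResiliencyDefaultPeriod = 120

# Override the period for a single VM; "VM1" is a hypothetical group name.
(Get-ClusterGroup "VM1").ResiliencyPeriod = 60

# Revert that VM to the cluster default by setting the override to -1.
(Get-ClusterGroup "VM1").ResiliencyPeriod = -1
```

Reading the property back after each assignment is a simple way to confirm the change took effect before relying on it.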

The compute resiliency may sound perfect, but there is a challenge related to the
storage of the VMs, as a VM cannot run without its storage. If the VM is using SMB
storage, which has been possible since Windows Server 2012, then that storage can
still be accessed even when the hosting node is in isolated mode and the VM can
continue to run. If, however, the VM is using cluster storage, such as a block-backed
Cluster Shared Volume, then that storage will not be available because the node can
no longer coordinate with the coordinator node and reliably interact with the storage.
In Windows Server 2012 R2, if a VM lost its storage, the OS inside the VM
would crash. In Windows Server 2016, however, there is a second kind of resiliency:
storage resiliency.


Transitory storage problems have long been a pain point for Hyper-V environments. A
small blip in storage connectivity will result in widespread crashes of the virtual
machines and significant downtime as VMs restart. Storage resiliency changes the
behavior when a node loses connectivity to the storage for a VM. In Windows Server
2012 R2, the VM simply crashes, as previously stated. In Windows Server 2016,
Hyper-V will detect the failure to read or write to the VHD/VHDX file, provided it is
stored on a Cluster Shared Volume, and freeze the VM. This results in the VM going
into a Paused-Critical state. This state protects the OS inside the VM from crashing, as
it stays frozen until connectivity to the storage is reestablished, at which time the VM
is thawed and resumes running from its exact state before it was frozen. This
minimizes the downtime to that of the actual storage interruption duration. By
default, a VM can stay in Paused-Critical state for up to 30 minutes; however, this can
be changed per VM.
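
To see which VMs on a host are currently frozen by storage resiliency, a query along the following lines may help. This is a minimal sketch that assumes the Hyper-V PowerShell module is installed and that the frozen state is reported by Get-VM as PausedCritical.

```powershell
# List VMs on the local host that are currently in the Paused-Critical state.
# Assumes the Hyper-V module reports this state as "PausedCritical".
Get-VM |
    Where-Object { $_.State -eq 'PausedCritical' } |
    Select-Object Name, State, Status
```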


To configure a VM to use or not use storage resiliency, use the following:


Set-VM -AutomaticCriticalErrorAction <None | Pause>


If enabled, to set the time a VM can stay in Paused-Critical state, use the following
(24 hours is the maximum possible value, and a value of 0 would power off the VM immediately):


Set-VM -AutomaticCriticalErrorActionTimeout <value in minutes>
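
Put together, the two settings might be applied as in the sketch below. The VM name "VM1" and the 60-minute timeout are hypothetical, and the timeout is assumed to be expressed in minutes, which matches the 30-minute default mentioned earlier.

```powershell
# Enable storage resiliency for a hypothetical VM named "VM1" and allow it to
# remain in Paused-Critical for up to 60 minutes (illustrative value).
Set-VM -Name "VM1" -AutomaticCriticalErrorAction Pause -AutomaticCriticalErrorActionTimeout 60

# Review the resulting settings for every VM on the host.
Get-VM | Select-Object Name, AutomaticCriticalErrorAction, AutomaticCriticalErrorActionTimeout

# To turn storage resiliency off for the VM instead, the action would be None:
# Set-VM -Name "VM1" -AutomaticCriticalErrorAction None
```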


When you put compute resiliency and storage resiliency together for VMs that use
cluster storage, the behavior will be as follows. If a node becomes Isolated because of
a break in communication with the rest of the cluster, the VMs will stay on that node
(for up to 4 minutes) but will go into a Paused-Critical state. Figure 7.19 shows a node,
savdalhv93, whose cluster service has been crashed, which results in its Isolated
status. Note that both the VMs on the node change to Unmonitored, and that VM3 is
