unavailable.
Intra-cluster communication traffic related to cluster database changes and state
changes is light but does vary slightly, depending on the type of workload. Our focus is
Hyper-V, which has light intra-cluster communications, but a SQL or Exchange cluster
tends to have a higher amount of traffic. Once again, though, the size of the pipe is not
as important as the latency. This is because in the event of a cluster state change, such
as a node being removed from the cluster, the state change is synchronous among all
nodes in the cluster. This means before the state change completes, it must have been
synchronously applied to every node in the cluster, potentially 64 nodes. A high-
latency network would slow state changes in the cluster and therefore affect how fast
services could be moved in the event of a failure.
The final type of communication over the cluster network is CSV I/O redirection.
There are two types of CSV communication, which I cover in detail later in this
chapter, but both use SMB for communication. Metadata updates, such as file extend
operations and file open/close operations, are lightweight and fairly infrequent, but
they are sensitive to latency because latency will slow I/O performance. In
asymmetric storage access, all I/O is performed over the network instead of just the
metadata. This asymmetric access, or redirected mode, is not the normal storage mode
for the cluster and typically happens in certain failure scenarios such as a node losing
direct access to the storage and requiring its storage access to be fulfilled by another
node. If asymmetric access is used, the bandwidth of the network is important to
handle the I/O.
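To see whether a volume is currently running in redirected mode, the state of each CSV can be queried per node. The following is a minimal sketch, assuming the FailoverClusters PowerShell module is installed and the command is run on a cluster node running Windows Server 2012 R2 or later:

```powershell
# Report, for every node, whether each Cluster Shared Volume is using
# direct I/O or has its I/O redirected over the network.
Get-ClusterSharedVolumeState |
    Select-Object Name, Node, StateInfo, FileSystemRedirectedIOReason, BlockRedirectedIOReason |
    Format-Table -AutoSize
```

A StateInfo value other than Direct indicates that I/O for that volume on that node is being sent over the network, which is the situation in which the bandwidth of the cluster network starts to matter.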
The takeaway from the preceding explanation is that typically the bandwidth is not
important; the latency is the critical factor, which is why traditionally the cluster had a
dedicated network. As described in Chapter 3, it is now possible to use a converged
network, but you should leverage quality of service (QoS) to ensure that the cluster
network gets the required bandwidth and, more important, priority for its traffic
because a high priority will ensure as low a latency level as possible. In Chapter 3, I
focused on the bandwidth aspect of QoS because for most workloads that is most
critical. However, you can also use QoS to prioritize certain types of traffic, which we
want to do for cluster traffic when using converged fabric. The code that follows is
PowerShell for Windows Server 2016 (and it also works on Windows Server 2012 R2)
that sets prioritization of traffic types. Note that the 802.1p priority values range from 0 to 7,
with 7 being the highest priority.
Existing policies can later be modified using the Set-NetQoSPolicy cmdlet. The following commands create the policies:
New-NetQoSPolicy "Cluster" -Cluster -Priority 6
New-NetQoSPolicy "Live Migration" -LiveMigration -Priority 4
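After running the commands, it can be useful to confirm what was created and, if needed, adjust it. The following is a minimal sketch, assuming the two policies named "Cluster" and "Live Migration" exist on the local computer. It also assumes that -Priority in the commands above is an abbreviated form of the -PriorityValue8021Action parameter, which is written out in full here. The value 5 is arbitrary and used only for illustration.

```powershell
# Display the two policies, including their match condition and priority value.
Get-NetQosPolicy -Name "Cluster"
Get-NetQosPolicy -Name "Live Migration"

# Change the 802.1p priority of an existing policy.
# (5 is an illustrative value, not a recommendation.)
Set-NetQosPolicy -Name "Live Migration" -PriorityValue8021Action 5
```

Keeping the cluster policy at a higher value than the Live Migration policy preserves the intent described above: cluster traffic is latency sensitive, so it should be prioritized over bulk transfers on a converged fabric.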
You can find details on New-NetQoSPolicy and the types of built-in filters available
here: