Transparent Failover protects against a node failure by enabling a share to
move between nodes in a manner that is completely transparent to the SMB clients,
preserving any existing locks and handles as well as the state of the SMB
connection.
The state of that SMB connection is maintained over three entities: the SMB client,
the SMB server, and the disk itself that holds the data. SMB Transparent Failover
ensures that there is enough context to re-create the state of the SMB connection on
an alternate node in the event of a node failure, which allows SMB activities to
continue without the risk of error.
It’s important to understand that even with SMB Transparent Failover, there can still
be a pause in I/O, because the LUN still has to be mounted on a new node in the
cluster. However, the Failover Clustering team has done a huge amount of work
around optimizing the dismount and mount of a LUN to ensure that it never takes
more than 25 seconds. That sounds like a long time, but it is the absolute
worst-case scenario, with large numbers of LUNs and tens of thousands of handles.
In most common scenarios, the pause is a couple of seconds, and enterprise
services such as Hyper-V and SQL Server can tolerate an I/O operation taking up to 25
seconds without error, even in that worst case.
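To make the idea of tolerating a pause concrete, here is a minimal Python sketch of an application that keeps retrying an I/O operation until a deadline passes. It is only an illustration: the function name retry_io, the half-second retry interval, and the use of OSError to represent a paused share are assumptions, and this is not how Hyper-V, SQL Server, or the SMB client implement their handling. The 25-second tolerance is the worst-case figure quoted above.

```python
import time

# Worst-case LUN dismount/mount time quoted in the text.
IO_TOLERANCE_SECONDS = 25.0


def retry_io(operation, tolerance=IO_TOLERANCE_SECONDS, interval=0.5,
             clock=time.monotonic, sleep=time.sleep):
    """Call operation() until it succeeds or the tolerance window expires.

    operation is any callable that raises OSError while storage is
    unavailable. clock and sleep are injectable so the logic can be
    exercised without real waiting.
    """
    deadline = clock() + tolerance
    while True:
        try:
            return operation()
        except OSError:
            # Give up only once the pause has outlasted the tolerance.
            if clock() >= deadline:
                raise
            sleep(interval)
```

A caller would wrap a read or write in a function and pass it to retry_io; a pause of a couple of seconds is then absorbed silently, and only a pause longer than the tolerance surfaces as an error.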
There is another possible cause of interruption to I/O: the time it takes the SMB
client to notice that the SMB server is not available. In a typical planned scenario,
such as a node rebooting because it’s being patched, the node notifies its clients, which
can then take action. If a node crashes, though, no notification is sent, and so
the client will sit and wait for the TCP time-out before it takes action to reestablish
connectivity, which is a waste of resources. Although an SMB client may have no idea
that the node it’s talking to in the cluster has crashed, the other nodes in the cluster
know within a second, thanks to the various IsAlive messages that are sent between
the nodes. This knowledge is leveraged by a witness service capability that was first
available in Windows Server 2012. The witness service essentially allows another node
in the cluster to act as a witness for the SMB client, and if the node the client is talking
to fails, the witness node notifies the SMB client straightaway, allowing the client to
connect to another node, which minimizes the interruption to service to a couple of
seconds. When an SMB client communicates with an SMB server that is part of a cluster,
the SMB server will notify the client that other servers are available in the cluster, and
the client will automatically ask one of the other servers in the cluster to act as the
witness service for the connection.
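The registration and notification flow just described can be sketched as a toy Python simulation. Everything here is hypothetical and exists only for illustration: the Cluster and Client classes, their method names, and the rule of picking the first available node are assumptions, and the real witness protocol is considerably more involved. The sketch shows a client choosing another node as its witness when it connects, and that witness telling the client to move when the serving node fails.

```python
class Cluster:
    """Toy model of a file server cluster; node names are plain strings."""

    def __init__(self, nodes):
        self.alive = list(nodes)
        self.registrations = {}  # witness node -> clients it watches over

    def other_nodes(self, node):
        return [n for n in self.alive if n != node]

    def register(self, witness, client):
        self.registrations.setdefault(witness, []).append(client)

    def fail(self, node):
        """Simulate a crash that the surviving nodes detect."""
        self.alive.remove(node)
        # Clients that used the failed node as their witness pick a new one.
        for client in self.registrations.pop(node, []):
            client.choose_witness(self)
        # Each surviving witness notifies the clients of the failed node.
        for clients in list(self.registrations.values()):
            for client in list(clients):
                if client.server == node:
                    clients.remove(client)
                    client.on_server_failed(self)


class Client:
    def __init__(self, name):
        self.name = name
        self.server = None
        self.witness = None
        self.log = []

    def connect(self, cluster, server):
        self.server = server
        self.log.append(f"connected to {server}")
        self.choose_witness(cluster)

    def choose_witness(self, cluster):
        # The server advertises the other cluster nodes; the client asks
        # one of them (here simply the first) to act as its witness.
        others = cluster.other_nodes(self.server)
        self.witness = others[0] if others else None
        if self.witness is not None:
            cluster.register(self.witness, self)
            self.log.append(f"witness is {self.witness}")

    def on_server_failed(self, cluster):
        # Called by the witness, so no TCP time-out has to expire first.
        self.log.append(f"witness {self.witness} reported {self.server} down")
        if cluster.alive:
            self.connect(cluster, cluster.alive[0])


cluster = Cluster(["node1", "node2", "node3"])
client = Client("hv01")
client.connect(cluster, "node1")   # node2 becomes the witness
cluster.fail("node1")              # node2 notifies the client
```

After the simulated crash, client.log records the whole sequence: the original connection, the witness registration, the notification from the witness, and the reconnection to a surviving node with a fresh witness.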
No manual action is required to take advantage of SMB Transparent Failover or the
witness service. When you create a new share on a Windows Server 2012 or later
failover cluster, SMB Transparent Failover is enabled automatically.
SMB SCALE-OUT
In the previous section, I explained that there would be a pause in activity because the
LUN had to be moved between nodes in the file server cluster, but this delay can be