Mastering Windows Server 2016 Hyper-V


The number of VMQs available to a NIC team depends on the teaming mode and the
load-balancing algorithm used for the team. If the
teaming mode is set to switch dependent, the Min Queues value is always used. If the
teaming mode is switch independent and the algorithm is set to Hyper-V Port or
Dynamic, then the Sum of Queues value is used; otherwise, Min Queues is used. Table
3.2 shows this in simple form.


Table 3.2: VMQ NIC Teaming Options


                    ADDRESS HASH   HYPER-V PORT    DYNAMIC
Switch Dependent    Min Queues     Min Queues      Min Queues
Switch Independent  Min Queues     Sum of Queues   Sum of Queues
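
The rules in Table 3.2 can be expressed as a small lookup. The following Python sketch is illustrative only: the enum and function names are hypothetical, and it assumes that Min Queues means the smallest queue count offered by any team member and that Sum of Queues means the total across all members.

```python
from enum import Enum


class TeamingMode(Enum):
    SWITCH_DEPENDENT = "Switch Dependent"
    SWITCH_INDEPENDENT = "Switch Independent"


class Algorithm(Enum):
    ADDRESS_HASH = "Address Hash"
    HYPER_V_PORT = "Hyper-V Port"
    DYNAMIC = "Dynamic"


def vmq_queue_mode(mode, algorithm):
    """Return the queue mode named in Table 3.2 for a team configuration."""
    if mode is TeamingMode.SWITCH_INDEPENDENT and algorithm in (
        Algorithm.HYPER_V_PORT,
        Algorithm.DYNAMIC,
    ):
        return "Sum of Queues"
    # Switch dependent teams, and switch independent teams using
    # Address Hash, fall back to Min Queues.
    return "Min Queues"


def team_vmq_count(mode, algorithm, queues_per_nic):
    """Estimate the VMQs available to a team.

    queues_per_nic lists the VMQ count of each team member (hypothetical input).
    Assumes Min Queues = minimum across members, Sum of Queues = total.
    """
    if vmq_queue_mode(mode, algorithm) == "Sum of Queues":
        return sum(queues_per_nic)
    return min(queues_per_nic)
```

For example, under these assumptions a switch independent team using Hyper-V Port with members offering 16 and 8 queues would expose 24 VMQs, while the same team using Address Hash would expose 8.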

RSS and vRSS


I previously talked about a 3 to 4 Gbps bandwidth limit, which was caused by the
amount of traffic that could be processed by a single processor core. Even with VMQ, a
virtual machine network adapter is still limited to traffic being processed by a single
core. Physical servers have a solution to the single-core bottleneck for inbound traffic:
Receive Side Scaling, or RSS. RSS must be supported by the physical network adapter,
and the technology enables incoming traffic on a single network adapter to be
processed by more than a single processor core. RSS achieves this by using the following
flow:


1. Incoming packets are run through a four-tuple hash algorithm that uses the source
   and destination IP and ports to create a hash value.
2. The hash is passed through an indirection table that places all traffic with the same
   hash on a specific RSS queue on the network adapter. Note that there are only a
   small number of RSS queues. Four is a common number, so a single RSS queue
   will contain packets from many hash values, which is the purpose of the
   indirection table.
3. Each RSS queue on the network adapter is processed by a different processor core
   on the host operating system, distributing the incoming load over multiple cores.
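
The three steps above can be sketched in Python. This is a simplified illustration, not how a network adapter implements RSS: CRC32 stands in for the adapter's actual hash function, and the table size (128 entries), the queue count (4), and the queue-to-core mapping are all assumed values chosen for the example.

```python
import socket
import struct
import zlib

NUM_QUEUES = 4      # assumed; the text notes four is a common number
TABLE_SIZE = 128    # assumed indirection table size

# Indirection table: many hash buckets map onto a few RSS queues.
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(TABLE_SIZE)]

# Hypothetical mapping of each RSS queue to a processor core.
QUEUE_TO_CORE = [0, 2, 4, 6]


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Step 1: hash the four-tuple (CRC32 is a stand-in for the real hash)."""
    key = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    return zlib.crc32(key)


def rss_queue(src_ip, src_port, dst_ip, dst_port):
    """Step 2: use the hash to index the indirection table and pick a queue."""
    bucket = flow_hash(src_ip, src_port, dst_ip, dst_port) % TABLE_SIZE
    return INDIRECTION_TABLE[bucket]


def rss_core(src_ip, src_port, dst_ip, dst_port):
    """Step 3: each queue is serviced by its own processor core."""
    return QUEUE_TO_CORE[rss_queue(src_ip, src_port, dst_ip, dst_port)]
```

Because the hash depends only on the four-tuple, every packet of a given connection, for example `rss_core("10.0.0.5", 50000, "10.0.0.9", 443)`, resolves to the same queue and core each time it is evaluated.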

Creating the hash value in order to control which RSS queue and therefore which
processor core is used is important. Problems occur if packets are processed out of
order, which could happen if packets were just randomly sent to any core. Creating the
hash value based on the source and destination IP addresses and ports ensures that
specific streams of communication are processed on the same processor core and
therefore are processed in order. A common question is, “What about hyperthreaded
processor cores?” RSS does not use hyperthreading and skips the “extra” logical
processor for each core. This can be seen if the processor array and indirection table
are examined for an RSS-capable network adapter, as shown in the following output.
Notice that only even-numbered cores are shown; 1, 3, and so on are skipped because
this system has hyperthreading enabled.
