Custom PC - UK (2020-01)

up that allowed Nvidia to head off AMD’s first
Navi10 products at the pass, plus the simpler
TU116 and TU117 implementations of Turing,
it’s time to dive a bit deeper.
We’ve thrown a lot of top-level product
configuration data at you, the GPC and SM
counts in particular, so now it’s time to take a
look at what a Turing GPC actually is and what
the SMs inside of a GPC look like, along with
the rest of the GPC microarchitecture that
makes up what Nvidia have named after our
geniusWorldWarII codebreaker.

We’llincludetheflagshipray-tracing
hardware,plusthemorestaidandexpected
bitsoftheirdesigntoo,comparingand
contrastingtoNvidia’spriordesignsand
AMD’scurrenthardwarewhereit makes
sense,togiveyoua viewofwhyNvidiaput
Turingtogetherthewayit did.

THETURINGGPC
TheGraphicsProcessingCluster(GPC)
isthetop-levelbuilding blockineachof
Nvidia’sarchitecturesinrecentmemory,
andimplementsalmostallofthecore
processing that the GPU undertakes. In fact,
in a six-GPC part such as the TU102, you
could disable five of the six clusters and the
remaining one would run your games just
fine! Let’s look at what a GPC contains.

RASTERISER
Each GPC contains a single rasteriser
that’s responsible for generating pixels
on which the SMs run pixel shading. It
would be unremarkable compared to
the rasteriser before it in Nvidia’s Pascal
architecture, if it weren’t for the fact the
addition of a new rasterisation feature:
variable rate shading, or VRS.
Traditionally, every pixel gets shaded at
the same rate, usually once unless MSAA is
enabled, but with VRS, the rasteriser is capable
of telling the SMs that for a given block of 16
x 16 pixels it generates, the SMs can shade
them at a reduced rate. That reduced shading
rate lets the GPU save on processing every
individual pixel, saving power, and increasing
performance and throughput.

There’s support for several different rates, as you’d expect from the variable part of the name: 1x1 (do nothing different), 2x2 and 4x4 are the obvious ones, but there are non-square footprints too, with 1x2, 2x1, 4x2 and 2x4 all supported as well. So for each 256 pixel block that the rasteriser would traditionally generate and send down to the SMs in a GPC, Turing can effectively say, ‘for each 4 x 4 pixel footprint, I’ll shade just 1 pixel and broadcast that resulttoall16’.

Thegamedeveloperhascontrolover theratesina coupleofways,andcrucially forimagequality,theshadingratepatterns canbenon-squareandthereforecarefully tailoredtothekindsofgeometrythatare likelytofallinsidea rasterisertile.Saythe developerknowsthatina particularscreen region,trianglesarelikelytobetalland skinny.Theycanapplyoneofthe‘taller’ shadingrates—2x4and1x2—totakeinto accounttheshapeofwhat’slikelytobeinthat rasterisedregion.

The throughput of the rasteriser in a Turing GPC is unchanged from Pascal and can process a single input triangle per clock, generating 16 output pixels per clock from it for an SM to work on. That scales per GPC, so a six-GPC part such as TU102 has an aggregate rasteriser throughput of six input triangles per clock.

TPC Each GPC contains a variable number of Texture Processing Clusters or TPCs. A TPC is a collectionoftwoSMsthatsharea texture unitandpolymorphengine.Thesharingof textureprocessingandgeometryprocessing (thepolymorphengine)acrossseveralblocks ofGPUshadersis a commonarrangement thesedaysthatallowsfora betterbalanceof requiredprocessingforthedifferentstepsof thegraphicspipeline. Nvidia’stextureunitinTuringis incredibly powerfulandcansendeightfullyfiltered texturesamples(texels)totheSMsper clock,evenwithbilinearfilteringandforwide HDRpixelswith 64 bitsofdataeach.It’salso capableofservicingrequestsoutoforder, whichis somethingNvidiaarchitectures havebeenofferingfora whilenow,while competingdesignsareonlyjustcatching

THE GRAPHICS PROCESSING CLUSTER GPC IS

THE TOPLEVEL BUILDING BLOCK, IMPLEMENTING

ALMOST ALL THE CORE PROCESSING

The full diagram of TU102 hints at the sheer scale and complexity of this 18.6 billion-transistor chip

Custom PC - UK (2020-01)

Get our desktop app

Company

Features

Documentation

Resources