Custom PC – October 2019


product based on RDNA for the foreseeable future. In fact, Navi 10 and
the Radeon RX 5700-series product line are the first consumer PCI-E 4
graphics products on the market. Support for it is also required on the
PC side if you want the full bandwidth, which today means Ryzen
3000-series processors plugged into AMD X570 mainboards.

Compared with PCI-E 3, the new standard offers double the bandwidth per
pin, which means it keeps the same number of pins and the same 16x slot
size. That also means compatibility – plug a Radeon RX 5700 XT into a
system that supports an older version of the standard and it will work
just fine, just slower, and only a tiny bit slower in practice in
today's games. Still, it's a solid bit of future proofing from AMD on
the I/O side, and befits the advanced technologies found in the rest of
RDNA-based designs.

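To put a rough number on that doubling, here is a back-of-the-envelope sketch in Python. The per-lane transfer rates (8GT/sec for PCI-E 3 and 16GT/sec for PCI-E 4) and the 128b/130b line encoding come from the PCI-E specifications rather than from this article, and the function names are purely illustrative.

```python
# Per-lane signalling rates in gigatransfers per second.
# Assumption: figures taken from the PCI-E 3 and PCI-E 4 specifications.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}

# Both generations use 128b/130b encoding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def slot_bandwidth_gb_s(generation, lanes=16):
    """Approximate one-way bandwidth of a slot in GB/sec."""
    raw_gbit_s = TRANSFER_RATE_GT_S[generation] * lanes
    return raw_gbit_s * ENCODING_EFFICIENCY / 8  # bits to bytes


def negotiated_generation(card_generation, host_generation):
    """A link runs at the newest generation both ends support."""
    return min(card_generation, host_generation)


if __name__ == "__main__":
    for gen in (3, 4):
        print(f"PCI-E {gen} x16: {slot_bandwidth_gb_s(gen):.2f}GB/sec")
    # A PCI-E 4 card in a PCI-E 3 board falls back to PCI-E 3 speeds.
    print("Card gen 4, host gen 3 ->", negotiated_generation(4, 3))
```

These are theoretical peaks for a 16-lane slot in one direction, roughly 15.75GB/sec against 31.51GB/sec. Real transfers lose a little more to protocol overhead.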
Now that we know how RDNA gets data from the host to the GDDR6 memory to
work on while rendering, let's look at what happens inside the
microarchitecture as it does so, starting with the first part of the
design that gets to work during graphics rendering: the front end.


RDNA FRONT END
In any GPU, the first task for the design is to work on processing input
geometry in the form of triangles. Everything you can see on the screen
during normal rasterised rendering is made up of them, and with game
vendors pushing the boundaries of visual fidelity every year, the count
and density of geometry sent to the GPU only increases. You need a front
end that can process them at a high rate, and get them sent down the
pipeline to the rasteriser for onward processing with as few bottlenecks
as possible.

AMD has been diligently working on improvements to that aspect of the
design in order to improve throughput through the front end, and RDNA
bears the first truly visible fruits of that labour. Through a
combination of hardware changes and compiler heavy lifting, depending on
what your game asks from the GPU, you can now see some significant
performance increases compared with prior AMD GPU architectures.

RDNA's front-end architecture tries to exploit the fact that, on
average, not every triangle will end up visible on the screen. For
example, some will end up outside the visible boundaries of the screen
and can therefore be culled, or some will end up back-facing, where the
shaded side of the triangle is away from the screen and so doesn't need
to be processed. In addition, some will end up without any visible area
to rasterise, even if they're on-screen or front-facing.

The hardware and software for RDNA work in tandem, processing triangles
together to exploit those properties, identifying the ones that don't
contribute to on-screen pixels and then culling them more efficiently.

The central Geometry Processor handles the guts of that work in
hardware, coordinating with the pair of Primitive Units that reside
inside each Shader Engine, which is the top-level building block of any
RDNA-based GPU. We'll come back to that at the end, when we tie up RDNA
to create the full picture of how it's put together.
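The three culling cases described above can be illustrated with a few lines of Python. This is only a conceptual sketch and not a description of how the RDNA hardware or driver actually does it. It assumes triangles have already been projected to 2D screen coordinates, that a positive signed area means front-facing (the winding convention is an assumption), and it treats only exactly degenerate triangles as having no visible area. All names are made up for illustration.

```python
def signed_area(tri):
    """Signed area of a 2D triangle; the sign encodes the winding order."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def cull_reason(tri, width, height):
    """Return why a triangle can be culled, or None if it must be drawn."""
    xs = [x for x, _ in tri]
    ys = [y for _, y in tri]
    # Case 1: the bounding box lies entirely outside the screen.
    if max(xs) < 0 or min(xs) >= width or max(ys) < 0 or min(ys) >= height:
        return "off-screen"
    area = signed_area(tri)
    # Case 2: no area to rasterise (simplified to exactly degenerate).
    if area == 0:
        return "zero-area"
    # Case 3: facing away from the viewer (assumed winding convention).
    if area < 0:
        return "back-facing"
    return None


def survivors(triangles, width, height):
    """Keep only the triangles that contribute to on-screen pixels."""
    return [t for t in triangles if cull_reason(t, width, height) is None]


if __name__ == "__main__":
    batch = [
        [(0, 0), (10, 0), (0, 10)],          # visible
        [(0, 0), (0, 10), (10, 0)],          # back-facing
        [(-30, -30), (-10, -30), (-30, -10)],  # off-screen
        [(0, 0), (5, 5), (10, 10)],          # zero-area
    ]
    for tri in batch:
        print(tri, "->", cull_reason(tri, 1920, 1080))
```

In this made-up batch only one triangle in four survives. The point is simply that each test is cheap compared with rasterising and shading a triangle that never reaches the screen.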

Each Primitive Unit can accept a pair of triangles from the Geometry
Processor per clock, and then output a single one for further processing
by the rest of the design. That 2:1 throughput is designed to match a
rough 50 per cent cull rate for incoming geometry, which, broadly
speaking anyway, is what games tend to send to the hardware to be drawn.
The aggregate onward triangle rate for Navi 10, which has two Shader
Engines and consequently four Primitive Units, is four triangles per
clock.
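The arithmetic behind those figures is simple enough to sketch in Python. The unit counts and per-clock rates below are the ones quoted above, while the function and key names are invented for illustration. The aggregate intake figure is just the per-unit rate multiplied out, so treat it as a theoretical peak.

```python
def front_end_rates(shader_engines, primitive_units_per_engine=2,
                    accepted_per_unit=2, emitted_per_unit=1):
    """Peak per-clock triangle rates for a given number of Shader Engines."""
    units = shader_engines * primitive_units_per_engine
    return {
        "primitive_units": units,
        "triangles_accepted_per_clock": units * accepted_per_unit,
        "triangles_emitted_per_clock": units * emitted_per_unit,
        # Fraction of incoming triangles that must be culled for the
        # output side to keep pace with the input side.
        "cull_rate_matched": 1 - emitted_per_unit / accepted_per_unit,
    }


if __name__ == "__main__":
    navi10 = front_end_rates(shader_engines=2)
    for name, value in navi10.items():
        print(f"{name}: {value}")
```

For Navi 10's two Shader Engines this gives four Primitive Units, eight triangles accepted and four emitted per clock, with a matched cull rate of 0.5.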

RASTERISATION
There's not that much to say when it comes to the rasterisers in RDNA
compared with Graphics Core Next (GCN) and Vega, and we may as well
refer to both GCN and Vega as GCN here, because in practice, Vega is
very similar. There's a pair of rasterisers per Shader Engine in RDNA,
and each one is capable of accepting a single triangle per clock, which
it then rasterises at a rate of 16 outgoing pixels per clock.

The type of rasterisation that RDNA uses, and GCN before it, is called
scan conversion. The implementation detail of how that works is
reasonably complex in hardware, but conceptually straightforward for us
to imagine. The hardware computes the interior area of the triangle
given its coordinates on the screen, and scans along that inside area –
in terms of the horizontal row of pixels the triangle covers – line by
line.
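As a way to picture scan conversion, here is a minimal software sketch in Python. It is a textbook scanline approach rather than AMD's hardware implementation. It assumes pixel centres sit at half-integer coordinates and that a pixel counts as covered when its centre falls inside the triangle, and the function names are hypothetical. The clock estimate at the end simply divides the pixel count by the 16 pixels per clock quoted above, which is an idealised best case.

```python
import math


def scan_convert(tri):
    """Return the (column, row) pixels covered by a triangle, row by row."""
    ys = [y for _, y in tri]
    first_row = math.ceil(min(ys) - 0.5)
    last_row = math.floor(max(ys) - 0.5)
    pixels = []
    for row in range(first_row, last_row + 1):
        sample_y = row + 0.5  # sample along the centre line of this row
        crossings = []
        for i in range(3):
            (xa, ya), (xb, yb) = tri[i], tri[(i + 1) % 3]
            if ya == yb:
                continue  # a horizontal edge never crosses a scanline
            # Half-open test so a shared vertex is only counted once.
            if min(ya, yb) <= sample_y < max(ya, yb):
                t = (sample_y - ya) / (yb - ya)
                crossings.append(xa + t * (xb - xa))
        if len(crossings) < 2:
            continue
        left, right = min(crossings), max(crossings)
        # Emit pixels whose centres lie in the span [left, right).
        for col in range(math.ceil(left - 0.5), math.ceil(right - 0.5)):
            pixels.append((col, row))
    return pixels


def ideal_clocks(pixel_count, pixels_per_clock=16):
    """Best-case clocks for one rasteriser to emit that many pixels."""
    return math.ceil(pixel_count / pixels_per_clock)


if __name__ == "__main__":
    covered = scan_convert([(0, 0), (16, 0), (0, 16)])
    print(len(covered), "pixels,", ideal_clocks(len(covered)), "clocks at best")
```

A small right-angled triangle with corners at (0, 0), (4, 0) and (0, 4) comes out as rows of three, two and one pixels, which is exactly the line-by-line sweep described above.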

NEW SHADER CORE
AMD doesn't change the shader core architecture of its GPUs very often.
In fact, the new shading microarchitecture in RDNA is only AMD's fourth
completely new programmable shader core design since 2001. Some of you
may remember the ATI Radeon 9700 Pro, the first GPU in ATI's history
with a programmable shader architecture. That card didn't hit the market
until the second half of 2002, almost

WHEREAS GCN WAS A 64-WIDE MACHINE, MEANING IT EXECUTED A
64-WIDE WAVE OF THREADS, RDNA IS A 32-WIDE MACHINE AT ITS HEART


FEATURE / ANALYSIS
