Maximum PC - USA (2022-02)


GOLDEN COVE PERFORMANCE CORES
We’ve seen hybrid processor designs in the
world of Arm chips for a while, and Intel has
dabbled with them for its Lakefield processors.
Those had four “efficiency” cores and one
“performance” core, using the Tremont
(used in recent Atom, Pentium Silver, and
Celeron CPUs) and Sunny Cove (i.e., Ice Lake)
architectures, respectively. Lakefield launched
in mid-2020 and Intel has already retired it, but
it was likely more of a proof of concept. With
the groundwork laid, Alder Lake is taking the
hybrid core approach with x86 to new levels.
The star of the show is the big and powerful
Golden Cove cores. Golden Cove is the logical
successor to the Intel CPU cores we’re familiar
with, meaning the previous-generation
Ice Lake and Rocket Lake cores, which followed
the various Skylake derivatives. Each Golden
Cove core supports Hyper-Threading, Intel’s
name for simultaneous multithreading (SMT),
where a single core can execute code from
two different instruction threads at the same
time, sharing some of its resources.
Golden Cove was designed as the
performance building block for all of
Intel’s upcoming CPUs, so it will see use in
everything from low-power laptop chips to
massive Xeon processors used in the data
center. Along with various improvements
to the pipelines and buffers, one interesting
tidbit is that Golden Cove supports AVX512...
but it’s disabled in the consumer desktop
parts. That’s because AVX512 is more for
certain data center workloads, and because
Intel wanted a unified instruction set for
the Golden Cove and Gracemont cores—and
Gracemont doesn’t support AVX512.
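
If you want to see which of these instruction sets a given chip actually exposes, Linux lists them as feature flags in /proc/cpuinfo. The snippet below is a minimal sketch of that check, not an official Intel tool: the helper names cpu_flags and supports are our own, the SAMPLE text is a made-up and heavily shortened stand-in for real output, and it assumes the kernel's usual flag names (avx2, avx512f, amx_tile).

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line as a set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (empty input, or a non-x86 system).
    return set()


def supports(cpuinfo_text, flag):
    """True if the named feature flag appears in the cpuinfo text."""
    return flag in cpu_flags(cpuinfo_text)


# Hypothetical, shortened sample; a real flags line is far longer.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 fma\n"

for flag in ("avx2", "avx512f", "amx_tile"):
    print(flag, supports(SAMPLE, flag))
```

On a Linux machine, pass the contents of /proc/cpuinfo instead of SAMPLE. Going by the description above, a consumer Alder Lake chip would be expected to report avx2 but not avx512f or amx_tile.
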
Intel followed the mantra of “Wider,
Deeper, Smarter” with the Golden Cove
architecture. Key improvements include a
6-wide instruction decoder (up from 4-wide in
previous architectures) coupled with a 6-wide
microarchitecture (up from 5-wide in Sunny
Cove). The micro-op cache has increased from
2.25K to 4K entries, and there are now 12
execution ports instead of ten in the previous
architecture. The re-order buffer is also larger,
with 512 entries (vs. 352 in Rocket Lake).
There’s also a fifth integer ALU (arithmetic
logic unit), and the L2 cache per core has
increased to 1.25MB (up from 512KB)—or 2MB
on server parts (up from 1.25MB).
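
To put those numbers in perspective, here is a quick back-of-the-envelope sketch that turns the old and new figures quoted above into percentage increases. The values come straight from the text; the CHANGES table and the percent_increase helper are our own illustrative names, and we assume "K" means 1,024 entries (the ratios come out the same either way).

```python
# (old, new) pairs taken from the figures quoted in the text.
CHANGES = {
    "decoder width": (4, 6),
    "micro-op cache entries": (2.25 * 1024, 4 * 1024),
    "execution ports": (10, 12),
    "re-order buffer entries": (352, 512),
    "L2 cache per core (KB)": (512, 1.25 * 1024),
}


def percent_increase(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


for name, (old, new) in CHANGES.items():
    gain = percent_increase(old, new)
    print(f"{name}: {old:g} -> {new:g} (+{gain:.0f}%)")
```

The gains work out to roughly 50 percent for the decoder, 78 percent for the micro-op cache, 20 percent for execution ports, 45 percent for the re-order buffer, and 150 percent for the consumer L2 cache.
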
Golden Cove includes other refinements
too, like a faster 1-cycle LEA (Load Effective
Address) instruction on all five ALU ports,
a new fast adder, and a third load
port, with deeper and larger load and store
buffers. The FP16 (16-bit floating-point) format
is now supported by AVX512, or it will be
on future Sapphire Rapids data center CPUs.
Intel has also added a new AMX (Advanced
Matrix Extensions) instruction set, which
can compute up to 2048 int8 operations
concurrently. It’s like the tensor cores we’ve
seen in Nvidia’s recent GPUs, except in Intel’s
CPU cores. Like AVX512, AMX support has
been disabled for consumer Alder Lake parts.

GRACEMONT EFFICIENCY CORES
The other half of the CPU cores comes via
the Gracemont architecture. This is the third
generation out-of-order Atom architecture—
the first two Atom architectures were in-
order designs. A Gracemont core is like
taking a Skylake core, stripping out some of
the cruft, using a smaller but faster cache,
and simplifying it so you have a small and
efficient processor core. According to Intel,

XE GRAPHICS RIDES AGAIN


Intel’s Alder Lake
uses the same Xe
Graphics (Gen12 GPU)
that we’ve already
seen in 11th Gen Tiger
Lake and Rocket Lake
processors. In other
words, this is still Xe LP
DG1, not an updated
DG2 with additional
features like ray tracing
hardware and matrix
cores. It’s been shrunk
to the Intel 7 node
from Rocket Lake’s 14nm, with
clock speeds reaching
up to 1.55GHz. That’s
250MHz higher than the
14nm Xe Graphics used
in Rocket Lake, but it’s
not going to compete
with any competent
graphics card.

Intel provides up
to 32 EUs (execution
units), with a theoretical
compute rate of just
794 GFLOPS—yes, it’s
so slow that we’re back
to GFLOPS instead of
TFLOPS. The relatively
anemic GTX 1050
should be about three
times as fast, while a
more recent “budget”
GPU (shortages
notwithstanding) like
the GTX 1650 Super
potentially delivers six
times the performance
of the UHD 770.
As we’ve said before,
Intel’s goal isn’t to
displace dedicated
GPUs—a market it will
be entering shortly
with its Arc graphics
portfolio. Instead,
integrated graphics
on the desktop and
high-end laptop parts
is about providing a
decent set of base
functionality that won’t
use a lot of power. The
UHD 770 includes HEVC,
VP9, and SCC encoders
that support up to
4K60 HDR content, and
hardware-accelerated
AV1 decode support for
4K60 as well.
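
The 794 GFLOPS figure quoted above can be reproduced with simple arithmetic. The sketch below assumes each Xe LP execution unit has eight FP32 lanes and that each lane retires one fused multiply-add (two floating-point operations) per clock; neither number is stated in this article, so treat them as our assumptions. The function name peak_fp32_gflops is likewise just for illustration.

```python
def peak_fp32_gflops(eus, clock_ghz, lanes_per_eu=8, flops_per_lane=2):
    """Theoretical peak FP32 rate in GFLOPS.

    lanes_per_eu and flops_per_lane are assumptions (eight FP32 lanes
    per EU, one fused multiply-add per lane per clock), not figures
    taken from the article.
    """
    return eus * lanes_per_eu * flops_per_lane * clock_ghz


# 32 EUs at 1.55GHz, the UHD 770 configuration described in the text.
uhd770 = peak_fp32_gflops(32, 1.55)
print(f"UHD 770 peak: {uhd770:.1f} GFLOPS")
```

That gives 793.6 GFLOPS, which rounds to the 794 quoted. It is a theoretical ceiling, so real-world gaming performance will depend on memory bandwidth and drivers as well.
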

Nope, Intel isn’t
planning on stuffing
its latest Arc GPUs into
Alder Lake, though they
might show up in
13th Gen Raptor Lake.



