MaximumPC 2004 03


Prescott Arrives


30 MAXIMUMPC MARCH 2004


Prescott retains the trace cache
of the original Pentium 4, which
holds roughly 12,000 micro-ops
(often loosely quoted as 12KB),
but doubles the L1 data cache
from 8KB to 16KB. Unlike older
CPU designs, such as the
Pentium III, the P4 series stores
simple instructions that have
already been decoded from
x86 instructions into what
Intel calls micro-ops. Because
the instructions are stored in
a decoded state and ready
to be executed, the CPU can
skip the decode step when it
runs that code again, which
improves performance.
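
To make the idea concrete, here is a minimal sketch of a decoded-instruction cache. It is a toy model, not Intel's design: the decode table, the micro-op names, and the `ToyTraceCache` class are all hypothetical, and a real trace cache stores whole traces of micro-ops in program order rather than one entry per instruction. The point is only that once a loop has been decoded, later passes skip the decoder.

```python
from collections import OrderedDict

# Hypothetical decode table: maps an x86-style instruction string to the
# micro-ops it breaks into. Real decoders are far more complex.
DECODE_TABLE = {
    "add eax, [mem]": ["load tmp, [mem]", "add eax, tmp"],
    "inc ecx": ["add ecx, 1"],
    "jnz loop": ["branch_if_not_zero loop"],
}


class ToyTraceCache:
    """Caches decoded micro-ops so repeated code skips the decoder."""

    def __init__(self, capacity_uops):
        self.capacity_uops = capacity_uops
        self.entries = OrderedDict()  # instruction -> list of micro-ops
        self.decodes = 0              # how often the slow decoder ran
        self.hits = 0                 # how often decoding was skipped

    def _stored_uops(self):
        return sum(len(uops) for uops in self.entries.values())

    def fetch(self, instruction):
        if instruction in self.entries:
            self.hits += 1
            self.entries.move_to_end(instruction)  # mark as recently used
            return self.entries[instruction]
        self.decodes += 1
        uops = DECODE_TABLE[instruction]
        self.entries[instruction] = uops
        # Evict least recently used entries until the cache fits again.
        while self._stored_uops() > self.capacity_uops and len(self.entries) > 1:
            self.entries.popitem(last=False)
        return uops


def run_loop(cache, iterations):
    """Fetch a three-instruction loop body `iterations` times."""
    body = ["add eax, [mem]", "inc ecx", "jnz loop"]
    for _ in range(iterations):
        for instruction in body:
            cache.fetch(instruction)
    return cache.decodes, cache.hits


cache = ToyTraceCache(capacity_uops=12)
decodes, hits = run_loop(cache, 100)
print(f"decoder ran {decodes} times, cache served {hits} fetches")
```

In this sketch the decoder runs only on the first pass through the loop, and every later fetch is served from the cache. That is the effect the paragraph above describes.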

Prescott’s improved process


  • Prescott is built with seven metal interconnect layers vs.
    Northwood’s six, which helps ease wiring congestion on the
    core and reduce the wire delay, or latency, of signals moving
    through the core. Think of a layer as a floor in a building;
    more vertical layers save horizontal space.

  • Prescott CPUs will also use carbon-doped oxide, a low-k
    dielectric that cuts capacitance between wires so signals
    travel faster, as well as nickel silicide to improve speed
    through lower resistance.

  • Intel will use strained silicon in the Prescott core so
    electrons can flow faster.




Q: What’s so great about Prescott?
Prescott’s architectural advances represent
a significant, forward-thinking leap for
processing technology. However, you likely
won’t see any substantial performance
benefits anytime soon. It will take between six and 18
months for applications to take advantage
of the CPU’s much longer pipeline and new
instructions. But with that said, because of
its price and scalability, this will likely be
the CPU we’re using by next year.

Q: Why isn’t this Pentium 5? Is Intel
afraid of the P5 designator?
If you subscribe to the speculations of
forum conspiracy theorists, then, yes,
Intel may have indeed been afraid to call
Prescott Pentium 5. But the reasons may
not be what you expect.
The “performance dread” conspiracy
theory goes like this: Because any CPU
successor should be faster than the chip
it’s replacing, and because the Prescott is
slower than the Pentium 4 Extreme Edition
(read on for details), Intel decided that it
would have a hard time selling a Pentium
5 that ran slower than a Pentium 4.
The “common business sense” conspiracy
theory is more reasonable. It goes like
this: Not wanting to hurt holiday sales
of the Pentium 4 by tantalizing
consumers with a brand-new CPU, Intel
chose to keep the P4 nomenclature.
“I’ve been calling it Pentium 5 since last
year,” says analyst Rob Enderle of the
Enderle Group, who advances the theory
that business sense, not performance
anxiety, is behind the missing P5
name. “When they didn’t call it [P5],
it surprised the hell out of me.” Intel’s
strategy seems to have worked because
sales were good this holiday, Enderle
explains. He opines that had Intel
released the Prescott on time last year,
it likely would have been called Pentium
5 to much fanfare.
So what’s the real reason? Intel says
only that there is no exact science for
its CPU names. And even though the
performance theory is juicier, we think
Enderle’s may hold more water. After all,
Intel has introduced a slower processor
before without blinking. In fact, upon its
release, the first Pentium 4 was soundly
trashed in many older applications by
the Pentium III.

Other enhancements
in Prescott:


  • An improved branch
    predictor speeds up
    productivity applications.

  • Twelve additional
    store buffers over the
    Northwood’s 24 help keep
    data moving through the
    ultra-long pipeline.

  • Two more write
    combining buffers (for a
    total of eight) help reduce
    bus traffic.


If you read Tom Halfhill’s
column last month (February,
“Dreaming of a Cacheless
Society”), you know that
caches are a necessary evil
of today’s computers. The
world would be a better place
if CPUs didn’t have to use
big fat caches to make up
for slow-ass system RAM.
At 1MB, Prescott’s L2 cache is
double that of the Northwood.
As in all CPUs, the cache memory
is made from expensive
SRAM. In the Prescott core,
the 1MB of L2 accounts for
roughly 48 million of the 125
million transistors.
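
A quick back-of-the-envelope check puts those numbers in perspective. The sketch below uses the article's figures (48 million of 125 million transistors) plus one assumption that is not from the article: a classic six-transistor SRAM cell per bit. Under that assumption, a decimal megabyte of data bits works out to exactly 48 million transistors, which may be how the quoted figure was estimated, though the article does not say.

```python
# Figures quoted in the article.
TOTAL_TRANSISTORS = 125_000_000
L2_TRANSISTORS = 48_000_000

# Assumption (not from the article): six transistors per SRAM bit.
TRANSISTORS_PER_BIT = 6


def l2_share():
    """Fraction of the transistor budget spent on L2 cache."""
    return L2_TRANSISTORS / TOTAL_TRANSISTORS


def sram_estimate(num_bytes, transistors_per_bit=TRANSISTORS_PER_BIT):
    """Transistors needed for the data bits alone (tags and logic ignored)."""
    return num_bytes * 8 * transistors_per_bit


print(f"L2 share of transistor budget: {l2_share():.1%}")
print(f"1MB as 1,000,000 bytes: {sram_estimate(1_000_000):,} transistors")
print(f"1MB as 1,048,576 bytes: {sram_estimate(1024 * 1024):,} transistors")
```

Either way you count the megabyte, the cache alone eats well over a third of the chip's transistors, which is why big caches are expensive.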

There are no additional
floating-point units in the
heart of the Prescott, but
Intel has added a dedicated
integer multiplication unit,
so integer multiplies no
longer share resources with
the floating-point
multiplication unit, which
reduces processing latency.


Intel makes tiny steps forward. We’ve got the map!


To get a sense of how tiny the 90nm process is compared with the current 130nm process, note that
Prescott has more than twice as many transistors as the current Northwood Pentium 4, yet its die
is 33mm^2 smaller. To imagine the size we’re talking about, consider this: A nanometer is only 3 to 5
atoms wide, and the period at the end of this sentence is 250,000 nanometers across.
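
The arithmetic behind that comparison can be sketched in a few lines. This assumes an idealized shrink in which every feature scales linearly with the process node, which real chips only approximate. The constant and function names are ours; the 250,000nm period width is the figure quoted above.

```python
OLD_NODE_NM = 130
NEW_NODE_NM = 90
PERIOD_WIDTH_NM = 250_000  # figure quoted in the article


def ideal_area_scale(old_nm, new_nm):
    """Area of a feature after an ideal linear shrink, relative to before."""
    return (new_nm / old_nm) ** 2


scale = ideal_area_scale(OLD_NODE_NM, NEW_NODE_NM)
print(f"Relative area per feature: {scale:.2f}")
print(f"Features that fit in the same area: {1 / scale:.2f}x")
print(f"90nm features across a period: {PERIOD_WIDTH_NM // NEW_NODE_NM:,}")
```

Under this idealized model, a feature at 90nm takes a bit less than half the area it did at 130nm, which is consistent with Prescott packing more than twice the transistors into a smaller die.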




Behold Prescott!


[Die diagram labels: instruction decode; trace cache and logic; CROM; scheduler and checker; IO; BPU; bus logic; memory control logic; integer execution; floating point; 1MB cache; PLL]
