Reversing: The Hacker's Guide to Reverse Engineering

Intel NetBurst

The Intel NetBurst microarchitecture is the current execution environment for
many of Intel’s modern IA-32 processors. Understanding the basic architecture
of NetBurst is important because it explains the rationale behind the
optimization guidelines used by almost every IA-32 code generator out there.


μops (Micro-Ops)

IA-32 processors use microcode for implementing each instruction supported
by the processor. Microcode is essentially another layer of programming that
lies within the processor. This means that the processor itself contains a much
more primitive core, only capable of performing fairly simple operations
(though at extremely high speeds). In order to implement the relatively
complex IA-32 instructions, the processor has a microcode ROM, which contains
the microcode sequences for every instruction in the instruction set.

The process of constantly decoding instructions and fetching their microcode
from ROM can create significant performance bottlenecks, so NetBurst
processors employ an execution trace cache that is responsible for caching the
decoded μop sequences of frequently executed instructions.
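
The caching idea can be sketched with a toy model. The following Python is only an illustration under loud assumptions: the `DECODE_TABLE` contents, the μop spellings, and the `TraceCache` class are hypothetical names invented here, and a real trace cache stores traces of μops along the executed path rather than one entry per instruction address.

```python
# Toy model of a trace cache. Everything here is an illustrative assumption:
# the uop spellings and the decode table are NOT Intel's real microcode.
DECODE_TABLE = {
    "push eax":       ["sub esp, 4", "store [esp], eax"],
    "add eax, [ebx]": ["load tmp, [ebx]", "add eax, tmp"],
    "inc ecx":        ["add ecx, 1"],
}


def decode(instruction):
    """Slow path: translate one IA-32 instruction into its uop sequence."""
    return list(DECODE_TABLE[instruction])


class TraceCache:
    """Remembers the uops produced for each instruction address."""

    def __init__(self):
        self.entries = {}   # address -> list of uops
        self.hits = 0
        self.misses = 0

    def fetch(self, address, instruction):
        if address in self.entries:
            # Fast path: the instruction was decoded before, reuse its uops.
            self.hits += 1
            return self.entries[address]
        # Slow path: decode once and remember the result.
        self.misses += 1
        uops = decode(instruction)
        self.entries[address] = uops
        return uops


def run_loop(cache, program, iterations):
    """Execute the same instruction sequence repeatedly, as a loop would."""
    executed = []
    for _ in range(iterations):
        for address, instruction in program:
            executed.extend(cache.fetch(address, instruction))
    return executed
```

Running a three-instruction loop body for several iterations through `run_loop` decodes each instruction only on the first pass; every later pass is served from the cache, which is why frequently executed code benefits the most.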


Pipelines

Basically, a CPU pipeline is like a factory assembly line for decoding and exe-
cuting program instructions. An instruction enters the pipeline and is broken
down into several low-level tasks that must be taken care of by the processor.
In NetBurst processors, the pipeline uses three primary stages:



  1. Front end: Responsible for decoding each instruction and producing
    sequences of μops that represent each instruction. These μops are then
    fed into the Out of Order Core.

  2. Out of Order Core: This component receives sequences of μops from
    the front end and reorders them based on the availability of the various
    resources of the processor. The idea is to use the available resources as
    aggressively as possible to achieve parallelism. The ability to do this
    depends heavily on the original code fed to the front end. Given the
    right conditions, the core will actually emit multiple μops per clock
    cycle.

  3. Retirement section: The retirement section is primarily responsible for
    ensuring that the original order of instructions in the program is pre-
    served when applying the results of the out-of-order execution.
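
The interplay between the three stages can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not a description of the actual NetBurst scheduler: the `Uop` tuple, the latencies, the issue width of two, and the `PROGRAM` contents are all hypothetical, and only read-after-write dependencies are tracked (register renaming is assumed to remove the others).

```python
from collections import namedtuple

# Hypothetical uop record: the register it writes, the registers it reads,
# and an assumed latency in cycles (must be at least 1).
Uop = namedtuple("Uop", "name dest srcs latency")

# Pretend the front end already decoded four instructions into these uops.
PROGRAM = [
    Uop("load r1, [a]",   "r1", (),           3),  # slow memory load
    Uop("add r2, r1, r1", "r2", ("r1",),      1),  # needs the load result
    Uop("mov r3, 5",      "r3", (),           1),  # independent
    Uop("add r4, r3, r3", "r4", ("r3",),      1),  # needs only r3
]


def find_producers(uops):
    """For each uop, list the earlier uops that produce its source registers."""
    last_writer = {}
    producers = []
    for index, uop in enumerate(uops):
        producers.append([last_writer[s] for s in uop.srcs if s in last_writer])
        last_writer[uop.dest] = index
    return producers


def simulate(uops, width=2):
    """Return (issue order, retirement order, cycles used)."""
    producers = find_producers(uops)
    done_at = {}        # uop index -> cycle at which its result is available
    issue_order = []
    retire_order = []
    cycle = 0
    while len(retire_order) < len(uops):
        # Out-of-order core: issue up to `width` ready uops, oldest first.
        slots = width
        for index, uop in enumerate(uops):
            if slots == 0:
                break
            if index in done_at:
                continue  # already issued
            ready = all(p in done_at and done_at[p] <= cycle
                        for p in producers[index])
            if ready:
                done_at[index] = cycle + uop.latency
                issue_order.append(index)
                slots -= 1
        cycle += 1
        # Retirement: commit results strictly in original program order.
        while len(retire_order) < len(uops):
            oldest = len(retire_order)
            if oldest in done_at and done_at[oldest] <= cycle:
                retire_order.append(oldest)
            else:
                break
    return issue_order, retire_order, cycle
```

With these assumed latencies, `simulate(PROGRAM)` issues the μops in the order 0, 2, 3, 1: the independent `mov` and its dependent `add` run while the load is still outstanding. The retirement order is nevertheless 0, 1, 2, 3, which is the property the retirement section exists to guarantee.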

