.1. X86
Listing 2: Intel C++ 2011
_f1 PROC NEAR
mov ecx, DWORD PTR [4+esp] ; ecx = a
lea edx, DWORD PTR [ecx+ecx*8] ; edx = a*9
lea eax, DWORD PTR [edx+ecx*4] ; eax = a*9 + a*4 = a*13
ret
These two LEA instructions perform faster than one IMUL.
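The same decomposition can be written out in C. This is only a sketch: the function name mul13_lea is made up for illustration, and unsigned arithmetic is assumed so that overflow wraps around the way it does in a 32-bit register.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the two LEA instructions of the listing.
   Each LEA computes base + index*scale, where scale is 1, 2, 4 or 8. */
uint32_t mul13_lea(uint32_t a)
{
    uint32_t edx = a + a * 8;   /* lea edx, [ecx+ecx*8] : a*9        */
    uint32_t eax = edx + a * 4; /* lea eax, [edx+ecx*4] : a*9 + a*4  */
    return eax;                 /* a*13, modulo 2^32                 */
}
```

Since 13 = 9 + 4 and both 8 and 4 are valid LEA scale factors, no multiplication instruction is needed.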
MOVSB/MOVSW/MOVSD/MOVSQ copy byte/16-bit word/32-bit word/64-bit word from the address
which is in SI/ESI/RSI into the address which is in DI/EDI/RDI.
Together with the REP prefix, the instruction is repeated in a loop, with the count stored in the CX/ECX/RCX
register: it works like memcpy() in C. If the block size is known to the compiler at compile time,
memcpy() is often inlined into a short code fragment using REP MOVSx, sometimes even as several
instructions.
The memcpy(EDI, ESI, 15) equivalent is:
; copy 15 bytes from ESI to EDI
CLD ; set direction to forward
MOV ECX, 3
REP MOVSD ; copy 12 bytes
MOVSW ; copy 2 more bytes
MOVSB ; copy remaining byte
(Supposedly, this works faster than copying 15 bytes using just one REP MOVSB.)
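The chunking used in that fragment (three 32-bit words, then one 16-bit word, then one byte) can be modelled in C as follows. This is a rough sketch, not what any particular compiler emits; the name copy15 is hypothetical, and memcpy() with a fixed small size stands in for each single MOVSx step.

```c
#include <string.h>

/* Hypothetical helper: copies exactly 15 bytes in the same
   chunks as the assembly fragment (4+4+4, then 2, then 1). */
void copy15(void *dst, const void *src)
{
    unsigned char *d = dst;          /* plays the role of EDI */
    const unsigned char *s = src;    /* plays the role of ESI */

    for (int ecx = 3; ecx != 0; ecx--) { /* REP MOVSD with ECX=3 */
        memcpy(d, s, 4);
        d += 4;                      /* direction flag cleared: */
        s += 4;                      /* pointers move forward   */
    }
    memcpy(d, s, 2);                 /* MOVSW: 2 more bytes */
    d += 2;
    s += 2;
    memcpy(d, s, 1);                 /* MOVSB: remaining byte */
}
```

Both pointers advance after every step, just as ESI and EDI are incremented by the element size when the direction flag is clear.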
MOVSX load with sign extension, see also: (1.17.1 on page 201)
MOVZX load and clear all other bits, see also: (1.17.1 on page 202)
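The difference between the two is easiest to see on an 8-bit value loaded into a 32-bit register. Here is a small C sketch; the function names movsx8 and movzx8 are invented for this example, and only the 8-bit to 32-bit case is shown.

```c
#include <stdint.h>

/* Model of MOVZX r32, r/m8: upper 24 bits are cleared. */
uint32_t movzx8(uint8_t v)
{
    return (uint32_t)v;
}

/* Model of MOVSX r32, r/m8: upper 24 bits are filled
   with copies of bit 7 (the sign bit of the source). */
uint32_t movsx8(uint8_t v)
{
    uint32_t r = (uint32_t)v;
    if (v & 0x80u)
        r |= 0xFFFFFF00u;
    return r;
}
```

For the byte 0xFE, the zero-extended result is 0x000000FE (254), while the sign-extended result is 0xFFFFFFFE (-2 as a signed value).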
MOV load value. This instruction name is a misnomer, resulting in some confusion (data is not moved
but copied); in other architectures the same instruction is usually named “LOAD” and/or “STORE”
or something like that.
One important thing: if you set the low 16-bit part of a 32-bit register in 32-bit mode, the high 16
bits remain as they were. But if you modify the low 32-bit part of the register in 64-bit mode, the
high 32 bits of the register will be cleared.
Supposedly, it was done to simplify porting code to x86-64.
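The two behaviours can be modelled with plain integer masks in C. This is only an illustration of the rule; write_low16 and write_low32 are hypothetical names, not real instructions or library calls.

```c
#include <stdint.h>

/* Model of "MOV AX, v" in 32-bit mode:
   only bits 0..15 of EAX change, bits 16..31 are preserved. */
uint32_t write_low16(uint32_t eax, uint16_t v)
{
    return (eax & 0xFFFF0000u) | v;
}

/* Model of "MOV EAX, v" in 64-bit mode:
   bits 32..63 of RAX are zeroed, whatever they held before. */
uint64_t write_low32(uint64_t rax, uint32_t v)
{
    (void)rax;              /* the old value does not survive */
    return (uint64_t)v;
}
```

So writing 0xABCD into the low half of 0x12345678 gives 0x1234ABCD, but writing 0xDEADBEEF into the low half of 0x1122334455667788 gives 0x00000000DEADBEEF.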
MUL unsigned multiply. IMUL is often used instead of MUL, read more about it: 2.2.1.
NEG negation: op=−op. Same as NOT op / ADD op, 1.
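The NEG identity is the usual two's complement rule and is easy to check in C. A minimal sketch, with made-up function names, which models only the resulting value and not the CPU flags:

```c
#include <stdint.h>

/* Model of NEG op: two's complement negation, modulo 2^32. */
uint32_t neg_direct(uint32_t op)
{
    return 0u - op;
}

/* Model of NOT op / ADD op, 1: invert all bits, then add one. */
uint32_t neg_via_not(uint32_t op)
{
    return ~op + 1u;
}
```

Both functions return the same value for every 32-bit input, including 0 (which stays 0) and 0x80000000 (which maps to itself).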
NOP NOP. Its opcode is 0x90, it is in fact the XCHG EAX,EAX idle instruction. This implies that x86 does
not have a dedicated NOP instruction (as in many RISC). This book has at least one listing where
GDB shows NOP as 16-bit XCHG instruction: 1.8.1 on page 48.
More examples of such operations: (.1.7 on page 1038).
NOP may be generated by the compiler for aligning labels on a 16-byte boundary. Another very
popular usage of NOP is to manually replace (patch) some instruction, like a conditional jump, with NOP
in order to disable its execution.
NOT op1: op1=¬op1. Logical inversion. Important feature: the instruction doesn’t change flags.
OR logical “or”
POP get a value from the stack: value=SS:[ESP]; ESP=ESP+4 (or 8)
PUSH push a value into the stack: ESP=ESP-4 (or 8); SS:[ESP]=value
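The order of operations in PUSH and POP can be spelled out with a toy model in C. Everything here is an assumption made for illustration: the struct cpu type, the 64-byte stack, the 32-bit slot size, and the fact that the SS segment is ignored.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: 32-bit mode, a tiny flat stack. */
struct cpu {
    uint32_t esp;       /* byte offset into mem[]          */
    uint8_t  mem[64];   /* the stack grows towards offset 0 */
};

void push32(struct cpu *c, uint32_t value)
{
    c->esp -= 4;                         /* ESP=ESP-4   */
    memcpy(&c->mem[c->esp], &value, 4);  /* [ESP]=value */
}

uint32_t pop32(struct cpu *c)
{
    uint32_t value;
    memcpy(&value, &c->mem[c->esp], 4);  /* value=[ESP] */
    c->esp += 4;                         /* ESP=ESP+4   */
    return value;
}
```

PUSH decrements the stack pointer before storing, and POP reads before incrementing, so the value pushed last is the one popped first.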
RET return from subroutine: POP tmp; JMP tmp.
In fact, RET is an assembly language macro: in Windows and *NIX environments it is translated into
RETN (“return near”) or, in MS-DOS times, where the memory was addressed differently (11.6 on
page 1003), into RETF (“return far”).
RET can have an operand. Then it works like this:
POP tmp; ADD ESP, op1; JMP tmp. RET with an operand usually ends functions in the stdcall calling
convention, see also: 6.1.2 on page 734.
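The effect of RET with an operand on the stack pointer can be sketched in C. This is a simplified model under stated assumptions: 32-bit mode, a stack given as an array of 32-bit slots, and invented names (ret_n, struct ret_result) that exist only for this example.

```c
#include <stdint.h>

/* Hypothetical result of executing RET op1: where execution
   continues and where the stack pointer ends up. */
struct ret_result {
    uint32_t eip;
    uint32_t esp;
};

/* stack[] holds 32-bit slots; esp is a byte offset into it
   and is assumed to be a multiple of 4. */
struct ret_result ret_n(const uint32_t *stack, uint32_t esp, uint32_t op1)
{
    struct ret_result r;
    r.eip = stack[esp / 4]; /* POP tmp: read the return address... */
    r.esp = esp + 4;        /* ...and step past it                 */
    r.esp += op1;           /* ADD ESP, op1: drop the arguments    */
    return r;               /* JMP tmp: continue at r.eip          */
}
```

With op1=8, a function that took two 32-bit arguments leaves the stack pointer 12 bytes higher: 4 for the return address plus 8 for the arguments. With op1=0 this is the plain RET.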
SAHF copy bits from AH to CPU flags: