1.30. 64 BITS
The new R8-R15 registers also have their lower parts: R8D-R15D (lower 32-bit parts), R8W-R15W (lower
16-bit parts), R8L-R15L (lower 8-bit parts).
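Byte number: 7th 6th 5th 4th 3rd 2nd 1st 0th
R8: bytes 7th to 0th (the whole 64-bit register)
R8D: bytes 3rd to 0th (lower 32 bits)
R8W: bytes 1st to 0th (lower 16 bits)
R8L: byte 0th (lower 8 bits)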
The number of SIMD registers was doubled from 8 to 16: XMM0-XMM15.
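As a rough illustration of how the lower parts relate to the full register, the sketch below models a 64-bit register as a uint64_t and extracts the bits that would be visible through R8D, R8W and R8L. The helper names low32, low16 and low8 are hypothetical and exist only for this example; a compiler would normally just address the sub-register directly.

```c
#include <stdint.h>

/* Hypothetical helpers (not from the original text): each one returns
   the part of a 64-bit value that would be visible through the named
   sub-register of R8. Conversion to a narrower unsigned type keeps
   only the low-order bits. */
static uint32_t low32(uint64_t r) { return (uint32_t)r; } /* like R8D */
static uint16_t low16(uint64_t r) { return (uint16_t)r; } /* like R8W */
static uint8_t  low8(uint64_t r)  { return (uint8_t)r;  } /* like R8L */
```

For the value 0x1122334455667788, low32 yields 0x55667788, low16 yields 0x7788 and low8 yields 0x88, that is, bytes 3rd to 0th, 1st to 0th and 0th respectively.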
- In Win64, the function calling convention is slightly different, somewhat resembling fastcall (6.1.3
on page 735). The first 4 arguments are stored in the RCX, RDX, R8, R9 registers, the rest on the
stack. The caller function must also allocate 32 bytes so the callee may save the first 4 arguments there
and use these registers for its own needs. Short functions may use arguments straight from the registers,
but larger ones may save their values on the stack.
System V AMD64 ABI (Linux, *BSD, Mac OS X) [Michael Matz, Jan Hubicka, Andreas Jaeger, Mark
Mitchell, System V Application Binary Interface. AMD64 Architecture Processor Supplement, (2013)]
(^189) also somewhat resembles fastcall: it uses 6 registers, RDI, RSI, RDX, RCX, R8, R9, for the first 6
arguments. All the rest are passed via the stack.
See also the section on calling conventions (6.1 on page 734).
- The C/C++ int type is still 32-bit for compatibility.
- All pointers are 64-bit now.
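The register assignments from the calling-convention item above can be summarized as a small lookup. This is only a sketch: the enum and the function arg_location are hypothetical names made up for this example, and it covers only the integer and pointer arguments discussed above, ignoring floating point and aggregate arguments.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum abi { ABI_WIN64, ABI_SYSV };

/* Returns the name of the register that holds integer/pointer argument
   number idx (0-based), or "stack" if that argument is passed on the
   stack. */
static const char *arg_location(enum abi abi, size_t idx)
{
    /* Win64: first 4 arguments in registers. */
    static const char *const win64[] = { "RCX", "RDX", "R8", "R9" };
    /* System V AMD64: first 6 arguments in registers. */
    static const char *const sysv[] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };

    if (abi == ABI_WIN64)
        return idx < 4 ? win64[idx] : "stack";
    return idx < 6 ? sysv[idx] : "stack";
}
```

Under these assumptions, the fifth argument (idx 4) is reported as "stack" for Win64 but as "R8" for System V, which is the kind of difference to keep in mind when reading compiled code for each platform.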
Since the number of registers has doubled, compilers have more room to maneuver in what is called
register allocation. For us this implies that the emitted code contains fewer local variables.
For example, the function that calculates the first S-box of the DES encryption algorithm processes 32/64/128/256
values at once (depending on the DES_type type (uint32, uint64, SSE2 or AVX)) using the bitslice DES method
(read more about this technique here (1.29 on page 406)):
/*
 * Generated S-box files.
 * This software may be modified, redistributed, and used for any purpose,
 * so long as its origin is acknowledged.
 * Produced by Matthew Kwan - March 1998
 */
#ifdef _WIN64
#define DES_type unsigned __int64
#else
#define DES_type unsigned int
#endif
void
s1 (
    DES_type a1,
    DES_type a2,
    DES_type a3,
    DES_type a4,
    DES_type a5,
    DES_type a6,
    DES_type *out1,
    DES_type *out2,
    DES_type *out3,
    DES_type *out4
) {
    DES_type x1, x2, x3, x4, x5, x6, x7, x8;
    DES_type x9, x10, x11, x12, x13, x14, x15, x16;
    DES_type x17, x18, x19, x20, x21, x22, x23, x24;
    DES_type x25, x26, x27, x28, x29, x30, x31, x32;
    DES_type x33, x34, x35, x36, x37, x38, x39, x40;
    DES_type x41, x42, x43, x44, x45, x46, x47, x48;
(^189) Also available as https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf