Assembly Language for Beginners

1.30. 64 BITS


mov rdx, QWORD PTR a1$[rsp]
not r9
not rcx
and r13, r10
and r9, r11
and rcx, rdx
xor r9, rbx
mov rbx, QWORD PTR [rsp+72]
not rcx
xor rcx, QWORD PTR [rax]
or r9, rdx
not r9
xor rcx, r8
mov QWORD PTR [rax], rcx
mov rax, QWORD PTR out3$[rsp]
xor r9, r13
xor r9, QWORD PTR [rax]
xor r9, r8
mov QWORD PTR [rax], r9
pop r15
pop r14
pop r13
pop r12
pop rdi
pop rsi
ret 0
s1 ENDP


Nothing was allocated in the local stack by the compiler; x36 is a synonym for a5.


By the way, there are CPUs with many more GPRs, e.g., Itanium (128 registers).


1.30.2 ARM


64-bit instructions appeared in ARMv8.


1.30.3 Floating point numbers.


How floating point numbers are processed in x86-64 is explained in section 1.31.


1.30.4 64-bit architecture criticism.


Some people are irritated by this: storing pointers now takes twice as much memory, including
cache memory, despite the fact that x64 CPUs can address only 48 bits of external RAM.
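
To make the cost concrete, here is a minimal sketch comparing two layouts of the same list node. The struct and function names are hypothetical and chosen only for illustration. On a typical x86-64 ABI the pointer-linked node is expected to take 16 bytes (an 8-byte pointer, a 4-byte value and 4 bytes of padding), while the index-linked one takes 8; on a 32-bit target both would usually be the same size.

```c
#include <stddef.h>
#include <stdint.h>

/* List node linked by a native pointer (8 bytes per link on x86-64). */
struct node_ptr {
    struct node_ptr *next;
    uint32_t value;
};

/* The same node linked by a 32-bit index into some array of nodes. */
struct node_idx {
    uint32_t next;
    uint32_t value;
};

/* Memory needed for n nodes of each layout, in bytes. */
size_t bytes_with_pointers(size_t n) { return n * sizeof(struct node_ptr); }
size_t bytes_with_indices(size_t n)  { return n * sizeof(struct node_idx); }
```

Calling both functions with the same n on the target machine shows how much of the data structure (and therefore of the cache) is spent on the wider links.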


Pointers have gone out of favor to the point now where I had to flame about it because
on my 64-bit computer that I have here, if I really care about using the capability of my
machine I find that I’d better not use pointers because I have a machine that has 64-bit
registers but it only has 2 gigabytes of RAM. So a pointer never has more than 32 significant
bits to it. But every time I use a pointer it’s costing me 64 bits and that doubles the size
of my data structure. Worse, it goes into the cache and half of my cache is gone and that
costs cash—cache is expensive.
So if I’m really trying to push the envelope now, I have to use arrays instead of pointers.
I make complicated macros so that it looks like I’m using pointers, but I’m not really.

(Donald Knuth in “Coders at Work: Reflections on the Craft of Programming”.)
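
The approach described in the quote (arrays instead of pointers, with macros that make it look like pointer code) might be sketched as follows. This is not Knuth's actual code; the pool size, the NIL marker, the macros and the function names are all assumptions made for the example. Nodes live in one static array, and links are 32-bit indices into it.

```c
#include <stdint.h>

#define POOL_SIZE 1024u
#define NIL 0xFFFFFFFFu          /* plays the role of NULL */

struct node {
    uint32_t next;               /* index of the next node in pool[], or NIL */
    int32_t  value;
};

static struct node pool[POOL_SIZE];
static uint32_t pool_used = 0;

/* Macros that make index-based access read like pointer access. */
#define NEXT(i)  (pool[(i)].next)
#define VALUE(i) (pool[(i)].value)

/* Take a fresh node from the pool and put it in front of the list.
   Returns the new head index, or NIL if the pool is exhausted. */
uint32_t list_push(uint32_t head, int32_t value)
{
    if (pool_used == POOL_SIZE)
        return NIL;
    uint32_t i = pool_used++;
    VALUE(i) = value;
    NEXT(i) = head;
    return i;
}

/* Sum all values by following the index links. */
int64_t list_sum(uint32_t head)
{
    int64_t sum = 0;
    for (uint32_t i = head; i != NIL; i = NEXT(i))
        sum += VALUE(i);
    return sum;
}
```

A list is started with head = NIL and grown with head = list_push(head, value). Each link costs 4 bytes regardless of the pointer width of the machine, at the price of being limited to one fixed pool and to 32-bit indices.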


Some people write their own memory allocators. The case of CryptoMiniSat^190 is an interesting one.
This program rarely uses more than 4GiB of RAM, but it uses pointers heavily, so it requires less memory
on a 32-bit architecture than on a 64-bit one. To mitigate this problem, the author made his own allocator (in


(^190) https://github.com/msoos/cryptominisat/
