CHAPTER 18. ARRAYS
{
return &month2[month][0];
}
Here is what we get:
Listing 18.30: Optimizing MSVC 2013 x64
month2 DB 04aH
DB 061H
DB 06eH
DB 075H
DB 061H
DB 072H
DB 079H
DB 00H
DB 00H
DB 00H
...
get_month2 PROC
; sign-extend input argument and promote to 64-bit value
movsxd rax, ecx
lea rcx, QWORD PTR [rax+rax*4]
; RCX=month+month*4=month*5
lea rax, OFFSET FLAT:month2
; RAX=pointer to table
lea rax, QWORD PTR [rax+rcx*2]
; RAX=pointer to table + RCX*2=pointer to table + month*5*2=pointer to table + month*10
ret 0
get_month2 ENDP
There are no memory accesses at all. All this function does is calculate the address of the first character of the name
of the month: pointer_to_the_table + month * 10. There are also two LEA instructions, which effectively do the work of several
MUL and MOV instructions.
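The address arithmetic performed by the two LEA instructions can be sketched in C. This is only an illustration: the table definition is an assumed reconstruction (12 rows of 10 bytes), and get_month2_lea() and check_lea_matches() are hypothetical helpers that do not appear in the original source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed reconstruction of the table: 12 rows, 10 bytes per row. */
static const char month2[12][10] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

/* The function as written in C: address of the first character of a row. */
const char *get_month2(int month)
{
    return &month2[month][0];
}

/* Hypothetical helper: the same address, computed step by step
   the way the optimized listing does it. */
const char *get_month2_lea(int month)
{
    int64_t m  = month;                      /* MOVSXD: sign-extend to 64 bits */
    int64_t m5 = m + m * 4;                  /* first LEA:  month*5            */
    const char *table = (const char *)month2;
    return table + m5 * 2;                   /* second LEA: table + month*5*2  */
}

/* Hypothetical helper: both versions must agree for every valid index. */
void check_lea_matches(void)
{
    for (int month = 0; month < 12; month++)
        assert(get_month2_lea(month) == get_month2(month));
}
```

Neither version reads the table; both only produce a pointer into it, which is why the listing contains no memory access.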
The width of the array is 10 bytes. Indeed, the longest string here, “September”, is 9 bytes long, and with the terminating zero
it takes 10 bytes. The rest of the month names are padded with zero bytes, so they all occupy the same space (10 bytes). Thus, our
function works even faster, because every string starts at an address that can be calculated easily.
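The fixed row width and the zero padding can be checked with a few lines of C. This is a sketch under the assumption that the table is declared as 12 rows of 10 bytes; zero_bytes_in_row() and row_offset() are hypothetical helpers introduced only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Assumed reconstruction of the table: 12 rows, 10 bytes per row. */
static const char month2[12][10] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

/* Hypothetical helper: number of zero bytes in a row,
   i.e. the terminator plus any padding after it. */
size_t zero_bytes_in_row(int month)
{
    return sizeof(month2[month]) - strlen(month2[month]);
}

/* Hypothetical helper: byte offset of a row from the start of the table. */
size_t row_offset(int month)
{
    return (size_t)((const char *)&month2[month] - (const char *)month2);
}
```

For month 0 the first helper gives 3, which matches the three DB 00H lines that follow the bytes of “January” in the MSVC listing. For month 8 (“September”) it gives 1, the terminator alone. Every row offset is a multiple of 10.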
Optimizing GCC 4.9 generates even shorter code:
Listing 18.31: Optimizing GCC 4.9 x64
movsx rdi, edi
lea rax, [rdi+rdi*4]
lea rax, month2[rax+rax]
ret
LEA is also used here for multiplication by 10.
Non-optimizing compilers do multiplication differently.
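A multiplication by 10 can also be built from one shift and two additions. The C sketch below mirrors that idea one operation per line. The function name times10_shift_add() is hypothetical, and unsigned arithmetic is used only to keep the shift well defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: multiply by 10 using a shift and two additions. */
uint64_t times10_shift_add(int32_t month)
{
    uint64_t rdx = (uint64_t)(int64_t)month; /* sign-extend the input     */
    uint64_t rax = rdx;                      /* copy: month               */
    rax <<= 2;                               /* month*4                   */
    rax += rdx;                              /* month*4 + month = month*5 */
    rax += rax;                              /* month*5*2 = month*10      */
    return rax;
}
```

Adding the address of the table to this result gives the address of the requested row.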
Listing 18.32: Non-optimizing GCC 4.9 x64
get_month2:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov eax, DWORD PTR [rbp-4]
movsx rdx, eax
; RDX = sign-extended input value
mov rax, rdx
; RAX = month
sal rax, 2
; RAX = month<<2 = month*4
add rax, rdx
; RAX = RAX+RDX = month*4+month = month*5
add rax, rax
; RAX = RAX*2 = month*5*2 = month*10
add rax, OFFSET FLAT:month2
; RAX = month*10 + pointer to the table