Assembly Language for Beginners

1.5. HELLO, WORLD!
Listing 1.30: main() returning a value of uint64_t type
#include <stdio.h>
#include <stdint.h>

uint64_t main()
{
        printf ("Hello!\n");
        return 0;
}

The result is the same, but this is what the MOV at that line looks like now:

Listing 1.31: Non-optimizing GCC 4.8.1 + objdump
4005a4: d2800000 mov x0, #0x0 // #0
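
To see what changed in the instruction word itself, here is a minimal C sketch that pulls apart the fields of a MOVZ encoding (the instruction behind this MOV alias). The struct and function names are hypothetical and exist only for this illustration. The field layout used here is the A64 "move wide" format: bit 31 selects a 64-bit X register or a 32-bit W register, so the same move into W0 would be expected to encode as 0x52800000 instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of an A64 MOVZ (move wide with zero) instruction word.
   Names are illustrative, not taken from any real library. */
struct movz_fields {
    unsigned sf;    /* bit 31: 1 = 64-bit X register, 0 = 32-bit W register */
    unsigned hw;    /* bits 22..21: immediate is shifted left by hw*16 */
    unsigned imm16; /* bits 20..5: the 16-bit immediate */
    unsigned rd;    /* bits 4..0: destination register number */
};

/* Returns 1 and fills *out if insn is a MOVZ, returns 0 otherwise. */
static int decode_movz(uint32_t insn, struct movz_fields *out)
{
    assert(out != NULL);
    /* bits 30..29 must be 10 (MOVZ) and bits 28..23 must be 100101 */
    if ((insn & 0x7f800000u) != 0x52800000u)
        return 0;
    out->sf = (insn >> 31) & 1u;
    out->hw = (insn >> 21) & 3u;
    out->imm16 = (insn >> 5) & 0xffffu;
    out->rd = insn & 0x1fu;
    return 1;
}
```

Feeding it 0xd2800000 from the listing above gives sf = 1, imm16 = 0 and rd = 0, that is, a 64-bit move of zero into X0.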

LDP (Load Pair) then restores the X29 and X30 registers.

There is no exclamation mark after the instruction: this implies that the values are first loaded from the
stack, and only then is SP increased by 16. This is called post-index.
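
The ordering can be modeled in a few lines of C. This is only a toy sketch: the struct, the tiny 64-byte stack and the function names are made up for the illustration, and the pre-index store is assumed to be the matching prologue instruction (a store pair written with an exclamation mark).

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine state: two registers, SP as a byte offset, 8 stack slots. */
struct cpu {
    uint64_t x29, x30;  /* frame pointer and link register */
    uint64_t sp;        /* byte offset into stack[], stands in for SP */
    uint64_t stack[8];  /* 64 bytes of stack */
};

/* Pre-index, like "stp x29, x30, [sp, #imm]!": adjust SP first, then store. */
static void stp_pre_index(struct cpu *c, int64_t imm)
{
    c->sp = (uint64_t)((int64_t)c->sp + imm);
    assert(c->sp % 8 == 0 && c->sp / 8 + 1 < 8);
    c->stack[c->sp / 8] = c->x29;
    c->stack[c->sp / 8 + 1] = c->x30;
}

/* Post-index, like "ldp x29, x30, [sp], #imm": load first, then adjust SP. */
static void ldp_post_index(struct cpu *c, int64_t imm)
{
    assert(c->sp % 8 == 0 && c->sp / 8 + 1 < 8);
    c->x29 = c->stack[c->sp / 8];
    c->x30 = c->stack[c->sp / 8 + 1];
    c->sp = (uint64_t)((int64_t)c->sp + imm);
}
```

Calling stp_pre_index(&c, -16) and later ldp_post_index(&c, 16) leaves SP where it started and brings both registers back, which is what a prologue and epilogue pair is supposed to do.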

A new instruction appeared in ARM64: RET. It works just as BX LR, only a special hint bit is added, informing
the CPU that this is a return from a function, not just another jump instruction, so it can execute it more
optimally.

Due to the simplicity of the function, optimizing GCC generates the very same code.

1.5.5 MIPS


A word about the “global pointer”

One important MIPS concept is the “global pointer”. As we may already know, each MIPS instruction has
a size of 32 bits, so it’s impossible to embed a 32-bit address into one instruction: a pair has to be used
for this (like GCC did in our example for the text string address loading). It’s possible, however, to load
data from an address in the range register − 32768 ... register + 32767 using one single instruction (because
a 16-bit signed offset can be encoded in a single instruction). So we can allocate some register for
this purpose and also allocate a 64KiB area for the most used data. This allocated register is called a “global
pointer” and it points to the middle of the 64KiB area. This area usually contains global variables and
addresses of imported functions like printf(), because the GCC developers decided that getting the
address of some function must be as fast as a single instruction execution instead of two. In an ELF file
this 64KiB area is located partly in the sections .sbss (“small BSS^44”) for uninitialized data and .sdata (“small
data”) for initialized data. This implies that the programmer may choose which data should be
accessed fast and place it into .sdata/.sbss. Some old-school programmers may recall the MS-DOS
memory model (11.6 on page 1003) or the MS-DOS memory managers like XMS/EMS, where all memory
was divided into 64KiB blocks.
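
The arithmetic behind the 64KiB window can be checked with a short C sketch. The function names and the example addresses are hypothetical, and the bias of exactly 0x8000 simply follows the "middle of the area" description above; a real linker script may place the pointer slightly differently.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if addr can be written as offset(base) with a signed 16-bit offset. */
static int reachable_from(uint32_t base, uint32_t addr)
{
    int64_t delta = (int64_t)addr - (int64_t)base;
    return delta >= INT16_MIN && delta <= INT16_MAX;
}

/* Global pointer value for a 64 KiB area starting at area_start:
   point to the middle so that both halves are covered. */
static uint32_t gp_for_area(uint32_t area_start)
{
    assert(area_start <= UINT32_MAX - 0x8000u);
    return area_start + 0x8000u;
}
```

With an area starting at 0x10000000, the pointer becomes 0x10008000, and every byte from 0x10000000 up to 0x1000FFFF is reachable with one load or store, while 0x10010000 is not.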

This concept is not unique to MIPS. At least PowerPC uses this technique as well.
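
For comparison, the two-instruction way of forming a full 32-bit address (LUI followed by ADDIU) can be modeled as below. This is a sketch with hypothetical helper names. It assumes the usual meaning of the assembler's %hi and %lo operators, where the high half is rounded up whenever the low half will be treated as negative, because ADDIU sign-extends its 16-bit immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split addr into the two 16-bit immediates, as %hi() and %lo() do. */
static void split_address(uint32_t addr, uint16_t *hi, uint16_t *lo)
{
    assert(hi != NULL && lo != NULL);
    *hi = (uint16_t)((addr + 0x8000u) >> 16); /* compensates for sign extension */
    *lo = (uint16_t)(addr & 0xffffu);
}

/* Model of "lui r, hi" followed by "addiu r, r, lo". */
static uint32_t lui_addiu(uint16_t hi, uint16_t lo)
{
    uint32_t r = (uint32_t)hi << 16;                  /* LUI: fill upper half */
    int32_t s = (lo & 0x8000u) ? (int32_t)lo - 0x10000
                               : (int32_t)lo;         /* sign-extend imm16 */
    return r + (uint32_t)s;                           /* ADDIU: wraps mod 2^32 */
}
```

For the made-up address 0x00408ABC the split yields hi = 0x0041 and lo = 0x8ABC, and the LUI/ADDIU model adds them back to 0x00408ABC.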

Optimizing GCC

Let’s consider the following example, which illustrates the “global pointer” concept.

Listing 1.32: Optimizing GCC 4.4.5 (assembly output)
$LC0:
; \000 is zero byte in octal base:
        .ascii "Hello, world!\012\000"
main:
; function prologue.
; set the GP:
        lui     $28,%hi(__gnu_local_gp)
        addiu   $sp,$sp,-32
        addiu   $28,$28,%lo(__gnu_local_gp)
; save the RA to the local stack:
        sw      $31,28($sp)


(^44) Block Started by Symbol
