1.5. HELLO, WORLD!
12 ; load the address of the puts() function from the GP to $25:
13 lw $25,%call16(puts)($28)
14 ; load the address of the text string to $4 ($a0):
15 lui $4,%hi($LC0)
16 ; jump to puts(), saving the return address in the link register:
17 jalr $25
18 addiu $4,$4,%lo($LC0) ; branch delay slot
19 ; restore the RA:
20 lw $31,28($sp)
21 ; copy 0 from $zero to $v0:
22 move $2,$0
23 ; return by jumping to the RA:
24 j $31
25 ; function epilogue:
26 addiu $sp,$sp,32 ; branch delay slot + free local stack
As we can see, the $GP register is set in the function prologue to point to the middle of this area. The RA
register is also saved in the local stack. puts() is also used here instead of printf(). The address of the
puts() function is loaded into $25 using the LW instruction (“Load Word”). Then the address of the text
string is loaded into $4 using the LUI (“Load Upper Immediate”) and ADDIU (“Add Immediate Unsigned Word”)
instruction pair. LUI sets the high 16 bits of the register (hence “upper” in the instruction name) and
ADDIU adds the lower 16 bits of the address.
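To make the LUI/ADDIU split concrete, here is a minimal Python sketch of the arithmetic. The function names and the sample address are made up for illustration and do not come from the listing. One detail the sketch models: ADDIU sign-extends its 16-bit immediate, so the upper half has to be rounded up by one whenever bit 15 of the lower half is set.

```python
MASK32 = 0xFFFFFFFF

def split_hi_lo(addr):
    """Split a 32-bit address into the two 16-bit immediates.

    The +0x8000 compensates for ADDIU sign-extending the low half.
    """
    lo = addr & 0xFFFF
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo

def lui(imm16):
    """LUI: place the immediate in the upper 16 bits, clear the lower 16."""
    return (imm16 << 16) & MASK32

def addiu(reg, imm16):
    """ADDIU: add a sign-extended 16-bit immediate, wrapping at 32 bits."""
    signed = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    return (reg + signed) & MASK32

def load_address(addr):
    """Rebuild the full address the way the LUI/ADDIU pair does."""
    hi, lo = split_hi_lo(addr)
    return addiu(lui(hi), lo)

# Hypothetical address of a string; the low half has bit 15 set,
# so the upper immediate becomes 0x0042 rather than 0x0041.
hi, lo = split_hi_lo(0x0041ABCD)
print(hex(hi), hex(lo), hex(load_address(0x0041ABCD)))
```

The round trip returns the original address in both cases, whether or not the lower half looks negative as a signed 16-bit value.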
ADDIU follows JALR (you haven’t forgotten branch delay slots yet, have you?). The register $4 is also called $A0, which
is used for passing the first function argument^45.
JALR (“Jump and Link Register”) jumps to the address stored in the $25 register (the address of puts()) while
saving the return address in RA. This is very similar to ARM. One important detail: the address saved in RA
is not the address of the instruction immediately after JALR (that one is in the delay slot and is executed
before the jump takes effect), but the address of the instruction after the next one (after the delay slot).
Hence, PC + 8 is written to RA during the execution of JALR; in our case, this is the address
of the LW instruction that follows ADDIU.
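The order of events around a delay slot can be followed in a toy Python model. Everything in it is an assumption made for illustration: the addresses, the instruction names and the program layout are invented, and only the two rules described above are modelled (the delay-slot instruction runs before control is transferred, and JALR stores PC + 8 in RA).

```python
def run(program, pc):
    """Execute a toy program and return (trace of instruction names, RA).

    program maps an address to (name, target); target is None for an
    ordinary instruction, an address for "jalr", or "ra" for a return.
    """
    trace, ra = [], None
    while pc in program:
        name, target = program[pc]
        trace.append(name)
        if target is None:
            pc += 4
            continue
        # A jump: the instruction in the delay slot executes first.
        trace.append(program[pc + 4][0])
        if name == "jalr":
            ra = pc + 8  # skip the jump itself and its delay slot
        pc = ra if target == "ra" else target
    return trace, ra

# Hypothetical layout: a caller at 0x10 and a callee at 0x100.
program = {
    0x10: ("lui", None),
    0x14: ("jalr", 0x100),
    0x18: ("addiu", None),      # delay slot of jalr
    0x1C: ("lw", None),         # execution resumes here after the call
    0x100: ("puts_body", None),
    0x104: ("jr", "ra"),
    0x108: ("nop", None),       # delay slot of jr
}

trace, ra = run(program, 0x10)
print(trace, hex(ra))
```

In the trace, addiu appears right after jalr and before the callee's body, and the return lands on lw rather than on addiu.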
LW (“Load Word”) at line 20 restores RA from the local stack (this instruction is actually part of the function
epilogue).
MOVE at line 22 copies the value from the $0 ($ZERO) register to $2 ($V0).
MIPS has a constant register, which always holds zero. Apparently, the MIPS developers came up with
the idea that zero is in fact the most frequently used constant in computer programming, so the $0
register can simply be used every time zero is needed.
Another interesting fact is that MIPS lacks an instruction that transfers data between registers. In fact,
MOVE DST, SRC is ADD DST, SRC, $ZERO (DST = SRC + 0), which does the same thing. Apparently, the MIPS
developers wanted to have a compact opcode table. This does not mean that an actual addition happens at
each MOVE instruction. Most likely, the CPU optimizes these pseudo instructions and the ALU^46 is never
used.
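The two ideas above, a register hardwired to zero and MOVE expressed as an addition with it, can be sketched in a few lines of Python. The class and function names are hypothetical, and the addition is modelled as a plain 32-bit wrapping add; a real assembler may choose a different encoding for the pseudo instruction.

```python
class RegisterFile:
    """Toy model of 32 general-purpose registers with $0 fixed at zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, n):
        return self._regs[n]

    def write(self, n, value):
        # Writes to register 0 are silently discarded.
        if n != 0:
            self._regs[n] = value & 0xFFFFFFFF

def add(rf, dst, src1, src2):
    """DST = SRC1 + SRC2, wrapping at 32 bits."""
    rf.write(dst, rf.read(src1) + rf.read(src2))

def move(rf, dst, src):
    """MOVE DST, SRC expressed as an addition with $0 (DST = SRC + 0)."""
    add(rf, dst, src, 0)

rf = RegisterFile()
rf.write(4, 123)   # put some value into $4 ($a0)
move(rf, 2, 4)     # $2 ($v0) now holds a copy of $4
move(rf, 2, 0)     # the same as "move $2,$0": clears $v0
rf.write(0, 99)    # has no effect, $0 stays zero
print(rf.read(2), rf.read(0))
```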
J at line 24 jumps to the address in RA, which effectively performs a return from the function. The ADDIU
after J is in fact executed before the jump takes effect (remember branch delay slots?) and is part of the function epilogue.
Here is also a listing generated by IDA. Each register here has its own pseudo name:
Listing 1.33: Optimizing GCC 4.4.5 (IDA)
1 .text:00000000 main:
2 .text:00000000
3 .text:00000000 var_10 = -0x10
4 .text:00000000 var_4 = -4
5 .text:00000000
6 ; function prologue.
7 ; set the GP:
8 .text:00000000 lui $gp, (gnu_local_gp >> 16)
9 .text:00000004 addiu $sp, -0x20
10 .text:00000008 la $gp, (gnu_local_gp & 0xFFFF)
11 ; save the RA to the local stack:
12 .text:0000000C sw $ra, 0x20+var_4($sp)
13 ; save the GP to the local stack:
14 ; for some reason, this instruction is missing in the GCC assembly output:
(^45) The MIPS registers table is available in appendix.3.1 on page 1042
(^46) Arithmetic Logic Unit