Writing a Simple Operating System — from Scratch


CHAPTER 5. WRITING, BUILDING, AND LOADING YOUR KERNEL


00000000  55                push ebp
00000001  89E5              mov ebp,esp
00000003  83EC10            sub esp,byte +0x10
00000006  C745FCBABA0000    mov dword [ebp-0x4],0xbaba
0000000D  8B45FC            mov eax,[ebp-0x4]
00000010  C9                leave
00000011  C3                ret

The only difference now is that we actually allocate a local variable, myvar, but
this provokes an interesting response from the compiler. As before, the stack frame is
established, but then we see sub esp, byte +0x10, which subtracts 16 (0x10) bytes
from the stack pointer. Firstly, we have to (constantly) remind ourselves that the stack
grows downwards in terms of memory addresses, so in simpler terms this instruction
means, ‘allocate another 16 bytes on the top of the stack’. We are storing an int, which is
a 4-byte (32-bit) data type, so why have 16 bytes been allocated on the stack for this
variable, and why not use push, which allocates new stack space automatically? The
reason the compiler manipulates the stack in this way is one of optimisation, since CPUs
often operate less efficiently on a datatype that is not aligned on memory boundaries that
are multiples of the datatype’s size [?]. Since the compiler would like all variables to be
properly aligned, it reserves stack space in blocks of the largest alignment it expects to
need (here 16 bytes), at the cost of wasting some memory.
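
The arithmetic behind that 16-byte reservation can be sketched in C. This is only an illustration, under the assumption that the compiler rounds the space needed for local variables up to a multiple of 16 bytes; the helper names round_up and reserve_locals are invented for this example and do not come from any compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Round `size` up to the next multiple of `align`, which must be a
 * power of two. Hypothetical helper, for illustration only. */
size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Model `sub esp, N`: the stack grows downwards, so reserving space
 * for locals means lowering the stack pointer. The 16-byte block size
 * is an assumption made for this sketch. */
size_t reserve_locals(size_t esp, size_t locals_size)
{
    return esp - round_up(locals_size, 16);
}
```

With these definitions, a single 4-byte int still costs a whole 16-byte block: reserve_locals(0x1000, 4) gives 0x0FF0, just as sub esp, byte +0x10 would.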
The next instruction, mov dword [ebp-0x4],0xbaba, actually stores our variable’s
value in the newly allocated space on the stack, but without using push, for the previously
given reason of stack efficiency (i.e. the size of the datatype stored is less than the stack
space reserved). We understand the general use of the mov instruction, but two things
that deserve some explanation here are the use of dword and [ebp-0x4]:



  • dword states explicitly that we are storing a double word (i.e. 4 bytes) on the
    stack, which is the size of our int datatype. So the actual bytes stored would
    be 0x0000baba, but without being explicit they could just as easily be 0xbaba
    (i.e. 2 bytes) or 0x000000000000baba (i.e. 8 bytes), which, although the same
    value, occupy different widths in memory and so have different ranges.

  • [ebp-0x4] is an example of a modern CPU shortcut called effective address
    computation [?], which is more impressive than the assembly code seems to reflect.
    This part of the instruction references an address that is calculated on-the-fly by
    the CPU, based on the current value of register ebp. At a glance, we might
    think our assembler is manipulating a constant value, as it would if we wrote
    something like mov ax, 0x5000 + 0x20, where our assembler would simply
    pre-process this into mov ax, 0x5020. But only once the code is run would the
    value of any register be known, so this definitely is not pre-processing; it forms a
    part of the CPU instruction. With this form of addressing the CPU is allowing
    us to do more per instruction cycle, and it is a good example of how CPU hardware
    has adapted to better suit programmers. We could write the equivalent, without
    such address manipulation, less efficiently in the following three lines of code:


mov eax, ebp            ; EAX = EBP
sub eax, 0x4            ; EAX = EAX - 0x4
mov dword [eax], 0xbaba ; store the 4-byte value 0xbaba at address EAX
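
To make the equivalence concrete, the single [ebp-0x4] operand and the three-instruction version can be modelled in C. This is a rough sketch, not a description of how the CPU is built: memory is simulated with a plain byte array, ebp is treated as an offset into that array, and the function names effective_address, manual_address and store_dword are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Effective address in one step, as for [ebp-0x4]: base register
 * plus a signed displacement. */
uint32_t effective_address(uint32_t ebp, int32_t disp)
{
    return ebp + (uint32_t)disp;
}

/* The same address computed like the three-instruction version. */
uint32_t manual_address(uint32_t ebp)
{
    uint32_t eax = ebp;   /* mov eax, ebp */
    eax -= 0x4;           /* sub eax, 0x4 */
    return eax;           /* address used by the final store */
}

/* Store a dword (4 bytes, least significant byte first, as on x86)
 * into the simulated memory array at offset `addr`. */
void store_dword(uint8_t *mem, size_t mem_size, uint32_t addr, uint32_t value)
{
    assert(mem_size >= 4 && addr <= mem_size - 4);   /* stay inside the array */
    for (size_t i = 0; i < 4; i++) {
        mem[addr + i] = (uint8_t)(value >> (8 * i));
    }
}
```

For example, with a 32-byte array and ebp set to 16, both address functions return 12, and store_dword(mem, 32, 12, 0xbaba) writes the bytes ba ba 00 00 at offsets 12 to 15, leaving offset 16 untouched. That is the dword width at work: exactly four bytes are written, even though the value itself fits in two.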