Writing a Simple Operating System — from Scratch

CHAPTER 5. WRITING, BUILDING, AND LOADING YOUR KERNEL

00000000  55                push ebp
00000001  89E5              mov ebp,esp
00000003  83EC10            sub esp,byte +0x10
00000006  C745FCBABA0000    mov dword [ebp-0x4],0xbaba
0000000D  8B45FC            mov eax,[ebp-0x4]
00000010  C9                leave
00000011  C3                ret

The only difference now is that we actually allocate a local variable, myvar, but
this provokes an interesting response from the compiler. As before, the stack frame is
established, but then we see sub esp, byte +0x10, which subtracts 16 (0x10) bytes
from the stack pointer. We have to (constantly) remind ourselves that the stack
grows downwards in terms of memory addresses, so in simpler terms this instruction
means, ‘allocate another 16 bytes on the top of the stack’. We are storing an int, which is
a 4-byte (32-bit) data type, so why have 16 bytes been allocated on the stack for this
variable, and why not use push, which allocates new stack space automatically? The
reason the compiler manipulates the stack in this way is one of optimisation, since CPUs
often operate less efficiently on a datatype that is not aligned on memory boundaries that
are multiples of the datatype’s size [?]. Since the compiler would like all variables to be
properly aligned, it reserves stack space in multiples of the largest alignment it may need
(here, 16 bytes), at the cost of wasting some memory.
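
The 16-byte reservation can be reproduced with a little arithmetic. The following is a minimal sketch, assuming the compiler simply rounds the total size of a function's locals up to the next multiple of its stack alignment; the helper name reserved_stack_bytes and the idea of passing the alignment as a parameter are assumptions made for illustration, not something taken from the compiler itself.

```c
#include <stddef.h>

/* Hypothetical helper: round the bytes needed for local variables up to
 * the next multiple of `alignment`, assumed to be a power of two. */
size_t reserved_stack_bytes(size_t locals_size, size_t alignment)
{
    return (locals_size + alignment - 1) & ~(alignment - 1);
}
```

With a 16-byte alignment, reserved_stack_bytes(4, 16) evaluates to 16, which matches the sub esp, byte +0x10 in the listing for a single 4-byte int. Under the same assumption, a function with 20 bytes of locals would reserve 32.
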
The next instruction, mov dword [ebp-0x4],0xbaba, actually stores our variable’s
value in the newly allocated space on the stack, but without using push, for the previously
given reason of stack efficiency (i.e. the size of the datatype stored is less than the stack
space reserved). We understand the general use of the mov instruction, but two things
that deserve some explanation here are the use of dword and [ebp-0x4]:

dword states explicitly that we are storing a double word (i.e. 4 bytes) on the
stack, which is the size of our int datatype. So the actual bytes stored would
be 0x0000baba, but without being explicit could easily be 0xbaba (i.e. 2 bytes)
or 0x000000000000baba (i.e. 8 bytes), which, although the same value, have
different ranges.

[ebp-0x4] is an example of a modern CPU shortcut called effective address
computation [?], which is more impressive than the assembly code seems to reflect.
This part of the instruction references an address that is calculated on-the-fly by
the CPU, based on the address currently held in register ebp. At a glance, we might
think our assembler is manipulating a constant value, as it would if we wrote
something like mov ax, 0x5000 + 0x20, where our assembler would simply
pre-process this into mov ax, 0x5020. But only once the code is run would the
value of any register be known, so this definitely is not pre-processing; it forms a
part of the CPU instruction. With this form of addressing the CPU is allowing
us to do more per instruction cycle, and is a good example of how CPU hardware
has adapted to better suit programmers. We could write the equivalent, without
such address manipulation, less efficiently in the following three lines of code:

mov eax, ebp              ; EAX = EBP
sub eax, 0x4              ; EAX = EAX - 0x4
mov dword [eax], 0xbaba   ; store the dword 0xbaba at address EAX
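
The two points above, an explicit store width and an address formed from a base register plus a displacement, can also be modelled in C. This is only a sketch under stated assumptions: memory is a plain byte array, ebp is an index into that array rather than a real register, and the function names store_sized, store_direct and store_stepwise are hypothetical. A C compiler is free to choose its own addressing for this code, so it illustrates the idea rather than the exact machine instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store `value` using exactly `width` bytes (2, 4 or 8), in the spirit of
 * the word, dword and qword size specifiers. Returns the number of bytes
 * written, or 0 if the width is not supported. */
size_t store_sized(unsigned char *dest, uint64_t value, size_t width)
{
    uint16_t w = (uint16_t)value;
    uint32_t d = (uint32_t)value;

    switch (width) {
    case 2:  memcpy(dest, &w, sizeof w);         return 2;
    case 4:  memcpy(dest, &d, sizeof d);         return 4;
    case 8:  memcpy(dest, &value, sizeof value); return 8;
    default: return 0;
    }
}

/* One-step form, like mov dword [ebp-0x4], value: the address ebp - 4
 * is computed as part of the store itself. */
void store_direct(unsigned char *memory, size_t ebp, uint32_t value)
{
    store_sized(&memory[ebp - 4], value, 4);
}

/* Three-step form, like the longer assembly sequence. */
void store_stepwise(unsigned char *memory, size_t ebp, uint32_t value)
{
    size_t eax = ebp;                    /* mov eax, ebp */
    eax -= 4;                            /* sub eax, 0x4 */
    store_sized(&memory[eax], value, 4); /* store a dword at address eax */
}
```

Both store_direct and store_stepwise leave the same four bytes in memory. The width argument decides how many neighbouring bytes are overwritten, which is why the assembler needs dword to be stated when the destination is a memory address and the source is an immediate value.
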
