Writing a Simple Operating System — from Scratch

(Jeff_L) #1

CHAPTER 5. WRITING, BUILDING, AND LOADING YOUR KERNEL


$ ld -o basic.bin -Ttext 0x0 --oformat binary basic.o
Note that, like the compiler, the linker can output executable files in various
formats, some of which may retain metadata from the input object files. This is useful
for executables that are hosted by an operating system, such as the majority of programs
we will write on a platform such as Linux or Windows, since metadata can be retained
to describe how those applications are to be loaded into memory. It is also useful for
debugging: the information that a process crashed at instruction address 0x12345678
is far less useful to a programmer than information, presented using redundant,
non-executable metadata, that a process crashed in function myfunction, file
basic.c, on line 3.
Anyhow, since we are interested in writing an operating system, it would be no good
trying to run machine code intermingled with metadata on our CPU, since the CPU,
unaware of the difference, will execute every byte as machine code. This is why we
specify an output format of (raw) binary.
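
To make the difference concrete, here is a minimal sketch of how a program might tell
a hosted executable from a raw binary. It assumes the hosted format is ELF, the usual
default on Linux, whose files begin with the four bytes 0x7F 'E' 'L' 'F'. The function
name looks_like_elf is hypothetical and is not part of any toolchain.

```c
#include <stddef.h>

/* Returns 1 if the buffer begins with the four-byte ELF magic number
   (0x7F 'E' 'L' 'F'), and 0 otherwise. Assumption: the hosted format is ELF. */
int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };
    size_t i;

    /* Too short to hold the magic number, so it cannot be an ELF file. */
    if (buf == NULL || len < sizeof magic)
        return 0;

    for (i = 0; i < sizeof magic; i++) {
        if (buf[i] != magic[i])
            return 0;
    }
    return 1;
}
```

A raw binary has no such header: its very first byte is already an instruction, which
is exactly what we want the CPU to see.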
The other option we used was -Ttext 0x0, which works in the same way as the org
directive we used in our earlier assembly routines: it lets us tell the linker
to offset label addresses in our code (e.g. for any data we specify in the code, such as
strings like "Hello, World") to their absolute memory addresses when later loaded
to a specific origin in memory. For now this is not important, but when we come to load
kernel code into memory, it is important that we set this to the address we plan to load
to.
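
The arithmetic behind this offsetting is simple, and a small sketch may help. The
function name absolute_address and the example values below are hypothetical,
chosen only for illustration.

```c
#include <stdint.h>

/* Absolute address of a label, given the origin the code is loaded at
   and the label's offset from the start of the image. */
uint32_t absolute_address(uint32_t origin, uint32_t offset)
{
    return origin + offset;
}
```

With an origin of 0x0, as in our command, a label at offset 0x24 resolves to 0x24. If
the same code were instead to be loaded at, say, 0x1000, the label would have to
resolve to 0x1024, or any reference to it would point at the wrong memory.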
Now we have successfully compiled the C code into a raw machine code file that
we could (once we have figured out how to load it) run on our CPU, so let’s see what
it looks like. Luckily, since assembly maps very closely to machine code instructions, if
you are ever given a file containing only machine code, you can easily disassemble it to
view it in assembly. Ah, yes; this is another benefit of understanding a little assembly,
because you can potentially reverse-engineer any software that lands on your lap minus
the original source code, even more successfully if the developer left in some metadata
for you, which they nearly always do. The only problem with disassembling machine
code is that some of those bytes may have been reserved as data but will show up as
assembly instructions, though in our simple C program we didn’t declare any data. To
see what machine code the compiler actually generated from our C source code, run the
following command:


$ ndisasm -b 32 basic.bin > basic.dis
The -b 32 simply tells the disassembler to decode to 32-bit assembly instructions,
which is what our compiler generates. Figure XXX shows the assembly code generated
by gcc for our simple C program.


00000000  55            push ebp
00000001  89E5          mov ebp,esp
00000003  B8BABA0000    mov eax,0xbaba
00000008  5D            pop ebp
00000009  C3            ret
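
The byte column of this listing can be checked by hand. The sketch below copies the ten
bytes into an array and decodes the operand of the mov eax instruction, which x86
stores as a 32-bit little-endian value after the opcode 0xB8. The helper
read_imm32_le and the array name basic_bin are hypothetical, and my_function is
only an assumed reconstruction of the kind of C source that would plausibly compile
to this code.

```c
#include <stdint.h>

/* The machine code bytes from the disassembly listing, in order. */
static const unsigned char basic_bin[] = {
    0x55,                         /* push ebp        */
    0x89, 0xE5,                   /* mov ebp,esp     */
    0xB8, 0xBA, 0xBA, 0x00, 0x00, /* mov eax,0xbaba  */
    0x5D,                         /* pop ebp         */
    0xC3                          /* ret             */
};

/* Reads a 32-bit little-endian immediate starting at p. */
uint32_t read_imm32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Assumed reconstruction of the C source: a function that simply
   returns the constant seen in the listing. */
int my_function(void)
{
    return 0xbaba;
}
```

Here read_imm32_le(&basic_bin[4]) yields 0xbaba, the same value that my_function
returns, which shows how the bytes BA BA 00 00 in the file correspond to the constant
in the source.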

So here it is: gcc generated some assembly code not too dissimilar to that which we
have been writing ourselves already. The three columns output from the disassembler,
