Chapter 3: Memory Management
Recall that page tables are used to establish an association between the virtual address spaces of user
processes and the physical memory of the system (RAM, page frames). The structures discussed so far
serve to describe the structure of RAM memory (partitioning into nodes and zones) and to specify the
number and state (used or free) of the page frames contained. Page tables are used to make a uniform
virtual address space available to each process; the applications see this space as a contiguous memory
area. The tables also map the virtual pages used into RAM, thus supporting the implementation of shared
memory (memory shared by several processes at the same time) and the swapping-out of pages to a block
device to increase the effective size of usable memory without the need for additional physical RAM.
Kernel memory management assumes four-level page tables, regardless of whether this is the case
for the underlying processor. The best example where this assumption is not true is IA-32 systems. By
default (that is, when the PAE extensions are not used), this architecture uses only a two-level paging
system. Consequently, the third and fourth levels must be emulated by architecture-specific code.
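One way to picture such an emulation is to let the levels that have no hardware counterpart pass the entry of the level above straight through, so that generic code can always take four steps. The following is a minimal sketch of that idea under stated assumptions, not the kernel's actual code: all names (`fold_top`, `fold_walk`, and so on) are hypothetical, and the layout (10 index bits per real table, 4 KiB pages) is assumed purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define FOLD_PAGE_SHIFT 12    /* assumed: 4 KiB pages */
#define FOLD_TOP_SHIFT  22    /* assumed: top-level index starts at bit 22 */
#define FOLD_ENTRIES    1024  /* assumed: entries per real table */

typedef unsigned long fold_entry_t;   /* hypothetical last-level entry */

/* The only real upper table: each slot points to a last-level table or is NULL. */
struct fold_top {
    fold_entry_t *tables[FOLD_ENTRIES];
};

/* Level 1 (real): select a slot in the top-level table. */
static fold_entry_t **fold_top_offset(struct fold_top *top, unsigned long addr)
{
    return &top->tables[(addr >> FOLD_TOP_SHIFT) & (FOLD_ENTRIES - 1)];
}

/* Level 2 (emulated): no table of its own, so hand back the entry from above. */
static fold_entry_t **fold_upper_offset(fold_entry_t **top_entry, unsigned long addr)
{
    (void)addr;
    return top_entry;
}

/* Level 3 (emulated): same pass-through as level 2. */
static fold_entry_t **fold_middle_offset(fold_entry_t **upper_entry, unsigned long addr)
{
    (void)addr;
    return upper_entry;
}

/* Level 4 (real): select a slot in the last-level table. */
static fold_entry_t *fold_last_offset(fold_entry_t **middle_entry, unsigned long addr)
{
    return &(*middle_entry)[(addr >> FOLD_PAGE_SHIFT) & (FOLD_ENTRIES - 1)];
}

/* Generic walk: always four steps, regardless of how many levels are real.
 * Returns NULL if no last-level table is installed for the address. */
static fold_entry_t *fold_walk(struct fold_top *top, unsigned long addr)
{
    assert(top != NULL);
    fold_entry_t **entry = fold_top_offset(top, addr);
    entry = fold_upper_offset(entry, addr);
    entry = fold_middle_offset(entry, addr);
    if (*entry == NULL)
        return NULL;
    return fold_last_offset(entry, addr);
}
```

The walk function never needs to know which levels exist in hardware; only the per-level helpers differ between configurations, which is the property the four-level assumption relies on.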
Page table management is split into two parts, the first architecture-dependent, the second architecture-
independent. Interestingly, all data structures and almost all functions to manipulate them are defined in
architecture-specific files. Because there are some big differences between CPU-specific implementations
(owing to the various CPU concepts used), I won’t go into the low-level details for the sake of brevity.
Extensive knowledge of the individual processors is also required, and the hardware documentation for
each processor family is generally spread over several books. Appendix A describes the IA-32 architecture
in more detail. It also discusses, at least in summary form, the architecture of the other important
processors supported by Linux.
The descriptions of data structures and functions in the following sections are usually based on the
interfaces provided by the architecture-dependent files. The definitions can be found in the header files
include/asm-arch/page.h and include/asm-arch/pgtable.h, referred to in abbreviated form as page.h
and pgtable.h below. Since AMD64 and IA-32 are unified into one architecture but exhibit a good many
differences when it comes to handling page tables, the definitions can be found in two different files:
include/asm-x86/page_32.h and include/asm-x86/page_64.h, and similarly for pgtable_XX.h. When
aspects relating to a specific architecture are discussed, I make explicit reference to the architecture. All
other information is equally valid for all architectures even if the definitions of the associated structures
are architecture-specific.
3.3.1 Data Structures
In C, the void * data type is used to specify a pointer to any byte position in memory. The number of bits
required differs according to architecture. All common processors (including all those on which Linux
runs) use either 32 or 64 bits.
The kernel sources assume that void * and unsigned long have the same number of bits so that they can
be mutually converted by means of typecasts without loss of information. This assumption, expressed
formally as sizeof(void *) == sizeof(unsigned long), is, of course, true on all architectures
supported by Linux.
Memory management prefers to use variables of type unsigned long instead of void pointers because
they are easier to handle and manipulate. Technically, they are both equally valid.
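The convenience of unsigned long becomes visible as soon as bit operations are needed, since C does not permit masking or shifting a pointer directly. The following sketch checks the size assumption at compile time and then manipulates an address as an integer. The names `example_addr_of`, `example_page_start`, `example_page_offset`, and the 4 KiB `EXAMPLE_PAGE_SIZE` are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <assert.h>

/* True on the data models used by Linux; the C standard itself does not
 * guarantee it, so the build fails loudly where the assumption breaks. */
_Static_assert(sizeof(void *) == sizeof(unsigned long),
               "void * and unsigned long must have the same size");

#define EXAMPLE_PAGE_SIZE 4096UL   /* assumed page size: 4 KiB */

/* Convert a pointer to an integer address and verify the round trip. */
static unsigned long example_addr_of(const void *ptr)
{
    unsigned long addr = (unsigned long)ptr;
    assert((const void *)addr == ptr);   /* no information lost */
    return addr;
}

/* Address of the first byte of the page that contains ptr. */
static unsigned long example_page_start(const void *ptr)
{
    return example_addr_of(ptr) & ~(EXAMPLE_PAGE_SIZE - 1);
}

/* Byte position of ptr within its page. */
static unsigned long example_page_offset(const void *ptr)
{
    return example_addr_of(ptr) & (EXAMPLE_PAGE_SIZE - 1);
}
```

For the address 0x12345678, the page start is 0x12345000 and the offset within the page is 0x678.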
Breakdown of Addresses in Memory
Addresses in virtual memory are split into five parts as required by the structure of the four-level
page tables (four table entries to select the page and an index to indicate the position within the page).
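The five-part split can be sketched with shifts and masks. The widths used here (four 9-bit table indices and a 12-bit offset, as with 4 KiB pages and 48-bit virtual addresses on AMD64) are an assumption chosen for illustration; other architectures use other widths. The names `split_parts`, `split_addr`, and `join_addr` are hypothetical, and `uint64_t` is used instead of unsigned long only so that the sketch behaves the same on 32-bit hosts.

```c
#include <assert.h>
#include <stdint.h>

#define SPLIT_OFFSET_BITS 12   /* assumed: 4 KiB pages */
#define SPLIT_INDEX_BITS   9   /* assumed: 512 entries per table */
#define SPLIT_OFFSET_MASK ((UINT64_C(1) << SPLIT_OFFSET_BITS) - 1)
#define SPLIT_INDEX_MASK  ((UINT64_C(1) << SPLIT_INDEX_BITS) - 1)

struct split_parts {
    unsigned idx[4];   /* idx[0]: top-level table ... idx[3]: last-level table */
    unsigned offset;   /* byte position within the page */
};

/* Break an address into four table indices and a page offset.
 * Bits above the 48 covered by these widths are ignored. */
static struct split_parts split_addr(uint64_t addr)
{
    struct split_parts p;
    p.offset = (unsigned)(addr & SPLIT_OFFSET_MASK);
    for (int level = 3; level >= 0; level--) {
        unsigned shift = SPLIT_OFFSET_BITS + (3 - level) * SPLIT_INDEX_BITS;
        p.idx[level] = (unsigned)((addr >> shift) & SPLIT_INDEX_MASK);
    }
    return p;
}

/* Inverse operation: reassemble the address from its five parts. */
static uint64_t join_addr(struct split_parts p)
{
    uint64_t addr = 0;
    for (int level = 0; level < 4; level++) {
        assert(p.idx[level] <= SPLIT_INDEX_MASK);
        addr = (addr << SPLIT_INDEX_BITS) | p.idx[level];
    }
    assert(p.offset <= SPLIT_OFFSET_MASK);
    return (addr << SPLIT_OFFSET_BITS) | p.offset;
}
```

Splitting an address and joining the parts again yields the original value for any address that fits in the 48 bits covered, which is a quick way to confirm that the five fields neither overlap nor leave gaps.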