Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 3: Memory Management


PAGE_ALIGN is another standard macro that must be defined by each architecture (typically in page.h). It
expects an address as parameter and "rounds" the address up so that it lies exactly at the start of the next page
(an address that is already page-aligned is returned unchanged). If the page size is 4,096, the macro always
returns an integer multiple of this size: PAGE_ALIGN(6000) = 8192 = 2 × 4,096, and
PAGE_ALIGN(0x84590860) = 0x84591000 = 542,097 × 4,096. The alignment of addresses
to page boundaries is important to ensure that best use is made of the cache resources of the processor.
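
The rounding can be reproduced in a few lines of user-space C. The following is a minimal sketch, not the definition from any particular kernel version: it assumes a page size of 4,096 bytes, and the values chosen here for PAGE_SHIFT, PAGE_SIZE, and PAGE_MASK are assumptions for illustration (a real kernel takes them from the architecture's page.h). The function page_align_examples is a hypothetical helper that re-checks the figures quoted above.

```c
#include <assert.h>

/* Assumed values for a 4 KiB page; real kernels define these per architecture. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)      /* 4,096 bytes */
#define PAGE_MASK  (~(PAGE_SIZE - 1))       /* clears the offset bits within a page */

/* Round addr up to a page boundary; an aligned address is returned unchanged. */
#define PAGE_ALIGN(addr) (((addr) + PAGE_SIZE - 1) & PAGE_MASK)

/* Hypothetical helper: re-checks the two examples given in the text. */
void page_align_examples(void)
{
    assert(PAGE_ALIGN(6000UL) == 8192UL);              /* 2 x 4,096 */
    assert(PAGE_ALIGN(0x84590860UL) == 0x84591000UL);  /* 542,097 x 4,096 */
    assert(PAGE_ALIGN(0x84591000UL) == 0x84591000UL);  /* already aligned */
}
```

Adding PAGE_SIZE - 1 before masking is what makes the macro round up rather than down; masking alone would yield the start of the page that contains the address.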

Although C structures are used to represent entries in page tables, most consist of just a single
element, typically unsigned long, as an example from the AMD64 architecture shows:^9

include/asm-x86_64/page.h
typedef struct { unsigned long pte; } pte_t;
typedef struct { unsigned long pmd; } pmd_t;
typedef struct { unsigned long pud; } pud_t;
typedef struct { unsigned long pgd; } pgd_t;

structs are used instead of elementary types to ensure that the contents of page table elements are
handled only by the associated helper functions and never directly. The entries may also be constructed
of several elementary variables. In this case, the kernel is obliged to use a struct.^10
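
The effect of the wrapper can be seen in a small user-space sketch. It is an illustration only: pte_val mirrors the name of the kernel's accessor, while make_pte is a hypothetical stand-in for the kernel's constructor helper, and wrapper_example is an invented function that exercises both.

```c
#include <assert.h>

/* Single-member wrapper, as in the AMD64 excerpt. */
typedef struct { unsigned long pte; } pte_t;

/* Read the raw value out of an entry. */
static inline unsigned long pte_val(pte_t entry)
{
    return entry.pte;
}

/* Hypothetical constructor: wrap a raw value into an entry. */
static inline pte_t make_pte(unsigned long value)
{
    pte_t entry = { value };
    return entry;
}

/* Hypothetical usage example. */
void wrapper_example(void)
{
    pte_t entry = make_pte(0x1000UL);

    /* "pte_t bad = 0x1000UL;" or "entry + 1" would not compile: the struct
     * type keeps raw integers and entries from being mixed by accident. */
    assert(pte_val(entry) == 0x1000UL);
}
```

Because every access has to go through a helper, an architecture whose entries consist of two words can change the struct and the helpers without touching the code that uses them.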

The virtual address is split into several parts that are used as an index into the page
table in accordance with the familiar scheme. The individual parts are therefore less
than 32 or 64 bits long, depending on the word length of the architecture used. As
the excerpt from the kernel sources shows, the kernel (and therefore also the
processor) uses 32- or 64-bit types to represent entries in the page tables (regardless
of table level). This means that not all bits of a table entry are required to store the
useful data — that is, the base address of the next table. The superfluous bits are
used to hold additional information. Appendix A describes the structure of the page
tables on various architectures in detail.
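
As a rough illustration of the splitting scheme, the sketch below breaks a 64-bit virtual address into four table indices and a page offset. The bit widths are an assumption made for this example (12 offset bits and 9 bits per table level, which corresponds to a four-level layout with 4 KiB pages); other architectures use other widths. The names vaddr_parts, table_index, split_vaddr, and split_vaddr_example are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 12 offset bits plus four table levels of 9 bits each. */
#define PAGE_SHIFT     12
#define BITS_PER_LEVEL 9
#define INDEX_MASK     ((1u << BITS_PER_LEVEL) - 1)  /* 0x1ff */
#define OFFSET_MASK    ((1u << PAGE_SHIFT) - 1)      /* 0xfff */

/* Hypothetical container for the parts of one virtual address. */
struct vaddr_parts {
    unsigned pgd;     /* index into the top-level table */
    unsigned pud;
    unsigned pmd;
    unsigned pte;     /* index into the last-level table */
    unsigned offset;  /* byte offset inside the page */
};

/* Extract the table index for one level: 0 = PTE level, 3 = PGD level. */
static inline unsigned table_index(uint64_t vaddr, unsigned level)
{
    unsigned shift = PAGE_SHIFT + level * BITS_PER_LEVEL;
    return (unsigned)((vaddr >> shift) & INDEX_MASK);
}

struct vaddr_parts split_vaddr(uint64_t vaddr)
{
    struct vaddr_parts parts;

    parts.pgd    = table_index(vaddr, 3);
    parts.pud    = table_index(vaddr, 2);
    parts.pmd    = table_index(vaddr, 1);
    parts.pte    = table_index(vaddr, 0);
    parts.offset = (unsigned)(vaddr & OFFSET_MASK);
    return parts;
}

/* Hypothetical usage: build an address from known parts and split it again. */
void split_vaddr_example(void)
{
    uint64_t vaddr = (1ULL << 39) | (2ULL << 30) | (3ULL << 21) | (4ULL << 12) | 5;
    struct vaddr_parts p = split_vaddr(vaddr);

    assert(p.pgd == 1 && p.pud == 2 && p.pmd == 3 && p.pte == 4 && p.offset == 5);
}
```

Each index needs only 9 bits in this layout, whereas a table entry occupies a full word, which is why bits are left over for additional information.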

PTE-Specific Entries


Each final entry in the page table not only yields a pointer to the memory location of the page, but also
holds additional information on the page in the superfluous bits mentioned above. Although these data
are CPU-specific, they usually provide at least some information on page access control. The following
elements are found in most CPUs supported by the Linux kernel:

❑ _PAGE_PRESENT specifies whether the virtual page is present in RAM. This need not
necessarily be the case because pages may be swapped out into a swap area as noted briefly in
Chapter 1.
The structure of the page table entry is usually different if the page is not present in memory
because there is no need to describe the position of the page in memory. Instead, information is
needed to identify and find the swapped-out page.
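
The two interpretations of an entry can be sketched as follows. The encoding is purely hypothetical and chosen for illustration: bit 0 plays the role of _PAGE_PRESENT, a present entry keeps the page address in the bits above the 12-bit offset, and a non-present entry stores a made-up swap slot number in the remaining bits. Real architectures define their own layouts, and every DEMO_ and demo_ name below is invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { unsigned long pte; } pte_t;

/* Hypothetical layout: bit 0 = present flag, bits 12 and up = page address. */
#define DEMO_PAGE_SHIFT   12
#define DEMO_PAGE_MASK    (~((1UL << DEMO_PAGE_SHIFT) - 1))
#define DEMO_PAGE_PRESENT 0x1UL

/* Entry for a page that is in memory: page address plus the present flag. */
pte_t demo_mk_present(unsigned long page_addr)
{
    pte_t entry = { (page_addr & DEMO_PAGE_MASK) | DEMO_PAGE_PRESENT };
    return entry;
}

/* Entry for a swapped-out page: present flag clear, slot number in the other bits. */
pte_t demo_mk_swapped(unsigned long slot)
{
    pte_t entry = { slot << 1 };
    return entry;
}

bool demo_pte_present(pte_t entry)
{
    return (entry.pte & DEMO_PAGE_PRESENT) != 0;
}

/* Meaningful only if demo_pte_present() is true. */
unsigned long demo_pte_page(pte_t entry)
{
    return entry.pte & DEMO_PAGE_MASK;
}

/* Meaningful only if demo_pte_present() is false. */
unsigned long demo_pte_slot(pte_t entry)
{
    return entry.pte >> 1;
}

/* Hypothetical usage example. */
void present_example(void)
{
    pte_t in_ram  = demo_mk_present(0x84591000UL);
    pte_t on_swap = demo_mk_swapped(42UL);

    assert(demo_pte_present(in_ram) && demo_pte_page(in_ram) == 0x84591000UL);
    assert(!demo_pte_present(on_swap) && demo_pte_slot(on_swap) == 42UL);
}
```

The present flag has to be tested first, because the same bits mean a page address in one case and a swap location in the other.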

(^9) The definitions for IA-32 are similar. However, only pte_t and pgd_t, which are defined as unsigned long, make an effective
contribution. I use the code example for AMD64 because it is more regular.
(^10) When IA-32 processors use PAE mode, they define pte_t as, for example, typedef struct { unsigned long pte_low,
pte_high; }. 32 bits are then no longer sufficient to address the complete memory because more than 4 GiB can be managed in
this mode. In other words, the available amount of memory can be larger than the processor’s address space.
Since pointers are, however, still only 32 bits wide, an appropriate subset of the enlarged memory space must be chosen for userspace
applications, each of which still sees only 4 GiB.
