Linux Kernel Architecture


Chapter 3: Memory Management


❑ _PAGE_ACCESSED is set automatically by the CPU each time the page is accessed. The kernel
regularly checks the field to establish how actively the page is used (infrequently used pages are
good swapping candidates). The bit is set after either read or write access.
❑ _PAGE_DIRTY indicates whether the page is "dirty," that is, whether the page contents have been
modified.
❑ _PAGE_FILE has the same numerical value as _PAGE_DIRTY, but is used in a different context,
namely, when a page is not present in memory. Obviously, a page that is not present cannot
be dirty, so the bit can be reinterpreted: If it is not set, then the entry points to the location of
a swapped-out page (see Chapter 18). A set _PAGE_FILE is required for entries that belong to
nonlinear file mappings, which are discussed in Section 4.7.3.
❑ If _PAGE_USER is set, userspace code is allowed to access the page. Otherwise, only the kernel
(that is, code running while the CPU is in system mode) is allowed to do so.
❑ _PAGE_READ, _PAGE_WRITE, and _PAGE_EXECUTE specify whether normal user processes are
allowed to read the page, write to the page, or execute the machine code in the page.
Pages from kernel memory must be protected against writing by user processes.
There is, however, no assurance that even pages belonging to user processes can be written to,
for example, if the page contains executable code that may not be modified, either intentionally
or unintentionally.
Architectures that feature less finely grained access rights define the _PAGE_RW constant to allow
or disallow read and write access in combination if no further criterion is available to distinguish
between the two.
❑ IA-32 and AMD64 provide _PAGE_BIT_NX to label the contents of a page as not executable (this
protection bit is only available on IA-32 systems if the physical address extension (PAE) for
addressing 64 GiB of memory is enabled). It can prevent, for example, execution of code on stack
pages, which can otherwise open security gaps in programs when an attacker deliberately provokes
a buffer overflow to introduce malicious code. The NX bit cannot prevent a buffer overflow but can
suppress its effects because the processor refuses to run the malicious code. Of course, the
same result can also be achieved if the architectures themselves provide a good set of access
authorization bits for memory pages, as is the case with some (unfortunately not very common)
processors.
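
The interplay of the present, dirty, and file bits described in the list above can be modelled in a few lines of userspace C. This is only a sketch: the DEMO_ names, the demo_pte_t type, and the helper functions are hypothetical stand-ins, the bit values are illustrative (loosely following the IA-32 layout) rather than copied from any kernel header, and treating an all-zero entry as "nothing mapped" is an assumption added to keep the model complete.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; hypothetical names, not the kernel's definitions. */
#define DEMO_PAGE_PRESENT  0x001u
#define DEMO_PAGE_RW       0x002u
#define DEMO_PAGE_USER     0x004u
#define DEMO_PAGE_ACCESSED 0x020u
#define DEMO_PAGE_DIRTY    0x040u
#define DEMO_PAGE_FILE     0x040u /* same bit as DIRTY; meaningful only if not present */

typedef uint32_t demo_pte_t;

enum demo_pte_kind {
    DEMO_PTE_NONE,           /* assumption: all-zero entry, nothing mapped */
    DEMO_PTE_RESIDENT,       /* page is in memory */
    DEMO_PTE_SWAPPED,        /* not present, FILE bit clear */
    DEMO_PTE_NONLINEAR_FILE  /* not present, FILE bit set */
};

/* Decide how the shared DIRTY/FILE bit has to be interpreted. */
enum demo_pte_kind demo_classify(demo_pte_t pte)
{
    if (pte == 0)
        return DEMO_PTE_NONE;
    if (pte & DEMO_PAGE_PRESENT)
        return DEMO_PTE_RESIDENT;
    return (pte & DEMO_PAGE_FILE) ? DEMO_PTE_NONLINEAR_FILE : DEMO_PTE_SWAPPED;
}

/* The dirty bit only carries its usual meaning for resident pages. */
bool demo_is_dirty(demo_pte_t pte)
{
    return (pte & DEMO_PAGE_PRESENT) && (pte & DEMO_PAGE_DIRTY);
}

/* Userspace may touch the page only if it is resident and USER is set. */
bool demo_user_may_access(demo_pte_t pte)
{
    return (pte & DEMO_PAGE_PRESENT) && (pte & DEMO_PAGE_USER);
}
```

In this model, demo_classify(DEMO_PAGE_FILE) yields DEMO_PTE_NONLINEAR_FILE, whereas the same bit combined with DEMO_PAGE_PRESENT is read as a dirty resident page.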

Each architecture must provide two things to allow memory management to modify the additional bits
in pte_t entries: the data type __pgprot in which the additional bits are held, and the pte_modify
function to modify the bits. The above pre-processor symbols are used to select the appropriate entry.
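
One way to picture the division of labour between a protection-bit type and a modify function is the following userspace sketch. It is a simplified model under stated assumptions, not the kernel's implementation: demo_pgprot_t, demo_pte_modify, the DEMO_ masks, and the 32-bit layout with a 20-bit frame number are all hypothetical, and the choice to preserve the accessed and dirty bits across the update is an assumption of this sketch.

```c
#include <stdint.h>

/* Illustrative flag values; hypothetical names, not the kernel's definitions. */
#define DEMO_PAGE_PRESENT  0x001u
#define DEMO_PAGE_RW       0x002u
#define DEMO_PAGE_USER     0x004u
#define DEMO_PAGE_ACCESSED 0x020u
#define DEMO_PAGE_DIRTY    0x040u

/* Assumed layout: upper 20 bits hold the page frame, lower 12 bits hold flags. */
#define DEMO_FRAME_MASK 0xFFFFF000u

/* Bits that a protection change must leave untouched (assumption of this sketch). */
#define DEMO_KEEP_MASK (DEMO_FRAME_MASK | DEMO_PAGE_ACCESSED | DEMO_PAGE_DIRTY)

typedef uint32_t demo_pte_t;

/* Wrapping the bits in a struct keeps them from being mixed up with a full entry. */
typedef struct {
    uint32_t bits;
} demo_pgprot_t;

/* Replace the protection bits of an entry while keeping frame and usage state. */
demo_pte_t demo_pte_modify(demo_pte_t pte, demo_pgprot_t newprot)
{
    return (pte & DEMO_KEEP_MASK) | (newprot.bits & ~DEMO_KEEP_MASK);
}
```

For example, applying a protection value of DEMO_PAGE_PRESENT | DEMO_PAGE_USER to a writable entry clears the write permission but leaves the frame address and the dirty state as they were.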


The kernel also defines various functions to query and set the architecture-dependent state of memory
pages. Not all functions can be defined by all processors because of lack of hardware support for a given
feature.


❑ pte_present checks if the page to which the page table entry points is present in memory. This
function can, for instance, be used to detect if a page has been swapped out.
❑ pte_dirty checks if the page associated with the page table entry is dirty, that is, its contents
have been modified since the kernel checked last time. Note that this function may only be called
if pte_present has ensured that the page is available.
❑ pte_write checks if the kernel may write to the page.
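
The calling convention for these query functions, in particular the rule that the dirty check is only valid after a presence check, can be sketched in userspace C as follows. The demo_ functions and DEMO_ flag values are hypothetical stand-ins written for this illustration; real architectures implement the corresponding checks in their own headers and may test additional bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; hypothetical names, not the kernel's definitions. */
#define DEMO_PAGE_PRESENT 0x001u
#define DEMO_PAGE_RW      0x002u
#define DEMO_PAGE_DIRTY   0x040u

typedef uint32_t demo_pte_t;

/* Is the page the entry points to resident in memory? */
bool demo_pte_present(demo_pte_t pte)
{
    return (pte & DEMO_PAGE_PRESENT) != 0;
}

/* Dirty check; the precondition from the text is enforced with an assertion. */
bool demo_pte_dirty(demo_pte_t pte)
{
    assert(demo_pte_present(pte)); /* only meaningful for resident pages */
    return (pte & DEMO_PAGE_DIRTY) != 0;
}

/* Is write access to the page permitted? */
bool demo_pte_write(demo_pte_t pte)
{
    return (pte & DEMO_PAGE_RW) != 0;
}

/* Typical caller pattern: test presence first, then query the dirty bit. */
bool demo_needs_writeback(demo_pte_t pte)
{
    return demo_pte_present(pte) && demo_pte_dirty(pte);
}
```

Because demo_needs_writeback short-circuits on the presence check, it never trips the assertion in demo_pte_dirty, even for an entry whose dirty bit position is reused for another purpose while the page is not in memory.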