requiring an inordinate amount of time to organize data structures? The simplest LRU variant uses a
(doubly) linked list with all the pages in the system. This list is resorted each time memory is accessed.
The page in question is found and moved to the start of the list. In the course of time, this results in a kind
of ‘‘equilibrium’’ in which frequently used pages are at the beginning of the list and least used pages are
right at the end (a similar algorithm is used to manage the buffer caches discussed in Chapter 16).

The algorithm works beautifully but can only be implemented effectively for a small number of elements.
This means that it cannot be used in its unadulterated form for memory management as this would be
far too costly in terms of system performance. Simpler implementations that consume less CPU time are
therefore required.
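
The following user-space sketch illustrates the list-based variant just described; it is not kernel code, and the structure names lru_page and lru_list are invented solely for illustration. Every access moves the touched page to the head of a doubly linked list, and the eviction candidate is simply taken from the tail:

#include <stddef.h>

/* Minimal sketch of the list-based LRU scheme. The names lru_page and
 * lru_list are made up for illustration; the kernel uses its own
 * struct page and list_head infrastructure instead. */
struct lru_page {
    struct lru_page *prev, *next;
    unsigned long pfn;                /* page frame number */
};

struct lru_list {
    struct lru_page *head, *tail;     /* head = most recently used */
};

/* Conceptually called on *every* access: unlink the page and put it at the
 * front of the list. This is what makes the naive variant far too costly. */
static void lru_touch(struct lru_list *l, struct lru_page *p)
{
    if (l->head == p)
        return;                       /* already most recently used */

    /* unlink from the current position */
    if (p->prev)
        p->prev->next = p->next;
    if (p->next)
        p->next->prev = p->prev;
    if (l->tail == p)
        l->tail = p->prev;

    /* insert at the head */
    p->prev = NULL;
    p->next = l->head;
    if (l->head)
        l->head->prev = p;
    l->head = p;
    if (!l->tail)
        l->tail = p;
}

/* When memory runs short, the victim is simply the tail of the list. */
static struct lru_page *lru_evict(struct lru_list *l)
{
    struct lru_page *victim = l->tail;

    if (victim) {
        l->tail = victim->prev;
        if (l->tail)
            l->tail->next = NULL;
        else
            l->head = NULL;
    }
    return victim;
}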

Special support by the processor makes implementation of the LRU algorithm significantly less costly.
Unfortunately, this support is available on only a few architectures and cannot be used by Linux; after all,
memory management should not be tailored to a specific processor type. In this scheme, a counter is
incremented by 1 in each CPU cycle. After each page access, a further counter field belonging to the page
is set to the value of the system counter. The processor itself must perform this action to ensure sufficient
speed. If a page fault occurs because a required page is no longer available, the operating system need
only compare the counters of all pages to ascertain which page was accessed the longest time ago. This
technique still necessitates searching through the list of all memory pages every time a page fault occurs,
but it does not require lengthy list operations after each memory access.
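
The counter-based approach can be sketched along the following lines; this is again plain C for illustration only, the names page_desc and system_ticks are hypothetical, and on real hardware the per-page counter would be stamped by the processor itself rather than by software:

#include <stddef.h>

/* Sketch of the counter-assisted scheme: a free-running system counter is
 * incremented on every CPU cycle, and each page records the counter value
 * of its last access. */
struct page_desc {
    unsigned long pfn;           /* page frame number */
    unsigned long last_access;   /* system counter value at the last access */
};

static unsigned long system_ticks;   /* incremented by 1 in each CPU cycle */

/* Performed by the processor itself on each access to keep it fast enough. */
static inline void page_accessed(struct page_desc *p)
{
    p->last_access = system_ticks;
}

/* At page-fault time the operating system only has to scan all page
 * descriptors once and pick the one whose counter value is smallest,
 * that is, the page accessed the longest time ago. */
static struct page_desc *find_eviction_candidate(struct page_desc *pages,
                                                 size_t nr_pages)
{
    struct page_desc *oldest = NULL;
    size_t i;

    for (i = 0; i < nr_pages; i++) {
        if (!oldest || pages[i].last_access < oldest->last_access)
            oldest = &pages[i];
    }
    return oldest;
}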

18.2 Page Reclaim and Swapping in the Linux Kernel

This section summarizes the design decisions of the Linux page reclaim subsystem before considering
the technical aspects of implementation and examining how requirements are met.

Swapping out pages and all related actions do not appear to be very complicated when the situation
is viewed from the high ground without taking development details into consideration. Unfortunately,
the very opposite is the case. Hardly any part of the kernel entails as many technical difficulties as the
virtual memory subsystem, of which the implementation of swapping is a part. Not only a host of minor
hardware details but above all numerous interconnections in the kernel must be taken into account
if implementation is to succeed. Speed also plays a crucial role since system performance ultimately
depends on memory management performance. Not without reason is memory management one of the
hottest kernel development topics, which has given rise to countless discussions, flame wars, and rival
implementations.

When discussing the design of the swap subsystem, certain aspects come to mind as characterized by the
following questions:

❑ How should swap areas on block storage media be organized to ensure not only that pages
swapped out can be uniquely identified, but also that memory space is used as effectively as
possible to permit read and write operations at maximum speed?
❑ Which methods are available to enable the kernel to check when and how many pages
need to be swapped out in order to achieve the best possible balance between the provision of