Linux Kernel Architecture


Chapter 18: Page Reclaim and Swapping


On NUMA machines, which do not share memory uniformly over all processors (see
Chapter 3), there is a separate kswapd daemon for each NUMA node. Each daemon is
responsible for all memory zones in its node.
On non-NUMA systems, there is just one instance of kswapd, which is responsible
for all main memory zones. Recall that, for instance, IA-32 can
have up to three zones: ISA-DMA, normal memory, and high memory.
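
The division of labor described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: toy_node, toy_zone, the wanted_free threshold, and toy_kswapd_pass are hypothetical names invented for illustration and are not kernel data structures or functions. The sketch only shows that one daemon pass belongs to one node and visits every populated zone of that node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone types, loosely modeled on the three IA-32 zones. */
enum toy_zone_type { TOY_ZONE_DMA, TOY_ZONE_NORMAL, TOY_ZONE_HIGHMEM, TOY_MAX_ZONES };

/* Toy zone: wanted_free is an assumed threshold, not a kernel field. */
struct toy_zone {
    int populated;              /* nonzero if the zone contains memory */
    unsigned long free_pages;   /* pages currently free */
    unsigned long wanted_free;  /* free pages the toy daemon aims for */
};

/* Toy node: a node owns all of its memory zones. */
struct toy_node {
    struct toy_zone zones[TOY_MAX_ZONES];
};

/* One pass of the daemon of a single node: visit every populated zone
 * of that node and add up how many pages are missing. */
static unsigned long toy_kswapd_pass(const struct toy_node *node)
{
    unsigned long shortfall = 0;

    for (int i = 0; i < TOY_MAX_ZONES; i++) {
        const struct toy_zone *z = &node->zones[i];

        if (!z->populated)
            continue;
        if (z->free_pages < z->wanted_free)
            shortfall += z->wanted_free - z->free_pages;
    }
    return shortfall;
}

/* One daemon per node: a non-NUMA system is the case nr_nodes == 1. */
static void toy_run_daemons(const struct toy_node *nodes, size_t nr_nodes,
                            unsigned long *shortfall)
{
    assert(nodes != NULL && shortfall != NULL);
    for (size_t n = 0; n < nr_nodes; n++)
        shortfall[n] = toy_kswapd_pass(&nodes[n]);
}
```

Calling toy_run_daemons with nr_nodes set to 1 corresponds to the non-NUMA case, in which a single pass covers all zones of the machine.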

The paths of the two versions merge very quickly in the shrink_zone function. The remaining code of
the page reclaim subsystem is identical for both options.

Once the number of pages to be swapped out in order to provide the system with fresh memory has been
determined (using algorithms designed to deal with acute memory shortage in try_to_free_pages
and to regularly check memory utilization in the kswapd daemon), the kernel must still decide which
specific pages are to be swapped out. It must then pass these pages from the policy part of the code to the
kernel routines responsible for writing the pages back to their backing store and adapting the page
table entries.

Recall from Chapter 3.2.1 that the kernel tries to categorize pages into two LRU lists: one for active pages,
and one for inactive pages. These lists are managed per memory zone:

<mmzone.h>
struct zone {
...
    struct list_head active_list;    /* LRU list of active pages */
    struct list_head inactive_list;  /* LRU list of inactive pages */
...
};
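
To make the two-list arrangement concrete, here is a minimal userspace sketch. It assumes a simplified circular doubly linked list in the spirit of the kernel's struct list_head. All toy_ names are hypothetical and exist only for this illustration, and the list helpers are simplified stand-ins, not the kernel's own list implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a circular doubly linked list node. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

/* An empty list (or an unlinked entry) points to itself. */
static void toy_list_init(struct toy_list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert entry directly after head. */
static void toy_list_add(struct toy_list_head *entry, struct toy_list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and leave it self-pointing so it can be re-added safely. */
static void toy_list_del(struct toy_list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    toy_list_init(entry);
}

static size_t toy_list_len(const struct toy_list_head *head)
{
    size_t n = 0;

    for (const struct toy_list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Hypothetical page: embeds the list node that links it into one LRU list. */
struct toy_page {
    struct toy_list_head lru;
    int id;
};

/* Hypothetical zone: one active and one inactive list, as in the excerpt. */
struct toy_zone {
    struct toy_list_head active_list;
    struct toy_list_head inactive_list;
};

static void toy_zone_init(struct toy_zone *z)
{
    toy_list_init(&z->active_list);
    toy_list_init(&z->inactive_list);
}

static void toy_page_init(struct toy_page *p, int id)
{
    toy_list_init(&p->lru);
    p->id = id;
}

/* A page is on at most one list: unlink first, then add to the target. */
static void toy_activate(struct toy_zone *z, struct toy_page *p)
{
    toy_list_del(&p->lru);
    toy_list_add(&p->lru, &z->active_list);
}

static void toy_deactivate(struct toy_zone *z, struct toy_page *p)
{
    toy_list_del(&p->lru);
    toy_list_add(&p->lru, &z->inactive_list);
}
```

Because the list node is embedded in the page itself, moving a page between the two lists needs no allocation; only a few pointers change.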

It is an essential job of the kernel to decide to which category a given page belongs, and a good proportion
of this chapter is devoted to answering this question.

The decision about how many pages and which pages are to be reclaimed is made in the
following steps:


  1. shrink_zone is the entry point for removing rarely used pages from memory and is called
    from within the periodic kswapd mechanism. This function is responsible for two things.
    First, it attempts to maintain a balance between the number of active and inactive pages in a
    zone by moving pages between the active and inactive lists (using shrink_active_list).
    Second, it controls the release of a selectable number of pages by means of shrink_cache.
    shrink_zone acts as a go-between between the logic that defines how many pages of a zone
    are to be swapped out and the decision as to which pages these are.

  2. shrink_active_list is a comprehensive helper function used by the kernel to transfer
    pages between the active and inactive page lists. The function is informed of the number
    of pages to be transferred between the lists and then attempts to select the least used
    active pages.
    shrink_active_list is therefore essentially responsible for deciding which pages are
    subsequently swapped out and which are not. In other words, this is where the policy part of
    page selection is implemented.
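
The interplay of the two steps above can be sketched in a small userspace model. This is a toy under loudly stated assumptions, not the kernel's algorithm: the lists are plain arrays ordered from oldest to newest, the "least used" test is reduced to a single hypothetical referenced flag, and the balancing rule (move half the difference when the inactive list is shorter) is invented purely for illustration. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PAGES 16

struct toy_page {
    int id;
    int referenced;   /* assumed stand-in for "recently used" */
};

/* Array-backed list: index 0 is the oldest entry, count - 1 the newest. */
struct toy_list {
    struct toy_page *pages[TOY_MAX_PAGES];
    size_t count;
};

struct toy_zone {
    struct toy_list active;
    struct toy_list inactive;
};

static void toy_push(struct toy_list *l, struct toy_page *p)
{
    assert(l->count < TOY_MAX_PAGES);
    l->pages[l->count++] = p;
}

/* Remove and return the oldest page, or NULL if the list is empty. */
static struct toy_page *toy_pop_oldest(struct toy_list *l)
{
    struct toy_page *p;

    if (l->count == 0)
        return NULL;
    p = l->pages[0];
    for (size_t i = 1; i < l->count; i++)
        l->pages[i - 1] = l->pages[i];
    l->count--;
    return p;
}

/* Policy step: move up to nr unreferenced pages from active to inactive.
 * A referenced page loses its flag and returns to the active list. */
static size_t toy_shrink_active_list(struct toy_zone *z, size_t nr)
{
    size_t moved = 0;
    size_t total = z->active.count;   /* look at each page at most once */

    for (size_t scanned = 0; scanned < total && moved < nr; scanned++) {
        struct toy_page *p = toy_pop_oldest(&z->active);

        if (p->referenced) {
            p->referenced = 0;
            toy_push(&z->active, p);
        } else {
            toy_push(&z->inactive, p);
            moved++;
        }
    }
    return moved;
}

/* Release step: drop up to nr of the oldest inactive pages. */
static size_t toy_shrink_cache(struct toy_zone *z, size_t nr)
{
    size_t freed = 0;

    while (freed < nr && toy_pop_oldest(&z->inactive) != NULL)
        freed++;
    return freed;
}

/* Go-between: rebalance the lists first, then release the requested number. */
static size_t toy_shrink_zone(struct toy_zone *z, size_t nr_to_reclaim)
{
    if (z->inactive.count < z->active.count)
        toy_shrink_active_list(z, (z->active.count - z->inactive.count) / 2);
    return toy_shrink_cache(z, nr_to_reclaim);
}
```

In this sketch the caller only states how many pages it wants back; which pages those are is settled entirely inside toy_shrink_active_list, which mirrors the split between "how many" and "which" described above.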
