Linux Kernel Architecture


Chapter 18: Page Reclaim and Swapping


zone is the struct zone instance used to define the characteristic data of the memory zone. The layout
and meaning of the structure are discussed in Chapter 3. Three auxiliary functions are employed to find
a suitable zone:

❑ zone_is_all_unreclaimable checks for the flag ZONE_ALL_UNRECLAIMABLE. This is set if the
zone is full of pinned pages because, for instance, all have been locked with the system call
mlock. In this case, the zone need not be considered for page reclaim. The flag is automatically
removed when at least one page in the zone is returned to the buddy system layer.
❑ populated_zone checks if any pages are present in the zone at all.
❑ zone_watermark_ok checks if memory can still be taken from a zone. See Section 3.5.4, where I
have discussed this function.

zone->pages_high is the targeted value for the ideal number of free pages (low and minimum values
are defined by pages_low and pages_min).

As soon as a zone with an unacceptable status is found, the kernel branches to the scan label and starts
scanning. However, it may well be that all zones are in order, in which case the kernel need do nothing
and immediately jumps to the end of balance_pgdat.
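
The zone selection just described can be approximated by a small user-space sketch. It is a model under stated assumptions, not the kernel code: struct model_zone, its fields, and the model_* helpers are hypothetical stand-ins for struct zone and the three auxiliary functions, the watermark test is reduced to a plain comparison against pages_high, and the search is assumed to run from the highest zone downward.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone; only the fields this sketch needs. */
struct model_zone {
    unsigned long present_pages;  /* 0 means the zone holds no pages at all */
    unsigned long free_pages;     /* pages currently free in the zone */
    unsigned long pages_high;     /* targeted number of free pages */
    bool all_unreclaimable;       /* models the ZONE_ALL_UNRECLAIMABLE flag */
};

/* Models populated_zone: are any pages present in the zone? */
bool model_populated_zone(const struct model_zone *z)
{
    return z->present_pages != 0;
}

/*
 * Greatly simplified model of zone_watermark_ok: the real function also
 * takes the allocation order and reserved memory into account.
 */
bool model_watermark_ok(const struct model_zone *z)
{
    return z->free_pages > z->pages_high;
}

/*
 * Returns the index of the highest zone with an unacceptable status,
 * or -1 if every zone is in order and nothing needs to be scanned.
 * Assumption: the search runs from the last zone down to zone 0.
 */
int find_end_zone(const struct model_zone *zones, int nr_zones)
{
    assert(zones != NULL || nr_zones == 0);

    for (int i = nr_zones - 1; i >= 0; i--) {
        const struct model_zone *z = &zones[i];

        if (!model_populated_zone(z))
            continue;             /* nothing to reclaim here */
        if (z->all_unreclaimable)
            continue;             /* all pages pinned, skip the zone */
        if (!model_watermark_ok(z))
            return i;             /* corresponds to branching to scan */
    }
    return -1;                    /* corresponds to leaving balance_pgdat */
}
```

A return value of -1 corresponds to the case in which the kernel has nothing to do; any other value plays the role of end_zone in the loops that follow.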

All LRU pages in the zones to be scanned are determined before scanning starts:

mm/vmscan.c
for (i = 0; i <= end_zone; i++) {
        struct zone *zone = pgdat->node_zones + i;
        lru_pages += zone_page_state(zone, NR_ACTIVE)
                        + zone_page_state(zone, NR_INACTIVE);
}
...

As the code flow diagram shows, the kernel iterates over all zones. The direction goes from highmem to
DMA. Two functions must be invoked for each zone (zones that are unpopulated or in which all pages
are pinned are skipped):

❑ shrink_zone starts the mechanism for selecting and reclaiming RAM pages that was discussed
in Section 18.6.4.
❑ shrink_slab is invoked by the kernel to shrink caches for various data structures allocated
with the help of the slab system. Section 18.10 discusses this function. Although the page cache
accounts for the lion’s share of memory utilization, shrinking other caches, such as the dentry
or inode cache, can also achieve tangible effects.

If the kernel iterates over the zones and finds that they are all in an acceptable status, the outer loop
that iterates over all priorities can be terminated. Otherwise, the congestion_wait function discussed in
Chapter 17 is called if pages have been scanned and the scan priority is below DEF_PRIORITY - 2. This
prevents congestion of the block layer as a result of too many requests.
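
The termination and throttling rule of the outer loop can be modeled in a few lines of user-space C. This is only a sketch: should_wait_for_congestion and count_congestion_waits are hypothetical helpers, the work of one pass is reduced to a fixed number of scanned pages, and the loop is assumed to run from DEF_PRIORITY (12 in the kernel sources) down to 0 with the scanned pages accumulating across passes.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12   /* default scan priority; lower values scan harder */

/* The condition from the text: pages were scanned and priority is below DEF_PRIORITY - 2. */
bool should_wait_for_congestion(unsigned long total_scanned, int priority)
{
    return total_scanned > 0 && priority < DEF_PRIORITY - 2;
}

/*
 * Hypothetical model of the outer priority loop. The zones are assumed to
 * reach an acceptable status after passes_needed passes (a negative value
 * means they never do), and every pass scans scanned_per_pass pages.
 * Returns how often the loop would have called congestion_wait.
 */
int count_congestion_waits(int passes_needed, unsigned long scanned_per_pass)
{
    unsigned long total_scanned = 0;
    int waits = 0;
    int passes = 0;

    for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
        total_scanned += scanned_per_pass;
        passes++;

        /* All zones acceptable: the priority loop can be terminated. */
        if (passes_needed >= 0 && passes >= passes_needed)
            break;

        /* Otherwise throttle, but only on the more aggressive passes. */
        if (should_wait_for_congestion(total_scanned, priority))
            waits++;
    }
    return waits;
}
```

In this model the first three passes (priorities 12, 11, and 10) never wait, so a node that recovers quickly is balanced without throttling; only when the shortage persists do the later passes back off.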

18.9.2 Swap-out in the Event of Acute Memory Shortage


The try_to_free_pages routine is invoked for rapid, unscheduled memory reclaim. Figure 18-22 shows
the code flow diagram for the function.