Chapter 3: Memory Management
the currently executing CPU. Work is then delegated to __alloc_pages, to which an appropriate set
of parameters is passed. Notice that gfp_zone is used to select the zone from which the allocation is
supposed to be fulfilled. This is an important detail that can easily be missed!
The kernel sources refer to __alloc_pages as the "heart of the buddy system" because it deals
with the essential aspects of allocation. Since a heart is an important thing to have, I shall introduce the
function in detail below.
Selecting Pages
Let us therefore turn our attention to how page selection works.
Helper Functions
First, we need to define some flags used by the functions to control behavior when various watermarks
are reached.
mm/page_alloc.c
#define ALLOC_NO_WATERMARKS 0x01 /* don’t check watermarks at all */
#define ALLOC_WMARK_MIN 0x02 /* use pages_min watermark */
#define ALLOC_WMARK_LOW 0x04 /* use pages_low watermark */
#define ALLOC_WMARK_HIGH 0x08 /* use pages_high watermark */
#define ALLOC_HARDER 0x10 /* try to alloc harder */
#define ALLOC_HIGH 0x20 /* __GFP_HIGH set */
#define ALLOC_CPUSET 0x40 /* check for correct cpuset */
The first flags indicate which watermark applies when the decision is made as to whether pages can be
taken or not. By default (that is, there is no absolute need for more memory because of pressure exerted
by other factors), pages are taken only when the zone still contains at least zone->pages_high pages.
This corresponds to the ALLOC_WMARK_HIGH flag. ALLOC_WMARK_MIN or ALLOC_WMARK_LOW must be set
accordingly in order to use the low (zone->pages_low) or minimum (zone->pages_min) setting instead.
ALLOC_HARDER instructs the buddy system to apply the allocation rules more generously when memory is
urgently needed; ALLOC_HIGH relaxes these rules even more and is used when __GFP_HIGH is set, that is,
for high-priority requests. Finally, ALLOC_CPUSET tells the kernel to note that memory must be taken only
from the areas associated with the CPUs that the current process is allowed to use. Of course, this option
only makes sense on NUMA systems.
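The interplay of these flags can be illustrated with a small user-space sketch. It is not kernel code: struct mock_zone, select_mark, and relaxed_min are hypothetical names introduced only for this illustration, and the order in which the watermark flags are tested (minimum first, then low, with pages_high as the fallback) is an assumption modeled on the description above. The halving and quartering steps mirror the opening lines of zone_watermark_ok.

```c
#define ALLOC_NO_WATERMARKS 0x01
#define ALLOC_WMARK_MIN     0x02
#define ALLOC_WMARK_LOW     0x04
#define ALLOC_WMARK_HIGH    0x08
#define ALLOC_HARDER        0x10
#define ALLOC_HIGH          0x20

/* Hypothetical stand-in for the watermark fields of struct zone. */
struct mock_zone {
	unsigned long pages_min;
	unsigned long pages_low;
	unsigned long pages_high;
};

/*
 * Pick the watermark named by the flags. pages_high is the fallback
 * when neither the minimum nor the low watermark is requested.
 */
static unsigned long select_mark(const struct mock_zone *z, int alloc_flags)
{
	if (alloc_flags & ALLOC_WMARK_MIN)
		return z->pages_min;
	if (alloc_flags & ALLOC_WMARK_LOW)
		return z->pages_low;
	return z->pages_high;
}

/*
 * Relax the threshold: ALLOC_HIGH removes half of it, and ALLOC_HARDER
 * removes a quarter of whatever remains at that point.
 */
static long relaxed_min(unsigned long mark, int alloc_flags)
{
	long min = (long)mark;

	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;
	return min;
}
```

For a mock zone with pages_min = 1000, select_mark with ALLOC_WMARK_MIN yields 1000. Passing that value to relaxed_min with both ALLOC_HIGH and ALLOC_HARDER set lowers the threshold first to 500 and then to 375, so the two flags together leave only three-eighths of the original reserve untouched.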
The flag settings are applied in the zone_watermark_ok function, which checks whether memory can still
be taken from a given zone depending on the allocation flags set.
mm/page_alloc.c
int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
		      int classzone_idx, int alloc_flags)
{
	/* free_pages may go negative - that’s OK */
	long min = mark;
	long free_pages = zone_page_state(z, NR_FREE_PAGES) - (1 << order) + 1;
	int o;

	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;