Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 3: Memory Management


                        goto nopage;
                }

                out_of_memory(zonelist, gfp_mask, order);
                goto restart;
        }

Without going into the details of the implementation, note that out_of_memory picks one task that the kernel
deems particularly guilty of hogging memory and kills it. This will hopefully free a good number of pages,
and the allocation is retried by jumping to the label restart. However, it is unlikely that killing a
process will immediately produce a contiguous range of more than 2^PAGE_ALLOC_COSTLY_ORDER pages
(where PAGE_ALLOC_COSTLY_ORDER is usually set to 3). If such a big allocation was requested, the kernel
therefore spares one innocent task’s life, does not perform out-of-memory killing, and admits failure by
jumping to nopage.
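
The decision described above can be summarized in a small userspace model. This is only a sketch and not kernel code: the enum oom_action and the function decide_oom_action are hypothetical names introduced for illustration, and PAGE_ALLOC_COSTLY_ORDER is given the value 3 quoted in the text.

```c
/* Userspace model of the out-of-memory decision; not kernel code. */
#define PAGE_ALLOC_COSTLY_ORDER 3   /* value quoted in the text */

/* Hypothetical names, introduced only for this illustration. */
enum oom_action {
    OOM_GIVE_UP,         /* models "goto nopage" */
    OOM_KILL_AND_RETRY   /* models out_of_memory() followed by "goto restart" */
};

static enum oom_action decide_oom_action(unsigned int order)
{
    /* Killing a task is unlikely to free more than
     * 2^PAGE_ALLOC_COSTLY_ORDER contiguous pages, so give up instead. */
    if (order > PAGE_ALLOC_COSTLY_ORDER)
        return OOM_GIVE_UP;

    return OOM_KILL_AND_RETRY;
}
```

Under these assumptions, a request of order 3 (8 pages) still triggers the OOM killer, whereas a request of order 4 fails without any task being killed.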

What happens if __GFP_NORETRY is set or the kernel is not allowed to use operations that might affect the
VFS layer? In this case, the size of the desired allocation comes into play:

mm/page_alloc.c
...
        do_retry = 0;
        if (!(gfp_mask & __GFP_NORETRY)) {
                if ((order <= PAGE_ALLOC_COSTLY_ORDER) ||
                                        (gfp_mask & __GFP_REPEAT))
                        do_retry = 1;
                if (gfp_mask & __GFP_NOFAIL)
                        do_retry = 1;
        }
        if (do_retry) {
                congestion_wait(WRITE, HZ/50);
                goto rebalance;
        }
nopage:
        if (!(gfp_mask & __GFP_NOWARN) && printk_ratelimit()) {
                printk(KERN_WARNING "%s: page allocation failure."
                        " order:%d, mode:0x%x\n",
                        p->comm, order, gfp_mask);
                dump_stack();
                show_mem();
        }
got_pg:
        return page;
}

The kernel goes into an endless loop if the allocation size is at most 2^PAGE_ALLOC_COSTLY_ORDER = 8
pages, or if the __GFP_REPEAT flag is set. Naturally, __GFP_NORETRY must not be set in either case,
since that flag states that the caller does not want the allocation to be retried. The kernel branches
back to the rebalance label that introduces the slow path and remains there until a suitable memory chunk
is finally found. With requests of this size, the kernel can assume that the endless loop won’t last all
that long. Before jumping back, the kernel invokes congestion_wait to wait for the block layer queues to
free up (see Chapter 6) so that it has a chance to swap pages out.
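
The figure of 8 pages follows directly from the allocation order, since an order-n request covers 2^n pages. The following sketch works through the arithmetic. The helper names are hypothetical, and the page size of 4 KiB is an assumption that holds on many but not all architectures.

```c
/* Arithmetic behind allocation orders; userspace sketch, not kernel code. */
#define PAGE_ALLOC_COSTLY_ORDER 3        /* value quoted in the text */
#define ASSUMED_PAGE_SIZE       4096UL   /* assumption: 4 KiB pages */

/* Number of pages covered by a request of the given order (2^order).
 * The order must be smaller than the bit width of unsigned long. */
static unsigned long pages_for_order(unsigned int order)
{
    return 1UL << order;
}

/* Size in bytes of such a request under the assumed page size. */
static unsigned long bytes_for_order(unsigned int order)
{
    return pages_for_order(order) * ASSUMED_PAGE_SIZE;
}
```

With these assumptions, pages_for_order(PAGE_ALLOC_COSTLY_ORDER) yields 8 pages, which corresponds to 32 KiB of contiguous memory.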

The kernel also goes into the above endless loop if the desired allocation order is greater than 3 but the
__GFP_NOFAIL flag is set (and __GFP_NORETRY is not); this flag does not allow the allocation to fail on
any account.
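
The retry rules from the code excerpt can be modeled as a single predicate that is easy to check against the cases discussed in the text. This is a userspace sketch under stated assumptions: should_retry is a hypothetical name, and the SK_GFP_* bit values are placeholders chosen for this example, not the real values of the kernel's __GFP_* flags.

```c
#include <stdbool.h>

#define PAGE_ALLOC_COSTLY_ORDER 3   /* value quoted in the text */

/* Placeholder bits for this sketch only; the kernel's real
 * __GFP_* flags are defined in its own headers and differ. */
#define SK_GFP_REPEAT   0x1u
#define SK_GFP_NOFAIL   0x2u
#define SK_GFP_NORETRY  0x4u

/* Mirrors the do_retry computation shown in the excerpt. */
static bool should_retry(unsigned int gfp_mask, unsigned int order)
{
    /* The caller explicitly opted out of retries. */
    if (gfp_mask & SK_GFP_NORETRY)
        return false;

    /* Small requests and requests marked "repeat" loop until they succeed. */
    if (order <= PAGE_ALLOC_COSTLY_ORDER || (gfp_mask & SK_GFP_REPEAT))
        return true;

    /* Large requests loop only if failure is not permitted. */
    if (gfp_mask & SK_GFP_NOFAIL)
        return true;

    return false;
}
```

In this model, should_retry(0, 4) is false, while should_retry(SK_GFP_NOFAIL, 4) is true. Note that, following the structure of the excerpt, the no-retry flag takes precedence even when the no-fail flag is set as well.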