Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 3: Memory Management


Recall that migrate lists are the basis for the page mobility approach that is used to keep memory
fragmentation as low as possible. Low memory fragmentation means that larger contiguous page blocks are
available even after the system has been running for a longer time. As discussed in Section 3.5.2, the
notion of how big a larger block is, is given by the global variable pageblock_order, which defines the
order for a large block.


If the kernel has to break up a block of free pages from another migrate list, it must decide what
to do with the remaining pages. If the rest itself qualifies as a large block, it makes sense to transfer the
whole block to the migrate list of the allocation type to mitigate fragmentation.


The kernel is more aggressive about moving free pages from one migrate list to another if an allocation
is performed for reclaimable memory. Allocations of this type often appear in bursts, for instance, when
updatedb is running, and could therefore scatter many small reclaimable portions across all migrate lists.
To avoid this situation, remaining pages for MIGRATE_RECLAIMABLE allocations are always transferred to
the reclaimable migrate list.
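
The decision described in the last two paragraphs can be condensed into a single predicate. The following is only a minimal sketch under stated assumptions: the enum and the function name should_take_ownership are hypothetical and do not appear in the kernel sources, and the threshold simply mirrors the comparison of the allocation order against half of pageblock_order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's migrate types. */
enum migrate_type_sketch {
    SKETCH_UNMOVABLE,
    SKETCH_RECLAIMABLE,
    SKETCH_MOVABLE
};

/*
 * Returns true if the remaining free pages of the block should be
 * transferred to the migrate list of the allocation type.
 */
bool should_take_ownership(unsigned int current_order,
                           unsigned int pageblock_order,
                           enum migrate_type_sketch start_type)
{
    assert(pageblock_order >= 1);

    /* A block whose order is at least half of pageblock_order counts as large. */
    if (current_order >= (pageblock_order >> 1))
        return true;

    /* Reclaimable allocations always take over the remaining pages. */
    return start_type == SKETCH_RECLAIMABLE;
}
```

With an example pageblock_order of 10 (an assumed value, chosen only for illustration), an order-5 block is treated as large, whereas an order-4 block is transferred only if the allocation is reclaimable.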


The kernel implements the described policy as follows:


mm/page_alloc.c
/*
 * If breaking a large block of pages, move all free
 * pages to the preferred allocation list. If falling
 * back for a reclaimable kernel allocation, be more
 * aggressive about taking ownership of free pages
 */
if (unlikely(current_order >= (pageblock_order >> 1)) ||
                start_migratetype == MIGRATE_RECLAIMABLE) {
        unsigned long pages;
        pages = move_freepages_block(zone, page,
                                        start_migratetype);

        /* Claim the whole block if over half of it is free */
        if (pages >= (1 << (pageblock_order-1)))
                set_pageblock_migratetype(page,
                                        start_migratetype);

        migratetype = start_migratetype;
}
...

move_freepages_block tries to move the complete page block with 2^pageblock_order pages in which the
current allocation is contained to the new migrate list. However, only free pages (i.e., those with the
PG_buddy bit set) are moved. Additionally, the function obeys zone boundaries, so the total number of
pages can be smaller than a complete large page block. If, however, at least one-half of a large page
block is free, then set_pageblock_migratetype claims the complete block (recall that the function always
works on groups with pageblock_nr_pages pages).
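
The interplay of moving free pages and claiming the block can be illustrated with a small user-space model. This is a sketch under simplifying assumptions, not kernel code: all names prefixed with sketch_ or SKETCH_ are hypothetical, every page is tracked individually instead of as buddy blocks of varying order, zone boundaries are ignored, and the block size is an arbitrary example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGEBLOCK_ORDER    4u   /* example value only */
#define SKETCH_PAGEBLOCK_NR_PAGES (1u << SKETCH_PAGEBLOCK_ORDER)

/* Hypothetical per-page state: free flag (PG_buddy analogue) and list id. */
struct sketch_page {
    bool free;
    int  migratetype;
};

/* Hypothetical page block: its pages plus the migrate type of the block. */
struct sketch_block {
    struct sketch_page pages[SKETCH_PAGEBLOCK_NR_PAGES];
    int block_migratetype;
};

/* Move every free page to new_type; allocated pages stay where they are. */
unsigned int sketch_move_free_pages(struct sketch_block *blk, int new_type)
{
    unsigned int moved = 0;

    for (size_t i = 0; i < SKETCH_PAGEBLOCK_NR_PAGES; i++) {
        if (!blk->pages[i].free)
            continue;
        blk->pages[i].migratetype = new_type;
        moved++;
    }
    return moved;
}

/* Move the free pages, then claim the block if at least half of it is free. */
bool sketch_steal_block(struct sketch_block *blk, int new_type)
{
    unsigned int moved = sketch_move_free_pages(blk, new_type);

    if (moved >= (1u << (SKETCH_PAGEBLOCK_ORDER - 1))) {
        blk->block_migratetype = new_type;
        return true;
    }
    return false;
}
```

In this model, a block of 16 pages with 8 free pages changes its migrate type, while a block with only 7 free pages keeps its old type even though its free pages have been moved to the new list.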


Finally, the kernel can remove the page block from the list and use expand to place the unused parts of a
larger block back on the buddy system.
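
How the unused parts of a larger block return to the buddy system can be sketched as repeated halving. The following model is an illustration under assumptions, not the kernel's expand: the structure sketch_free_area, the function sketch_expand, and the value of SKETCH_MAX_ORDER are hypothetical, and the free lists are reduced to one recorded start page per order.

```c
#include <assert.h>

#define SKETCH_MAX_ORDER 11   /* example value only */

/* Hypothetical free-list summary: start page of the block returned per order. */
struct sketch_free_area {
    long returned_start[SKETCH_MAX_ORDER];   /* -1 means nothing returned */
};

/*
 * A block of order high starting at page number start was taken from the
 * free lists, but only a block of order low is needed. Split the block in
 * halves until the requested size remains, and hand each upper half back.
 */
void sketch_expand(struct sketch_free_area *fa, unsigned long start,
                   unsigned int low, unsigned int high)
{
    unsigned long size = 1UL << high;

    assert(low <= high && high < SKETCH_MAX_ORDER);

    while (high > low) {
        high--;
        size >>= 1;
        /* Upper half [start + size, start + 2 * size) is free again. */
        fa->returned_start[high] = (long)(start + size);
    }
}
```

Requesting a single page (order 0) from an order-3 block that starts at page 0 returns the blocks starting at pages 4, 2, and 1 with orders 2, 1, and 0, respectively; page 0 itself remains with the caller.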


mm/page_alloc.c
/* Remove the page from the freelists */
list_del(&page->lru);