Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 3: Memory Management


❑ wait_table, wait_table_bits, and wait_table_hash_nr_entries implement a wait queue
for processes waiting for a page to become available. While the details of this mechanism are
shown in Chapter 14, the intuitive notion holds pretty well: Processes queue up in a line to wait
for some condition. When this condition becomes true, they are notified by the kernel and can
resume their work.
❑ The association between a zone and the parent node is established by zone_pgdat, which points
to the corresponding instance of pglist_data.
❑ zone_start_pfn is the index of the first page frame of the zone.
❑ The remaining three fields are rarely used, so they’ve been placed at the end of the data structure.
name is a string that holds a conventional name for the zone. Three options are available at
present: Normal, DMA, and HighMem.
spanned_pages specifies the total number of pages in the zone. However, not all need be usable
since there may be small holes in the zone, as already mentioned. A further counter (present_pages)
therefore indicates the number of pages that are actually usable. Generally, the value of
this counter is the same as that for spanned_pages.
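
The relationship between zone_start_pfn, spanned_pages, and present_pages can be illustrated with a small sketch. The structure zone_sketch and both helper functions are hypothetical and exist only for this illustration; the real struct zone contains many more members, and the numbers used with it are made up.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the zone fields discussed above.
 * This is NOT the kernel's struct zone. */
struct zone_sketch {
    const char *name;              /* conventional name, e.g. "Normal" */
    unsigned long zone_start_pfn;  /* index of the first page frame */
    unsigned long spanned_pages;   /* all pages covered, holes included */
    unsigned long present_pages;   /* pages that are actually usable */
};

/* Number of page frames inside the zone that fall into holes. */
static unsigned long zone_hole_pages(const struct zone_sketch *z)
{
    /* Usable pages can never exceed the pages spanned by the zone. */
    assert(z->present_pages <= z->spanned_pages);
    return z->spanned_pages - z->present_pages;
}

/* First page frame number past the end of the zone. */
static unsigned long zone_end_pfn(const struct zone_sketch *z)
{
    return z->zone_start_pfn + z->spanned_pages;
}
```

For a zone without holes, zone_hole_pages returns 0, which corresponds to the common case in which both counters have the same value.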

Calculation of Zone Watermarks


Before calculating the various watermarks, the kernel first determines the minimum memory space
that must remain free for critical allocations. This value scales nonlinearly with the size of the available
RAM. It is stored in the global variable min_free_kbytes. Figure 3-4 provides an overview of the scaling
behavior; the inset, which in contrast to the main graph does not use a logarithmic scale for the main
memory size, magnifies the region up to 4 GiB. Table 3-1 collects some example values for systems with
the modest amounts of memory that are common in desktop environments. An invariant is that the value
is never less than 128 KiB and never more than 64 MiB.
Note, however, that the upper bound only comes into play on machines equipped with a really satisfactory
amount of main memory.^3 The file /proc/sys/vm/min_free_kbytes allows reading and adapting the
value from userland.
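
The nonlinear scaling and the two bounds can be sketched in a few lines of C. The text above does not state the formula, so the sketch assumes the square-root rule found in mm/page_alloc.c of kernels of this generation, namely the integer square root of lowmem_kbytes * 16. The names isqrt_ull and min_free_kbytes_for are hypothetical and used only for this illustration.

```c
#include <assert.h>

#define MIN_FREE_KBYTES_FLOOR 128ULL    /* lower bound: 128 KiB */
#define MIN_FREE_KBYTES_CEIL  65536ULL  /* upper bound: 64 MiB */

/* Integer square root by binary search: largest r with r * r <= x.
 * The root of a 64-bit value always fits into 32 bits. */
static unsigned long long isqrt_ull(unsigned long long x)
{
    unsigned long long lo = 0, hi = 0xFFFFFFFFULL;

    while (lo < hi) {
        unsigned long long mid = lo + (hi - lo + 1) / 2;

        if (mid * mid <= x)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* Assumed rule: sqrt(lowmem_kbytes * 16), clamped to [128 KiB, 64 MiB]. */
static unsigned long long min_free_kbytes_for(unsigned long long lowmem_kbytes)
{
    unsigned long long v = isqrt_ull(lowmem_kbytes * 16);

    if (v < MIN_FREE_KBYTES_FLOOR)
        v = MIN_FREE_KBYTES_FLOOR;
    if (v > MIN_FREE_KBYTES_CEIL)
        v = MIN_FREE_KBYTES_CEIL;

    /* The invariant from the text holds for every input. */
    assert(v >= MIN_FREE_KBYTES_FLOOR && v <= MIN_FREE_KBYTES_CEIL);
    return v;
}
```

Under this assumption, the upper bound is reached only when low memory amounts to 256 GiB or more, which agrees with the remark that the cutoff matters only on very large machines.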

Filling the watermarks in the data structure is handled by init_per_zone_pages_min, which is invoked
during kernel boot and need not be started explicitly.^4

setup_per_zone_pages_min sets the pages_min, pages_low, and pages_high elements of struct zone.
After the total number of pages outside the highmem zone has been calculated (and stored in
lowmem_pages), the kernel iterates over all zones in the system and performs the following calculation:

mm/page_alloc.c
void setup_per_zone_pages_min(void)
{
        /* Convert min_free_kbytes from KiB into a number of pages. */
        unsigned long pages_min = min_free_kbytes >> (PAGE_SHIFT - 10);
        unsigned long lowmem_pages = 0;
        struct zone *zone;
        unsigned long flags;
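
The two preparatory steps named above, converting min_free_kbytes into pages and summing the pages outside the highmem zone, can be tried out in userspace with the following minimal sketch. It assumes 4 KiB pages (a page shift of 12) and assumes that the usable page count of each zone is what gets summed. The structure wm_zone and both function names are hypothetical; in the kernel, the iteration runs over the real struct zone instances.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical, reduced view of a zone for this illustration only. */
struct wm_zone {
    unsigned long present_pages;  /* usable pages in the zone */
    int is_highmem;               /* nonzero for a highmem zone */
};

/* A page holds 2^(PAGE_SHIFT - 10) KiB, so a right shift converts
 * a size in KiB into a number of pages. */
static unsigned long kbytes_to_pages(unsigned long kbytes)
{
    return kbytes >> (SKETCH_PAGE_SHIFT - 10);
}

/* Sum the pages of all zones that are not highmem zones. */
static unsigned long count_lowmem_pages(const struct wm_zone *zones, size_t n)
{
    unsigned long lowmem_pages = 0;
    size_t i;

    assert(zones != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!zones[i].is_highmem)
            lowmem_pages += zones[i].present_pages;
    }
    return lowmem_pages;
}
```

With 4 KiB pages, a min_free_kbytes value of 4096 corresponds to 1024 pages.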

(^3) In practice, it is unlikely that such an amount of memory is installed on a machine with a single NUMA node, so the
point where the cutoff is required is hard to reach.
(^4) The functions are not only called from here, but are also invoked each time one of the control parameters is modified via the proc
filesystem.
