Chapter 3: Memory Management
[Figure: code flow diagram for start_kernel, which invokes setup_arch, setup_per_cpu_areas,
build_all_zonelists, mem_init, and setup_per_cpu_pageset.]
Memory Management in the Kernel
Initialization of Memory Management
The sections below take a closer look at the functions invoked. First, here is a summary of their
tasks:
❑ setup_arch is an architecture-specific setup function responsible for, among other things,
initializing the boot allocator.
❑ On SMP systems, setup_per_cpu_areas initializes per-CPU variables that are defined statically
in the source code (using the per_cpu macro) and of which there is a separate copy for each
CPU in the system. Variables of this kind are stored in a separate section of the
kernel binary. The purpose of setup_per_cpu_areas is to create a copy of this data for each
system CPU.
This function is a null operation on non-SMP systems.
Initializing the Zone and Node Data Structures
❑ mem_init is another architecture-specific function. It disables the bootmem allocator and performs
the transition to the actual memory management functions, as discussed shortly.
❑ kmem_cache_init initializes the in-kernel allocator for small memory regions.
❑ setup_per_cpu_pageset allocates memory for the first array element of the pageset arrays from
struct zone mentioned above. The first array element is the one that belongs to the
first system processor. All memory zones of the system are taken into account.
The function is also responsible for setting the limits for the hot-n-cold allocator discussed at
length in Section 3.5.3.
Notice that the pageset array members of other CPUs on SMP systems are initialized when
those CPUs are activated.
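The per-CPU copy step described in the list above can be illustrated with a small user-space sketch. It is not kernel code: the names percpu_template, percpu_area, demo_setup_per_cpu_areas, and demo_per_cpu, as well as the fixed CPU count, are assumptions made for illustration, and malloc stands in for the boot-time allocation the kernel would perform.

```c
#include <stdlib.h>
#include <string.h>

#define NR_CPUS 4   /* hypothetical CPU count chosen for this sketch */

/* Stand-in for the statically initialized per-CPU section of the kernel binary. */
struct percpu_template {
    int  runqueue_len;
    long irq_count;
};

static const struct percpu_template percpu_init = {
    .runqueue_len = 0,
    .irq_count    = 7,
};

/* One private area per CPU; NULL until the setup function has run. */
static struct percpu_template *percpu_area[NR_CPUS];

/*
 * Create one copy of the template data for every CPU.
 * Returns 0 on success and -1 if an allocation fails.
 */
int demo_setup_per_cpu_areas(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        percpu_area[cpu] = malloc(sizeof(percpu_init));
        if (!percpu_area[cpu])
            return -1;
        memcpy(percpu_area[cpu], &percpu_init, sizeof(percpu_init));
    }
    return 0;
}

/* Simplified analogue of a per-CPU accessor: select the copy that belongs to cpu. */
#define demo_per_cpu(field, cpu) (percpu_area[(cpu)]->field)
```

After demo_setup_per_cpu_areas() has run, writing demo_per_cpu(irq_count, 1) changes only the copy of CPU 1; the copy of CPU 0 keeps the value taken from the template.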
Node and Zone Initialization
build_all_zonelists builds the data structures required to manage nodes and their zones.
Interestingly, it can be implemented by the macros and abstraction mechanisms introduced above regardless of
whether it runs on a NUMA or UMA system. This works because the executed functions are available in
two flavors: one for NUMA systems and one for UMA systems.
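As a rough illustration of the two-flavor idea, the following sketch selects one of two helper implementations at compile time while the calling code stays identical. The identifiers DEMO_CONFIG_NUMA, DEMO_MAX_NODES, demo_nr_nodes, demo_node, and demo_build_all_zonelists are hypothetical and are not taken from the kernel sources.

```c
/*
 * Compile with -DDEMO_CONFIG_NUMA to select the NUMA flavor;
 * without it, the UMA flavor with a single node is used.
 */
#ifdef DEMO_CONFIG_NUMA
#define DEMO_MAX_NODES 4            /* hypothetical node count */
static int demo_nr_nodes(void) { return DEMO_MAX_NODES; }
#else
#define DEMO_MAX_NODES 1            /* UMA: exactly one pseudo-node */
static int demo_nr_nodes(void) { return 1; }
#endif

/* Minimal stand-in for per-node management data. */
struct demo_node {
    int zonelist_built;
};

static struct demo_node demo_nodes[DEMO_MAX_NODES];

/*
 * Generic code: the loop is the same for both configurations.
 * Only the helper it calls differs. Returns the number of nodes handled.
 */
int demo_build_all_zonelists(void)
{
    int built = 0;

    for (int nid = 0; nid < demo_nr_nodes(); nid++) {
        demo_nodes[nid].zonelist_built = 1;
        built++;
    }
    return built;
}
```

The generic function never tests the configuration itself, so it needs no conditional compilation of its own; the difference is confined to the small helper at the top.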
Since this little trick is often used by the kernel, I will briefly discuss it. Suppose that a certain task
must be performed differently depending on the compile-time configuration. One possibility would