Chapter 3: Memory Management
Depending on the compile-time configuration, some zones need not be considered.
64-bit systems, for instance, do not require a high memory zone, and the DMA32 zone
is only required on 64-bit systems that also support 32-bit peripheral devices that
can only access memory up to 4 GiB.
The kernel additionally defines a pseudo-zone ZONE_MOVABLE, which is required when efforts are made to
prevent fragmentation of the physical memory. We will look closer into this mechanism in Section 3.5.2.
MAX_NR_ZONES acts as an end marker if the kernel wants to iterate over all zones present in the system.
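The role of MAX_NR_ZONES as an end marker can be illustrated with a minimal sketch. The enumeration below is a simplified assumption (a configuration with neither a DMA32 nor a high memory zone), and struct demo_zone and demo_total_present are hypothetical names introduced for illustration only; they are not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified zone enumeration (assumed configuration: no DMA32 zone,
 * no high memory zone). The last entry is not a real zone; it only
 * marks the end and doubles as the number of zones. */
enum zone_type {
    ZONE_DMA,
    ZONE_NORMAL,
    ZONE_MOVABLE,
    MAX_NR_ZONES
};

/* Hypothetical, heavily reduced stand-in for a zone descriptor. */
struct demo_zone {
    unsigned long present_pages;
};

/* Sum the pages of all zones, using MAX_NR_ZONES as the loop bound. */
static unsigned long demo_total_present(const struct demo_zone zones[MAX_NR_ZONES])
{
    unsigned long total = 0;

    assert(zones != NULL);
    for (int z = 0; z < MAX_NR_ZONES; z++)
        total += zones[z].present_pages;
    return total;
}
```

Because the end marker is the last enumerator, adding or removing a zone at compile time changes the loop bound and the array size automatically.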
Each zone is associated with an array in which the physical memory pages belonging to the
zone (known as page frames in the kernel) are organized. An instance of struct page with the
required management data is allocated for each page frame.
The nodes are kept on a singly linked list so that the kernel can traverse them.
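Traversing such a singly linked list of nodes might look like the following sketch. struct demo_node and demo_total_pages are hypothetical, reduced stand-ins chosen for illustration; the kernel's own node type and iteration helpers are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced node descriptor. */
struct demo_node {
    int node_id;
    unsigned long present_pages;
    struct demo_node *next;   /* NULL terminates the list */
};

/* Walk the list from head to tail and sum the pages of every node. */
static unsigned long demo_total_pages(const struct demo_node *head)
{
    unsigned long total = 0;

    for (const struct demo_node *n = head; n != NULL; n = n->next) {
        assert(n->node_id >= 0);
        total += n->present_pages;
    }
    return total;
}
```

On a UMA system the same loop simply visits a single node.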
For performance reasons, the kernel always attempts to perform the memory allocations of a process on
the NUMA node associated with the CPU on which it is currently running. However, this is not always
possible; for example, the node may already be full. For such situations, each node provides a fallback
list (with the help of struct zonelist). The list contains other nodes (and associated zones) that can be
used as alternatives for memory allocation. The further back an entry is on the list, the less suitable it is.
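The idea of a fallback list can be sketched as follows. This is an illustration under simplifying assumptions, not the kernel's allocator: struct demo_node, struct demo_zonelist, DEMO_MAX_FALLBACK, and demo_alloc_page are hypothetical names, and the sketch ignores zones within a node. The list is ordered from most to least suitable, and the first entry with free pages wins.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FALLBACK 4   /* hypothetical upper bound on list length */

/* Hypothetical, reduced node descriptor. */
struct demo_node {
    int id;
    unsigned long free_pages;
};

/* NULL-terminated list of candidates, best choice first. */
struct demo_zonelist {
    struct demo_node *nodes[DEMO_MAX_FALLBACK + 1];
};

/* Take one page from the first node on the list that still has free
 * pages. Returns that node, or NULL if every candidate is full. */
static struct demo_node *demo_alloc_page(struct demo_zonelist *zl)
{
    assert(zl != NULL);
    for (size_t i = 0; i < DEMO_MAX_FALLBACK + 1 && zl->nodes[i] != NULL; i++) {
        if (zl->nodes[i]->free_pages > 0) {
            zl->nodes[i]->free_pages--;
            return zl->nodes[i];
        }
    }
    return NULL;
}
```

If the local node sits at the front of the list and is full, the loop falls through to the next entry, which corresponds to allocating from a less suitable node.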
What’s the situation on UMA systems? Here, there is just a single node — no others. This node is shown
against a gray background in the figure. Everything else is unchanged.
3.2.2 Data Structures
Now that I have explained the relationship between the various data structures used in memory
management, let’s look at the definition of each.
Node Management
pg_data_t is the base element used to represent a node and is defined as follows:
<mmzone.h>
typedef struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];
	struct zonelist node_zonelists[MAX_ZONELISTS];
	int nr_zones;
	struct page *node_mem_map;
	struct bootmem_data *bdata;
	unsigned long node_start_pfn;
	unsigned long node_present_pages; /* total number of physical pages */
	unsigned long node_spanned_pages; /* total size of physical page
					     range, including holes */
	int node_id;
	struct pglist_data *pgdat_next;
	wait_queue_head_t kswapd_wait;
	struct task_struct *kswapd;
	int kswapd_max_order;
} pg_data_t;