Chapter 3: Memory Management
    for_each_online_node(nid) {
        pg_data_t *pgdat = NODE_DATA(nid);
        build_zonelists(pgdat);
        ...
    }
    return 0;
}
for_each_online_node iterates over all active nodes in the system. As UMA systems have only one
node, build_zonelists is invoked just once to create the zone lists for the whole of memory. NUMA
systems must invoke the function as many times as there are nodes; each invocation generates the zone
data for a different node.
build_zonelists expects as its parameter a pointer to a pg_data_t instance that contains all existing
information on the node's memory configuration and that will hold the newly created data structures.
On UMA systems, NODE_DATA returns the address of contig_page_data.
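The UMA case can be illustrated with a small user-space sketch. It is not kernel code: the sk_ prefixed names (sk_pgdat, sk_contig_page_data, SK_NODE_DATA, sk_build_zonelists, sk_build_all) are hypothetical stand-ins invented for this illustration, and the loop over node IDs only approximates what for_each_online_node does.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for pg_data_t. */
struct sk_pgdat {
    int node_id;
    int zonelists_built;   /* counts how often the zone lists were built */
};

/* On UMA there is exactly one node descriptor for all of memory. */
struct sk_pgdat sk_contig_page_data = { 0, 0 };

/* UMA stand-in for NODE_DATA: every node ID maps to the single descriptor. */
#define SK_NODE_DATA(nid) ((void)(nid), &sk_contig_page_data)

/* Placeholder for build_zonelists: only records that it was called. */
void sk_build_zonelists(struct sk_pgdat *pgdat)
{
    pgdat->zonelists_built++;
}

/* Visit every online node once and return the number of invocations. */
int sk_build_all(int nr_online_nodes)
{
    int calls = 0;

    for (int nid = 0; nid < nr_online_nodes; nid++) {
        struct sk_pgdat *pgdat = SK_NODE_DATA(nid);

        sk_build_zonelists(pgdat);
        calls++;
    }
    return calls;
}
```

With a single online node, as on UMA, sk_build_all(1) performs exactly one invocation on the one descriptor.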
The task of the function is to establish a ranking order between the zones of the node currently being
processed and the other nodes in the system; memory is then allocated according to this order. This
matters when no memory is free in the desired zone of the node.
Let us look at an example in which the kernel wants to allocate high memory. It first attempts to find a
free segment of suitable size in the highmem area of the current node. If it fails, it looks at the regular
memory area of the node. If this also fails, it tries to perform allocation in the DMA zone of the node. If it
cannot find a free area in any of the three local zones, it looks at other nodes. In this case, the alternative
node should be as close as possible to the primary node to minimize the performance loss caused by
accessing non-local memory.
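The search order in this example can be sketched as a flat list of (node, zone) pairs. The following is a simplified user-space illustration under stated assumptions, not the kernel's build_zonelists: the types sk_zone and sk_zoneref and the function sk_build_fallback are hypothetical, and the node order (local node first, then alternatives) is assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone types, ordered from most costly to cheapest. */
enum sk_zone { SK_ZONE_DMA, SK_ZONE_NORMAL, SK_ZONE_HIGHMEM, SK_NR_ZONES };

/* One entry of the fallback list: a zone on a particular node. */
struct sk_zoneref {
    int node;
    enum sk_zone zone;
};

/*
 * Fill out[] with the fallback order for an allocation whose preferred
 * zone is `highest`. node_order[] lists the local node first, followed
 * by the alternative nodes. On each node the zones are tried from
 * `highest` down to DMA before the next node is considered.
 * Returns the number of entries written (at most cap).
 */
size_t sk_build_fallback(const int *node_order, size_t nr_nodes,
                         enum sk_zone highest,
                         struct sk_zoneref *out, size_t cap)
{
    size_t n = 0;

    assert(highest < SK_NR_ZONES);

    for (size_t i = 0; i < nr_nodes; i++) {
        for (int z = (int)highest; z >= 0; z--) {
            if (n == cap)
                return n;
            out[n].node = node_order[i];
            out[n].zone = (enum sk_zone)z;
            n++;
        }
    }
    return n;
}
```

For a highmem request on node 1 with alternatives 2 and 0, the list starts with highmem, normal, and DMA on node 1 and only then moves on to node 2.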
The kernel defines a memory hierarchy and first tries to allocate "cheap" memory. If this fails, it gradually
tries to allocate memory that is "more costly" in terms of access and capacity.
The high memory (highmem) range is cheapest because no part of the kernel depends on memory
allocated from this area. There is no negative effect on the kernel if the highmem area is full, which is
why it is filled first.
The situation in regular memory is different. Many kernel data structures must be held in this area and
cannot be kept in highmem. The kernel is therefore faced with a critical situation if regular memory is
completely full. As a result, memory is not allocated from this area until there is no free memory in the
less critical highmem area.
Most costly is the DMA area because it is used for data transfer between peripherals and the system.
Memory allocation from this area is therefore a last resort.
The kernel also defines a ranking order among the alternative nodes as seen by the current memory
nodes. This helps determine an alternative node when all zones of the current node are full.
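One plausible way to obtain such a ranking is to sort the nodes by their access cost as seen from the current node. The sketch below assumes a caller-supplied array of distances and uses a plain insertion sort; the name sk_rank_nodes and the distance values are hypothetical, and the kernel's actual node selection is not reproduced here.

```c
#include <stddef.h>

/*
 * Order the node IDs 0..nr_nodes-1 by ascending distance.
 * distance[i] is the assumed cost of reaching node i from the current
 * node, so the current node (smallest distance) ends up first.
 * Ties keep the lower node ID first because the sort is stable.
 */
void sk_rank_nodes(const int *distance, size_t nr_nodes, int *order)
{
    for (size_t i = 0; i < nr_nodes; i++)
        order[i] = (int)i;

    /* Insertion sort: adequate for the small node counts assumed here. */
    for (size_t i = 1; i < nr_nodes; i++) {
        int node = order[i];
        size_t j = i;

        while (j > 0 && distance[order[j - 1]] > distance[node]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = node;
    }
}
```

The resulting order could then be used as the node sequence in which fallback zones are considered once all zones of the current node are full.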
The kernel uses an array of zonelist elements in pg_data_t to represent the described hierarchy as a
Data Structures