Chapter 3: Memory Management
To this end, it is necessary to introduce new management mechanisms that place a greater overhead
on the kernel. To minimize the impact of this extra burden on system performance, the implementation
of the management layer should be as compact as possible so that there is little noticeable effect on the
caches and TLBs of the processor. At the same time, the kernel must ensure that memory is utilized
speedily and efficiently. This problem is not specific to Linux; its look-alikes and all other
operating systems face it as well. Over the course of time, both good and bad solutions have been
proposed, and they are described in the general operating system literature (e.g., [Tan07]).
One such proposal — slab allocation — has proved to be very efficient for many workloads.
It was devised and implemented for Solaris 2.4 by Jeff Bonwick, a Sun employee. Because
he publicly documented his method [Bon94], it was also possible to implement a version
for Linux.
The provision of smaller memory blocks is not the only task of the slab allocator. Owing to its structure,
it also serves as a cache for objects that are frequently allocated and then released. By setting up a slab
cache, the kernel is able to keep a store of objects at the ready for subsequent use, even in an initialized
state, if so desired. For instance, the kernel must frequently generate new instances of struct fs_struct
to manage the filesystem data associated with a process (see Chapter 8). The memory blocks occupied
by instances of this type are reclaimed just as often (when a process terminates). In other words, the
kernel tends to allocate and release memory blocks of size sizeof(struct fs_struct) with great regularity. The slab
allocator keeps the returned memory blocks on an internal list and does not immediately give them back
to the buddy system. A recently returned block is then used when a new request is received for a fresh
instance of the object. This has two advantages. First, handling time is shorter because the kernel need
not apply the buddy system algorithms. Second, because the memory blocks are still "fresh," there is a
strong probability that they are still in one of the CPU caches.
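The recycling behavior described above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions, not the kernel's slab implementation: the names obj_cache, cache_alloc, cache_free, and cache_destroy are hypothetical, and malloc stands in for the buddy system as the slower backend allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a slab cache; not the kernel API. */
struct free_obj {
    struct free_obj *next;          /* link stored inside the freed object */
};

struct obj_cache {
    size_t obj_size;                /* size of every object in this cache */
    struct free_obj *free;          /* LIFO list of returned objects */
    unsigned long backend_calls;    /* how often the slow backend was used */
};

void cache_init(struct obj_cache *c, size_t obj_size)
{
    /* An object must be large enough to hold the free-list link. */
    c->obj_size = obj_size < sizeof(struct free_obj)
                      ? sizeof(struct free_obj) : obj_size;
    c->free = NULL;
    c->backend_calls = 0;
}

void *cache_alloc(struct obj_cache *c)
{
    if (c->free) {
        /* Fast path: reuse the most recently returned block, which is
         * the one most likely to still be in a CPU cache. */
        struct free_obj *o = c->free;
        c->free = o->next;
        return o;
    }
    /* Slow path: fall back to the backend (malloc here, standing in
     * for the buddy system). */
    c->backend_calls++;
    return malloc(c->obj_size);
}

void cache_free(struct obj_cache *c, void *p)
{
    /* Keep the block on the internal list instead of returning it
     * to the backend immediately. */
    struct free_obj *o = p;
    o->next = c->free;
    c->free = o;
}

void cache_destroy(struct obj_cache *c)
{
    /* Only now are the cached blocks handed back to the backend. */
    while (c->free) {
        struct free_obj *o = c->free;
        c->free = o->next;
        free(o);
    }
}
```

In this sketch, an allocation that follows a release returns the block that was just freed and does not touch the backend at all, which mirrors the two advantages named above: shorter handling time and a good chance that the block is still cache-hot.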
The slab allocator also has two further benefits:
❑ Calls to the buddy system are operations that have a considerable impact on the data and
instruction caches of the system. The more the kernel wastes these resources, the less they
are available for userspace processes. The more lightweight slab allocator dispenses with the
need for calls to the buddy system wherever possible and helps prevent undesirable cache
"contamination."
❑ Data stored in pages delivered directly by the buddy system is always clustered around
addresses divisible by powers of 2 (many other allocation methods that divide pages into
smaller blocks share this characteristic). This has a negative impact on CPU cache utilization
because some cache lines are overused owing to this kind of address distribution and others
are almost empty. This disadvantage can be even more drastic on multiprocessor systems if
different memory addresses are transferred on different buses because some buses may be
congested, while others are little used.
By means of slab coloring, the slab allocator is able to distribute objects uniformly to achieve
uniform cache utilization, as demonstrated below.
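The effect of coloring on cache-line usage can be illustrated with a small model in C. It is a sketch under assumed parameters, not the kernel's code: the cache geometry (64-byte lines, 64 sets, 4 KiB pages) is chosen so that one page spans exactly all sets, the number of colors is simply passed in as a parameter, and the helper names cache_set, color_offset, and distinct_sets are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry for illustration only. */
#define PAGE_SIZE 4096u
#define LINE_SIZE 64u
#define NUM_SETS  64u   /* one page covers exactly NUM_SETS cache lines */

/* Cache set that an address maps to in this simplified model. */
unsigned cache_set(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/* Offset of the first object in slab number slab_index when `colors`
 * different colors are available (colors must be at least 1).
 * Successive slabs are shifted by one cache line each, wrapping around. */
size_t color_offset(unsigned slab_index, unsigned colors)
{
    return (size_t)(slab_index % colors) * LINE_SIZE;
}

/* Number of distinct cache sets used by the first object of each of
 * nslabs page-aligned slabs. */
unsigned distinct_sets(unsigned nslabs, unsigned colors)
{
    unsigned char seen[NUM_SETS] = {0};
    unsigned count = 0;

    for (unsigned i = 0; i < nslabs; i++) {
        uintptr_t base = (uintptr_t)i * PAGE_SIZE;  /* page-aligned start */
        unsigned set = cache_set(base + color_offset(i, colors));
        if (!seen[set]) {
            seen[set] = 1;
            count++;
        }
    }
    return count;
}
```

With a single color (that is, no coloring), the first object of every page-aligned slab maps to the same cache set in this model, so distinct_sets(8, 1) yields 1. With four colors, the same eight slabs spread their first objects over four sets, so distinct_sets(8, 4) yields 4.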
That frequently used kernel objects are kept in the CPU cache is a desired effect. The earlier
comment that the large cache and TLB footprints of the buddy system are a disadvantage compared
with the slab allocator refers to the fact that unimportant data land in the CPU cache and
displace important data, a situation that should naturally be prevented.