Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 5: Locking and Interprocess Communication


systems connected via a network (this does not mean that they cannot be used to support communication
between two processes located on the same system).

Socket implementation is one of the rather more complex parts of the kernel because extensive abstraction
mechanisms are needed to hide the details of communication. From a user point of view, there is no
great difference whether communication is between two local processes on the same system or between
applications running on computers located on different continents. The implementation of this amazing
mechanism is discussed in depth in Chapter 12.
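
The user-side uniformity can be illustrated with a minimal user-space sketch. It assumes a POSIX system; the helper name local_roundtrip is hypothetical and chosen only for this illustration. Two connected local sockets are created with socketpair(), and data is exchanged with the same send() and recv() calls that an application would use on a network connection.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the book): sends msg over one end of a
 * local socket pair and reads it back from the other end into buf.
 * Returns the number of bytes received, or -1 on error.
 */
static inline ssize_t local_roundtrip(const char *msg, char *buf, size_t buflen)
{
    int sv[2];
    size_t len = strlen(msg);
    ssize_t n = -1;

    assert(buflen > 0);                     /* room for the terminating NUL */

    /* Two connected stream sockets in the local (Unix) domain. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* The same calls would be used on a TCP socket to a remote host. */
    if (send(sv[0], msg, len, 0) == (ssize_t)len)
        n = recv(sv[1], buf, buflen - 1, 0);
    if (n >= 0)
        buf[n] = '\0';

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Only the way the sockets are created and connected would differ for a remote peer; the data transfer calls stay the same, which is the point of the abstraction described above.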

During the development of kernel 2.6.26, the architecture-specific implementation of semaphores
was replaced with a generic variant. Naturally, the generic implementation performs slightly
less efficiently than optimized code, but since semaphores are not in very widespread use across the
kernel (mutexes are much more common), this does not really matter. The definition of struct
semaphore has been moved to include/linux/semaphore.h, and all operations are implemented in
kernel/semaphore.c. Most importantly, the semaphore API has not changed, so existing code will run
without modifications.
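
The idea behind a generic, architecture-independent semaphore (a counter protected by a lock, plus a way to put waiters to sleep) can be sketched in user space. The following is only a model under stated assumptions, not the code in kernel/semaphore.c: a pthread mutex and condition variable stand in for the kernel's internal lock and wait list, and all model_ names are hypothetical. The operations are named after the kernel's sema_init(), down(), down_trylock(), and up().

```c
#include <assert.h>
#include <pthread.h>

/* User-space model of a counting semaphore; all names are hypothetical. */
struct model_sem {
    pthread_mutex_t lock;   /* protects count */
    unsigned int count;     /* number of holders that may still enter */
    pthread_cond_t wait;    /* sleepers waiting for count to become > 0 */
};

static inline void model_sema_init(struct model_sem *s, unsigned int val)
{
    assert(s != NULL);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->wait, NULL);
    s->count = val;
}

/* Acquire the semaphore, sleeping while no unit is available. */
static inline void model_down(struct model_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->wait, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/*
 * Try to acquire without sleeping. Returns 0 on success and 1 on failure,
 * mirroring the return convention of the kernel's down_trylock().
 */
static inline int model_down_trylock(struct model_sem *s)
{
    int failed = 1;

    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        failed = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return failed;
}

/* Release one unit and wake a sleeper, if there is one. */
static inline void model_up(struct model_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->wait);
    pthread_mutex_unlock(&s->lock);
}
```

Because nothing in this scheme depends on processor-specific instructions, one implementation can serve every architecture, at the cost of some speed compared with hand-optimized code.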

Another change introduced during the development of kernel 2.6.26 concerns the implementation of
spinlocks. Because these locks are by definition supposed to be uncontended in the average case, the
kernel did not make any efforts to achieve fairness among multiple waiters, i.e., the order in which tasks
waiting for a spinlock to become unlocked are allowed to run after the lock is released by the current
holder was undefined. Measurements have shown, however, that this can lead to unfairness problems
on machines with a larger number of processors, e.g., 8-CPU systems. Since machines of this kind are no
longer uncommon, the implementation of spinlocks has been changed so that multiple waiters obtain
the lock in the same order in which they arrived. The API was left unchanged in this case as well,
so existing code will again run without modifications.
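
One common way to obtain this first-come, first-served behavior is a ticket lock. The sketch below is a user-space model built on C11 atomics, not the kernel's architecture-specific code; the structure and function names are hypothetical. Each arriving waiter draws a ticket number, and the lock is granted to tickets strictly in ascending order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space model of a ticket lock; all names are hypothetical. */
struct ticket_lock {
    atomic_uint next;    /* next ticket number to hand out */
    atomic_uint owner;   /* ticket number currently allowed to hold the lock */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Arrival: atomically draw a ticket. Arrival order defines service order. */
static inline unsigned int ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
}

/* True once the given ticket is the one being served. */
static inline bool ticket_is_served(struct ticket_lock *l, unsigned int t)
{
    return atomic_load_explicit(&l->owner, memory_order_acquire) == t;
}

/* Acquire: draw a ticket and busy-wait until it is served. */
static inline void ticket_lock(struct ticket_lock *l)
{
    unsigned int t = ticket_take(l);

    while (!ticket_is_served(l, t))
        ;   /* spin */
}

/* Release: pass the lock on to the holder of the next ticket. */
static inline void ticket_unlock(struct ticket_lock *l)
{
    /* Debug check: the lock must currently be held. */
    assert(atomic_load(&l->owner) != atomic_load(&l->next));
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

Callers only ever see ticket_lock() and ticket_unlock(), so the lock and unlock interface looks the same as for an unfair spinlock. This is consistent with the observation that the fairness change required no modifications to existing code.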

5.5 Summary


While systems with more than one CPU were oddities only a few years ago, recent achievements of
semiconductor engineering have changed this drastically. Thanks to multi-core CPUs, SMP computers
are not only found in specialized niches like number crunching and supercomputing, but on the average
desktop. This creates some unique challenges for the kernel: More than one instance of the kernel can
run simultaneously, and this requires coordinated manipulation of shared data structures. The kernel
provides a whole set of possibilities for this purpose, which I have discussed in this chapter. They range
from simple and fast spinlocks to the powerful read-copy-update mechanism, and allow for ensuring
correctness of parallel operations while preserving performance. Choosing the proper solution is
important, and I have also discussed the need to select an appropriate design that ensures performance by
fine-grained locking, but does not lead to too much overhead on smaller machines.

Problems similar to those in the kernel arise when userland tasks communicate with each other. Besides
providing means that allow otherwise separated processes to communicate, the kernel must also make
means of synchronization available to them. I have discussed how the mechanisms originally invented
in System V Unix are implemented in the Linux kernel.