Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 5: Locking and Interprocess Communication


A special feature of the BKL is that its lock depth is also counted. This means thatlock_kernel
can be invoked even when the kernel has already been locked. The matching unlocking operation
(unlock_kernel) must then be invoked the same number of times to unlock the kernel so that other
processors can enter it.

Although the BKL is still present at more than 1,000 points in the kernel, it is an obsolete concept whose
use is deprecated by kernel developers because it is a catastrophe in terms of performance and scalability.
New code shouldon no accountuse the lock but should instead adopt the finer-grained options described
above. Nevertheless, it will be a few more years before the BKL finally disappears — if ever at all.^5 The
kernel sources summarize the situation well:

lib/kernel_lock.c
/*
* lib/kernel_lock.c
*
* This is the traditional BKL - big kernel lock. Largely
* relegated to obsolescence, but used by various less
* important (or lazy) subsystems.
*/

Notice that SMP systems and UP systems with kernel preemption allow for preempting the big kernel
lock if the configuration optionPREEMPT_BKLis set, although I will not discuss this mechanism further.
While this helps to decrease kernel latency, it is not a cure for the problems created by the BKL and
should only be seen as an emergency measure that serves as good as possible as an interim solution.

5.2.8 Mutexes


Although semaphores can be used to implement the functionality of mutexes, the overhead imposed
by the generality of semaphores is often not necessary. Because of this, the kernel contains a separate
implementation of special-purpose mutexes that are not based on semaphores. Or, to be precise, the
kernel contains two implementations of mutexes: A classical variant, and real-time mutexes that allow
for solving priority inversion problems. I discuss both approaches in the following.

ClassicalMutexes


The basic data structure for classical mutexes is defined as follows:

<mutex.h>
struct mutex {
/* 1: unlocked, 0: locked, negative: locked, possible waiters */
atomic_t count;
spinlock_t wait_lock;
struct list_head wait_list;
};

The concept is rather simple:countis 1 if the mutex is unlocked. The locked case distinguishes between
two situations. If only a single process is using the mutex, thencountis set to 0. If the mutex is locked
and any processes are waiting on the mutex to be unlocked (and need be awoken when this happens),
countis negative. This special casing helps to speed up the code because in the usual case, no one will be
waiting on the mutex.

(^5) During the development of kernel 2.6.26, a special kernel tree whose purpose is to speed up BKL removal was created, and hope-
fully progress will be accelerated by this measure.

Free download pdf