Linux Kernel Architecture

Chapter 2: Process Management and Scheduling


Suppose that the virtual time of the run queue is given by fair_clock, while the waiting time of a process
is stored in wait_runtime. To sort tasks on the red-black tree, the kernel uses the difference fair_clock -
wait_runtime. While fair_clock is a measure for the CPU time a task would have received if scheduling
were completely fair, wait_runtime is a direct measure for the unfairness caused by the imperfection of
real systems.
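As a rough illustration, the ordering key can be pictured as in the following user-space sketch. The
names fair_clock and wait_runtime follow the text; the structures rq_model and task_model and the
helper tree_key are purely hypothetical, and the real kernel organizes the data differently.

/*
 * Minimal sketch of the sorting key described above (not kernel code).
 */
struct rq_model {
        double fair_clock;       /* per-run-queue virtual clock */
};

struct task_model {
        double wait_runtime;     /* unfairness accumulated by the task */
};

/* Smaller keys sort further to the left in the red-black tree; the
 * leftmost task is the one selected to run next. */
static double tree_key(const struct rq_model *rq, const struct task_model *t)
{
        return rq->fair_clock - t->wait_runtime;
}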

When a task is allowed to run, the interval during which it has been running is subtracted from
wait_runtime. This way, it will move rightward in the time-ordered tree at some point, and another
process will become the leftmost one, which is consequently selected to run. Notice, however, that the
virtual clock in fair_clock increases while the task is running. This effectively means that the share of
CPU time the task would have received in a perfectly fair system is deducted from the time spent
executing on the real CPU. This slows down the degradation of unfairness: decrementing wait_runtime is
equivalent to lowering the amount of unfairness suffered by the task, but the kernel must not forget
that some portion of the time used to lower the unfairness would have belonged to the process in a
completely fair world anyway. Suppose again that four processes sit on the run queue, and that a process
has been waiting for 20 real seconds. Now it is allowed to run for 10 seconds: wait_runtime is afterward
10, but since the process would have received 10/4 = 2.5 seconds of this time span anyway, effectively only
7.5 time units account for the new position on the run queue.
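The arithmetic of this example can be traced with a few lines of C. This is only an illustration of the
numbers above, not kernel code, and the variable names are invented for the purpose:

#include <stdio.h>

int main(void)
{
        double wait_runtime = 20.0;  /* seconds the task has been waiting */
        double delta_exec   = 10.0;  /* real CPU time it now consumes */
        int    nr_running   = 4;     /* tasks on the run queue */

        /* Running uses up accumulated unfairness ... */
        wait_runtime -= delta_exec;                    /* 20 - 10 = 10 */

        /* ... but a fair system would have granted the task its share of
         * the same interval anyway, so the virtual clock advances by it. */
        double fair_share = delta_exec / nr_running;   /* 10 / 4 = 2.5 */

        /* Credit that effectively determines the new tree position. */
        printf("wait_runtime = %.1f, fair share = %.1f, effective = %.1f\n",
               wait_runtime, fair_share, wait_runtime - fair_share);
        return 0;
}

Running it prints an effective value of 7.5, matching the reasoning in the text.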

Unfortunately, this strategy is complicated by a number of real-world issues:

❑ Different priority levels for tasks (i.e., nice values) must be taken into account, and more important
processes must get a higher share of CPU time than less important ones.
❑ Tasks must not be switched too often because a context switch, that is, changing from one task to
another, has a certain overhead. When switching happens too often, too much time is spent on
exchanging tasks and is therefore no longer available for productive work.
On the other hand, the time that elapses between task switches must not be too long because
large unfairness values could accumulate in this case. Letting tasks run for too long can also lead
to larger latencies than desired for multimedia applications.

We will see how the scheduler tackles these problems in the following discussion.

A good way to understand scheduling decisions is to activate scheduler statistics at compile time. This
generates the file /proc/sched_debug, which contains information on all aspects of the current state
of the scheduler.
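On a kernel built with CONFIG_SCHED_DEBUG, the file can simply be read like any other proc file, for
example with cat /proc/sched_debug. The small user-space program below does the same thing and is
given only as a sketch of how to inspect the file programmatically:

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/sched_debug", "r");
        char line[512];

        if (!f) {
                perror("/proc/sched_debug");
                return 1;
        }
        /* Copy the scheduler state dump to standard output. */
        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);
        fclose(f);
        return 0;
}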

Finally, note that the Documentation/ directory contains some files that relate to various aspects of the
scheduler. Keep in mind, however, that some of them still relate to the old O(1) scheduler and are
therefore outdated!

2.5.2 Data Structures


The scheduler uses a series of data structures to sort and manage the processes in the system. How the
scheduler works is closely linked with the design of these structures. Several components interact with
each other in many ways, and Figure 2-13 provides a first overview of the connections.

Scheduling can be activated in two ways: either directly if a task goes to sleep or wants to yield the
CPU for other reasons, or by a periodic mechanism that is run with constant frequency and that checks
from time to time if switching tasks is necessary. I denote these two components generic scheduler or
core scheduler.
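To make the two activation paths more concrete, the following kernel-style sketch shows the common
wait-queue idiom by which a task puts itself to sleep and enters the scheduler directly. The helper
wait_for_condition() is hypothetical; DEFINE_WAIT, prepare_to_wait(), finish_wait(), schedule(), and
scheduler_tick() are the standard kernel interfaces. The periodic path is only described in the comment,
since it is driven by the timer interrupt rather than by code a driver would write itself.

#include <linux/sched.h>
#include <linux/wait.h>

/*
 * Direct activation: the task blocks until *condition becomes nonzero.
 * It marks itself as sleeping and calls schedule() itself, voluntarily
 * giving up the CPU.
 */
static void wait_for_condition(wait_queue_head_t *wq, int *condition)
{
        DEFINE_WAIT(wait);

        while (!*condition) {
                prepare_to_wait(wq, &wait, TASK_INTERRUPTIBLE);
                if (!*condition)
                        schedule();     /* direct entry into the scheduler */
        }
        finish_wait(wq, &wait);
}

/*
 * Periodic activation: scheduler_tick() is invoked from the timer
 * interrupt with constant frequency. It does not switch tasks itself;
 * if the current task has run long enough, it flags the task for
 * rescheduling, and schedule() is called later at a safe point, for
 * example, on the return to user space.
 */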