Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 2: Process Management and Scheduling


One really crucial point of the completely fair scheduler is that sorting processes on the red-black tree is
based on the following key:

kernel/sched_fair.c
static inline s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
return se->vruntime - cfs_rq->min_vruntime;
}

Elements with a smaller key will be placed more to the left, and thus be scheduled more quickly. This
way, the kernel implements two antagonistic mechanisms:


  1. When a process is running, itsvruntimewill steadily increase, so it will finally move right-
    ward in the red-black tree.
    Becausevruntimewill increasemore slowlyfor more important processes, they will also
    move rightward more slowly, so their chance to be scheduled is bigger than for a less impor-
    tant process — just as required.

  2. If a process sleeps, itsvruntimewill remain unchanged. Because the per-queue
    min_vruntimeincreases in the meantime (recall that it is monotonic!), the sleeper will be
    placed more to the left after waking up because the key gotsmaller.^28


In practice, both effects naturally happen simultaneously, but this does not influence the interpretation.
Figure 2-19 illustrates the different movement mechanisms on the red-black tree graphically.

min_vruntime

vruntime vruntime

Value increases

Value decreases
Position in the red-
black tree (more to
the left is better)
Figure 2-19: Influence of the per-entity and
per-queue virtual times on the placement of
processes in the red-black tree.

LatencyTracking


The kernel has a built-in notion of what it considers a good scheduling latency, that is, the interval
during which every runnabletask should run at least once.^29 It is given insysctl_sched_latency,which
can be controlled via/proc/sys/kernel/sched_latency_nsand defaults to, respectively, 20,000,000 ns
(nanoseconds) and 20 ms (milliseconds). A second control parameter,sched_nr_latency, controls the
number of active processes that are at most handled in one latency period. If the number of active pro-
cesses grows larger than this bound, the latency period is extended linearly.sched_nr_latencycan be
indirectly controlled viasysctl_sched_min_granularity, which can be set via/proc/sys/kernel/
sched_min_granularity_ns. The default value is 4,000,000 ns, that is, 4 ms, andsched_nr_latencyis

(^28) This is slightly different for short sleepers, but I consider this situation when I discuss the exact mechanism.
(^29) Caution: This has nothing to do with time slices, which were used by the old scheduler!

Free download pdf