Chapter 2: Process Management and Scheduling
This method is only one way of triggering kernel preemption. Preemption can also be activated
after a hardware IRQ has been serviced. If the processor returns to kernel mode after handling the IRQ
(return to user mode is not affected), the architecture-specific assembler routine checks whether the value
of the preemption counter is 0 (that is, whether preemption is allowed) and whether the reschedule flag
is set, exactly as in preempt_schedule. If both conditions are satisfied, the scheduler is invoked, this
time via preempt_schedule_irq to indicate that the preemption request originated from IRQ context.
The essential difference between this function and preempt_schedule is that preempt_schedule_irq is
called with IRQs disabled to prevent recursive calls for simultaneous IRQs.
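The decision taken on the IRQ return path can be sketched as a small user-space model. This is only an illustration of the two conditions described above, not the architecture-specific assembler code; struct task_model, the enum values, and irq_return_check are hypothetical names introduced for this sketch.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-task state that is checked. */
struct task_model {
	int  preempt_count;	/* 0 means that preemption is allowed */
	bool need_resched;	/* models the reschedule flag */
};

enum irq_return_action {
	USER_RETURN_PATH,		/* return to user mode: not modeled here */
	RESUME_INTERRUPTED_CODE,	/* continue the interrupted kernel code */
	CALL_PREEMPT_SCHEDULE_IRQ	/* invoke the scheduler from IRQ context */
};

/*
 * Model of the check performed after a hardware IRQ has been serviced.
 * returning_to_kernel is true if the interrupted code ran in kernel mode.
 */
enum irq_return_action irq_return_check(const struct task_model *t,
					bool returning_to_kernel)
{
	if (!returning_to_kernel)
		return USER_RETURN_PATH;

	/* Both conditions must hold, as in preempt_schedule. */
	if (t->preempt_count == 0 && t->need_resched)
		return CALL_PREEMPT_SCHEDULE_IRQ;

	return RESUME_INTERRUPTED_CODE;
}
```

A task with a preemption counter of 0 and a set reschedule flag yields CALL_PREEMPT_SCHEDULE_IRQ; a nonzero counter or a cleared flag lets the interrupted kernel code continue.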
As a result of the methods described in this section, a kernel with enabled preemption is able to replace
processes with more urgent ones faster than a normal kernel could.
Low Latency
Naturally, the kernel is interested in providing good latency times even if kernel preemption is not
enabled. This can, for instance, be important in network servers. While the overhead introduced by
kernel preemption is not desired in such an environment, the kernel should nevertheless respond to
important events with reasonable speed. If, for example, a network request comes in that needs to be
serviced by a daemon, then this should not be delayed for too long by some database doing heavy I/O
operations. I have already discussed a number of measures offered by the kernel to reduce this problem:
scheduling latency in CFS and kernel preemption. Real-time mutexes as discussed in Chapter 5 also aid
in solving the problem, but there is one more scheduling-related action that can help.
Basically, long operations in the kernel should not occupy the system completely. Instead, they should
check from time to time whether another process has become ready to run and, if so, call the scheduler to
select it. This mechanism is independent of kernel preemption and also reduces latency if the
kernel is built without explicit preemption support.
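The effect of such voluntary checks on latency can be illustrated with a toy model. The following sketch is plain user-space C, not kernel code; the function steps_until_resched and its parameters are hypothetical and exist only to count how long a newly runnable process would have to wait.

```c
#include <stdbool.h>

/*
 * Toy model: a long operation runs for total_steps iterations. At iteration
 * wakeup_step another process becomes ready to run, which is modeled by
 * raising a reschedule flag. The function returns how many further
 * iterations pass before the scheduler would be called, or -1 if no
 * process became runnable while the operation was in progress.
 */
int steps_until_resched(int total_steps, int wakeup_step, bool check_each_step)
{
	bool need_resched = false;	/* models the reschedule flag */

	for (int step = 0; step < total_steps; step++) {
		if (step == wakeup_step)
			need_resched = true;	/* e.g., raised by an interrupt */

		/* ... one unit of the long-running work would happen here ... */

		if (check_each_step && need_resched)
			return step - wakeup_step;	/* scheduler runs now */
	}

	/* Without voluntary checks, the waiting process runs after the loop. */
	return need_resched ? total_steps - wakeup_step : -1;
}
```

With 1,000 iterations and a wakeup at iteration 10, the model reports a delay of 0 iterations when the flag is checked in every pass, but 990 iterations when the operation never checks.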
The function to initiate conditional rescheduling is cond_resched. It is implemented as follows:
kernel/sched.c
int __sched cond_resched(void)
{
	if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE)) {
		__cond_resched();
		return 1;
	}
	return 0;
}
need_resched checks if the TIF_NEED_RESCHED flag is set, and the code additionally ensures that the kernel
is not already being preempted^34 and rescheduling is thus allowed. Should both conditions
be fulfilled, then __cond_resched takes care of the necessary details to invoke the scheduler.
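To make the return value and the effect of the PREEMPT_ACTIVE test tangible, the logic can be replayed in user space. The following is a minimal sketch under stated assumptions: struct cpu_model, MODEL_PREEMPT_ACTIVE (with an arbitrarily chosen bit value), and the model_ functions are hypothetical stand-ins, and the model assumes that a scheduler invocation clears the reschedule flag.

```c
#include <stdbool.h>

/* Illustrative bit only; the real value of PREEMPT_ACTIVE is not assumed. */
#define MODEL_PREEMPT_ACTIVE 0x10000000u

/* Hypothetical stand-in for the state that cond_resched inspects. */
struct cpu_model {
	unsigned int preempt_count;	/* models preempt_count() */
	bool need_resched;		/* models TIF_NEED_RESCHED */
	int schedule_calls;		/* how often the scheduler was invoked */
};

/* Models the scheduler invocation: the flag is cleared, the call counted. */
static void model_do_resched(struct cpu_model *cpu)
{
	cpu->need_resched = false;
	cpu->schedule_calls++;
}

/* Returns 1 if the scheduler was invoked, and 0 otherwise. */
int model_cond_resched(struct cpu_model *cpu)
{
	if (cpu->need_resched &&
	    !(cpu->preempt_count & MODEL_PREEMPT_ACTIVE)) {
		model_do_resched(cpu);
		return 1;
	}
	return 0;
}
```

In this model, a set flag leads to exactly one scheduler invocation and a return value of 1. A second call returns 0 because the flag has been cleared, and a call made while the active bit is set in the counter returns 0 even if the flag is set.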
How can cond_resched be used? As an example, consider the case in which the kernel reads in memory
pages associated with a given memory mapping. This could be done in an endless loop that terminates
after all required data have been read:
for (;;) {
	/* Read in data */
	if (exit_condition)
		break;
}
(^34) Additionally, the function also makes sure that the system is completely up and running, which is, for instance, not the case if the
system has not finished booting yet. Since this is an unimportant corner case, I have omitted the corresponding check.