Chapter 2: Process Management and Scheduling
❑ The kernel must be able to migrate processes from one CPU to another. However, this option
must be used with great care because it can severely impair performance. CPU caches are the
biggest problem on smaller SMP systems. For really big systems, a CPU can be located literally
some meters away from the memory previously used, so access to it will be very costly.
The affinity of a task to particular CPUs is defined in the cpus_allowed element of the task structure
specified above. Linux provides the sched_setaffinity system call to change this assignment.
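From userspace, the affinity can be changed through the glibc wrapper of the same name. The following is a minimal sketch, assuming a Linux system with glibc; the helper names first_allowed_cpu, pin_self_to_cpu, and allowed_cpu_count are hypothetical and serve only as illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: lowest-numbered CPU the caller may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;
    int cpu;

    /* A pid of 0 refers to the calling thread. */
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the caller to one CPU.
 * Returns 0 on success and -1 on error, like the system call itself. */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t mask;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);  /* anything else is a caller bug */
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Hypothetical helper: number of CPUs in the caller's affinity mask, or -1. */
int allowed_cpu_count(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}
```

Calling pin_self_to_cpu(first_allowed_cpu()) confines the caller to a single CPU. The sketch picks a CPU from the current mask rather than hard-coding CPU 0, because a restricted environment may not permit every CPU.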
Data Structures
The scheduling methods that each scheduler class must provide are augmented by two additional
functions on SMP systems:
<sched.h>
struct sched_class {
...
#ifdef CONFIG_SMP
        unsigned long (*load_balance) (struct rq *this_rq, int this_cpu,
                        struct rq *busiest, unsigned long max_load_move,
                        struct sched_domain *sd, enum cpu_idle_type idle,
                        int *all_pinned, int *this_best_prio);
        int (*move_one_task) (struct rq *this_rq, int this_cpu,
                        struct rq *busiest, struct sched_domain *sd,
                        enum cpu_idle_type idle);
#endif
...
};
Despite their names, these functions are not directly responsible for load balancing. They
are called by the core scheduler code whenever the kernel deems rebalancing necessary. The scheduler
class-specific functions then set up an iterator that allows the generic code to walk through all processes
that are potential candidates to be moved to another queue. Thanks to the iterator, the internal structures
of the individual scheduler classes do not have to be exposed to the generic code. load_balance employs
the generic function load_balance, while move_one_task uses iter_move_one_task. The functions serve
different purposes:
❑ iter_move_one_task picks one task off the busy run queue busiest and moves it to the run
queue of the current CPU.
❑ load_balance is allowed to distribute multiple tasks from the busiest run queue to the current
CPU, but must not move more load than specified by max_load_move.
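The division of labor between class-specific iterator and generic code can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not the kernel implementation: all toy_* names and types are hypothetical, the run queues are reduced to a load counter, and a task is "moved" by changing its cpu field. The generic functions see nothing but the start and next callbacks of the iterator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task and run queue types. */
struct toy_task { int cpu; unsigned long load; int pinned; };
struct toy_rq   { int cpu; unsigned long load; };

/* The only interface the generic code gets from a scheduler class. */
struct toy_iterator {
    void *arg;
    struct toy_task *(*start)(void *arg);
    struct toy_task *(*next)(void *arg);
};

/* Class-private state: the generic code never looks inside. */
struct toy_class_state {
    struct toy_task *tasks;
    size_t ntasks;
    size_t pos;   /* index of the next task to inspect */
    int cpu;      /* CPU whose queue is being walked */
};

/* Return the next task queued on st->cpu, or NULL when exhausted. */
struct toy_task *toy_next(void *arg)
{
    struct toy_class_state *st = arg;

    while (st->pos < st->ntasks) {
        struct toy_task *t = &st->tasks[st->pos++];
        if (t->cpu == st->cpu)
            return t;
    }
    return NULL;
}

/* Rewind the walk and return the first candidate. */
struct toy_task *toy_start(void *arg)
{
    struct toy_class_state *st = arg;

    st->pos = 0;
    return toy_next(arg);
}

/* Move one task from busiest to this_rq and update both load counters. */
static void toy_pull(struct toy_rq *busiest, struct toy_rq *this_rq,
                     struct toy_task *t)
{
    busiest->load -= t->load;
    this_rq->load += t->load;
    t->cpu = this_rq->cpu;
}

/* Generic code: move several tasks, but never more than max_load_move. */
unsigned long toy_balance_tasks(struct toy_rq *this_rq, struct toy_rq *busiest,
                                unsigned long max_load_move,
                                struct toy_iterator *it)
{
    unsigned long moved = 0;
    struct toy_task *t;

    for (t = it->start(it->arg); t; t = it->next(it->arg)) {
        /* Skip pinned tasks and tasks that would exceed the budget. */
        if (t->pinned || t->load > max_load_move - moved)
            continue;
        toy_pull(busiest, this_rq, t);
        moved += t->load;
    }
    assert(moved <= max_load_move);
    return moved;
}

/* Generic code: move exactly one movable task; returns 1 on success. */
int toy_move_one_task(struct toy_rq *this_rq, struct toy_rq *busiest,
                      struct toy_iterator *it)
{
    struct toy_task *t;

    for (t = it->start(it->arg); t; t = it->next(it->arg)) {
        if (t->pinned)
            continue;
        toy_pull(busiest, this_rq, t);
        return 1;
    }
    return 0;
}
```

Because the iterator advances its position before handing out a task, the generic code can safely move the returned task without disturbing the walk. A different scheduler class could back the same two callbacks with a tree or a priority array, and the generic functions would not change.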
How is load balancing initiated? On SMP systems, the periodic scheduler function scheduler_tick
invokes the trigger_load_balance function after completing the tasks required for all systems, as
described above. This raises the SCHED_SOFTIRQ softIRQ (the software analog to hardware interrupts;
see Chapter 14 for more details), which, in turn, guarantees that run_rebalance_domains will be run in
due time. This function finally invokes load balancing for the current CPU by calling rebalance_domains.
The time flow is illustrated in Figure 2-25.
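The deferred nature of this call chain can be modeled in a few lines of userspace C. The sketch below is an illustration under stated assumptions, not kernel code: every toy_* name is hypothetical, the softIRQ mechanism is reduced to a pending bitmask, and the trigger condition (a simple comparison of a tick counter against a next_balance deadline) is an assumption made for the example. The point is that the tick only marks the softIRQ as pending; the actual rebalancing runs later, when pending softIRQs are processed.

```c
#include <assert.h>

enum { TOY_SCHED_SOFTIRQ = 0 };   /* hypothetical softIRQ number */

/* Hypothetical per-CPU state for the model. */
struct toy_cpu {
    unsigned long jiffies;         /* tick counter */
    unsigned long next_balance;    /* earliest tick for the next rebalance */
    unsigned long interval;        /* assumed fixed distance between rebalances */
    unsigned int softirq_pending;  /* one bit per softIRQ number */
    unsigned int rebalance_runs;   /* how often rebalancing actually ran */
};

/* Mark a softIRQ as pending; nothing is executed yet. */
static void toy_raise_softirq(struct toy_cpu *c, int nr)
{
    assert(nr >= 0 && nr < 32);
    c->softirq_pending |= 1u << nr;
}

/* Assumed trigger condition: the deadline has been reached.
 * (A plain comparison ignores counter wraparound.) */
void toy_trigger_load_balance(struct toy_cpu *c)
{
    if (c->jiffies >= c->next_balance)
        toy_raise_softirq(c, TOY_SCHED_SOFTIRQ);
}

/* Periodic tick: common work first, then the SMP-specific trigger. */
void toy_scheduler_tick(struct toy_cpu *c)
{
    c->jiffies++;
    /* ... tasks required on all systems would run here ... */
    toy_trigger_load_balance(c);
}

/* Stand-in for the real balancing work on the current CPU. */
static void toy_rebalance_domains(struct toy_cpu *c)
{
    c->rebalance_runs++;
    c->next_balance = c->jiffies + c->interval;
}

/* SoftIRQ handler: delegates to the balancing function. */
static void toy_run_rebalance_domains(struct toy_cpu *c)
{
    toy_rebalance_domains(c);
}

/* Process pending softIRQs at some later, convenient point in time. */
void toy_do_softirq(struct toy_cpu *c)
{
    if (c->softirq_pending & (1u << TOY_SCHED_SOFTIRQ)) {
        c->softirq_pending &= ~(1u << TOY_SCHED_SOFTIRQ);
        toy_run_rebalance_domains(c);
    }
}
```

In this model, three ticks with next_balance set to 3 leave the softIRQ pending without any rebalancing having run; only the following toy_do_softirq call performs it and pushes the deadline forward by interval ticks.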