Linux Kernel Architecture


Chapter 18: Page Reclaim and Swapping


18.9.1 Periodic Reclaim with kswapd


kswapd is a kernel daemon that is activated by kswapd_init each time the system is started and continues
to execute for as long as the machine is running:

mm/vmscan.c
int kswapd_run(int nid)
{
        pg_data_t *pgdat = NODE_DATA(nid);
        int ret = 0;
...
        pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid);
...
        return ret;
}

static int __init kswapd_init(void)
{
        int nid;        /* node index used by the iterator below */

        swap_setup();
        for_each_node_state(nid, N_HIGH_MEMORY)
                kswapd_run(nid);

        return 0;
}

The code shows that a separate instance of kswapd is started for each NUMA node that has memory. On
some machines, this improves system performance because each daemon works on the memory of its own
node, which compensates for the different access speeds to the various memory areas. Non-NUMA systems
use only a single kswapd, though.
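
The per-node startup can be illustrated outside the kernel. The following user-space C sketch is only an
analogy under stated assumptions: the node_has_memory array stands in for the N_HIGH_MEMORY node state,
and struct worker, MAX_NODES, and start_reclaim_workers are hypothetical names that do not exist in the
kernel. No threads are created; the sketch merely records one worker per node with memory and names it
with the same "kswapd%d" pattern that is passed to kthread_run above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_NODES 8     /* hypothetical upper bound on the number of nodes */

/* Hypothetical stand-in for the per-node daemon handle (pgdat->kswapd). */
struct worker {
        bool started;
        char name[16];
};

/*
 * Record one worker per node that has memory and skip all other nodes,
 * mimicking the iteration over nodes in the N_HIGH_MEMORY state.
 * Returns the number of workers recorded as started.
 */
int start_reclaim_workers(const bool node_has_memory[], int nr_nodes,
                          struct worker workers[])
{
        int started = 0;

        assert(nr_nodes >= 0 && nr_nodes <= MAX_NODES);
        memset(workers, 0, (size_t)nr_nodes * sizeof(workers[0]));

        for (int nid = 0; nid < nr_nodes; nid++) {
                if (!node_has_memory[nid])
                        continue;       /* no memory, no daemon */
                /* Same naming pattern as the kthread_run() call. */
                snprintf(workers[nid].name, sizeof(workers[nid].name),
                         "kswapd%d", nid);
                workers[nid].started = true;
                started++;
        }
        return started;
}
```

Called with three nodes of which only nodes 0 and 2 have memory, the function records two workers named
kswapd0 and kswapd2. On a non-NUMA machine there is a single node, so only one worker, kswapd0, results.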

More interesting is the execution of the kswapd daemon, implemented in kswapd from mm/vmscan.c. Once
the necessary initialization work has been completed,^15 the following endless loop is executed:

mm/vmscan.c
static int kswapd(void *p)
{
        unsigned long order;
        pg_data_t *pgdat = (pg_data_t*)p;
        struct task_struct *tsk = current;
        DEFINE_WAIT(wait);
...
        current->reclaim_state = &reclaim_state;

        tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
...
        order = 0;
        for ( ; ; ) {
                unsigned long new_order;

                prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
                new_order = pgdat->kswapd_max_order;
                pgdat->kswapd_max_order = 0;

(^15) On NUMA systems, set_cpus_allowed restricts execution of the daemon to the processors associated with its memory node.
