Linux Kernel Architecture


Chapter 18: Page Reclaim and Swapping


When the area with the lowest priority is reached, the kernel starts searching again from the beginning
(i.e., in the area with the highest priority). The search is terminated if no free entry is found after all swap
areas in the system have been traversed. The kernel is then not able to swap out the page, and this fact is
reported by returning the page number 0 to the calling code.
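
The wrap-around search described above can be sketched in user-space C. This is a minimal illustration, not the kernel's code: struct toy_swap_area, its fields, and toy_get_swap_slot are hypothetical names, and the areas are assumed to sit in an array sorted by descending priority.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a swap area (not swap_info_struct). */
struct toy_swap_area {
    int priority;              /* higher value = preferred area */
    unsigned long free_slots;  /* number of slots still unused */
    unsigned long next_slot;   /* next slot number to hand out; 0 is never used */
};

/*
 * Assumption: areas[] is sorted by descending priority. The search starts at
 * index `start`, wraps to index 0 after the last (lowest-priority) area, and
 * stops once every area has been visited exactly once.
 * Returns a slot number > 0 and stores the area index in *area_out,
 * or returns 0 if no area has a free slot.
 */
unsigned long toy_get_swap_slot(struct toy_swap_area *areas, size_t n,
                                size_t start, size_t *area_out)
{
    for (size_t visited = 0; visited < n; visited++) {
        size_t i = (start + visited) % n;   /* wrap around to the beginning */

        if (areas[i].free_slots > 0) {
            areas[i].free_slots--;
            *area_out = i;
            return areas[i].next_slot++;
        }
    }
    return 0;   /* all areas traversed without success */
}
```

A caller would treat a return value of 0 as "the page cannot be swapped out", mirroring the convention in the text.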


How are the slot bitmaps of the individual swap areas scanned? Empty entries are recognized because
their usage counters equal 0. scan_swap_map therefore scans the swap_map array of the relevant swap
partition for such entries, but this is made a little more difficult by swap clustering. A cluster consists of
SWAPFILE_CLUSTER contiguous entries into which pages are written sequentially. The kernel first deals
with the situation in which there is no free entry in the cluster. Since this is rarely the case, I postpone a
discussion of the appropriate code until later.^6
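
To make the idea of a cluster as a run of contiguous entries concrete, the following sketch looks for the first run of consecutive entries whose usage counter is 0. It is an illustration only and does not reproduce the kernel code whose discussion is postponed: toy_find_free_run and its parameters are hypothetical, and the cluster length is passed in as an argument rather than taken from SWAPFILE_CLUSTER.

```c
#include <stddef.h>

/*
 * Search map[first..last] (inclusive) for the first run of `cluster`
 * consecutive entries whose usage counter is 0.
 * Returns the index of the first entry of that run, or 0 if there is none.
 * Assumption: index 0 is never a valid result, so 0 can signal failure.
 */
unsigned long toy_find_free_run(const unsigned short *map,
                                unsigned long first, unsigned long last,
                                unsigned long cluster)
{
    unsigned long run = 0;   /* length of the current run of free entries */

    if (cluster == 0 || first == 0 || first > last)
        return 0;

    for (unsigned long i = first; i <= last; i++) {
        if (map[i] == 0) {
            if (++run == cluster)
                return i - cluster + 1;
        } else {
            run = 0;         /* an entry in use interrupts the run */
        }
    }
    return 0;
}
```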


mm/swapfile.c
static inline unsigned long scan_swap_map(struct swap_info_struct *si)
{
    unsigned long offset, last_in_cluster;
    ...
    if (unlikely(!si->cluster_nr)) {
        /* Find new cluster */
    }

We assume that si->cluster_nr is greater than 0, indicating that the current cluster still has free slots
(recall that cluster_nr specifies the number of free slots in the current cluster). Once the kernel has
ensured that the current offset does not exceed the limit set by swap_info->highest_bit, it checks
whether the swap counter of the entry is 0 at the proposed position, indicating that the entry is available
for use:


mm/swapfile.c
    ...
    si->cluster_nr--;
cluster:
    offset = si->cluster_next;
    if (offset > si->highest_bit)
lowest: offset = si->lowest_bit;
    if (!si->highest_bit)
        goto no_page;
    if (!si->swap_map[offset]) {
        if (offset == si->lowest_bit)
            si->lowest_bit++;
        if (offset == si->highest_bit)
            si->highest_bit--;
        si->inuse_pages++;
        if (si->inuse_pages == si->pages) {
            si->lowest_bit = si->max;
            si->highest_bit = 0;
        }
        si->swap_map[offset] = 1;
        si->cluster_next = offset + 1;
        ...
        return offset;
    }
    ...
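
The bookkeeping in the excerpt above can be tried out in user space with a small model. The sketch below is a simplification under stated assumptions: struct toy_swap_info and toy_try_next_slot are hypothetical names, only the fields that appear in the excerpt are modeled, and where the kernel would go on scanning after finding an occupied entry (code elided in the excerpt), the model simply gives up and returns 0.

```c
#include <stddef.h>

/* Hypothetical user-space model of the fields used in the excerpt. */
struct toy_swap_info {
    unsigned short *swap_map;   /* usage counter per slot; 0 = free */
    unsigned long lowest_bit;   /* lowest slot that may be free */
    unsigned long highest_bit;  /* highest slot that may be free */
    unsigned long cluster_next; /* next slot to try in the current cluster */
    unsigned long pages;        /* number of usable slots */
    unsigned long inuse_pages;  /* number of slots currently in use */
    unsigned long max;          /* one past the last slot index */
};

/* Try to claim the slot at cluster_next; returns its offset, or 0 on failure. */
unsigned long toy_try_next_slot(struct toy_swap_info *si)
{
    unsigned long offset = si->cluster_next;

    if (offset > si->highest_bit)
        offset = si->lowest_bit;
    if (!si->highest_bit)
        return 0;                   /* area is full (the no_page case) */
    if (si->swap_map[offset] != 0)
        return 0;                   /* simplification: no further scanning */

    /* Shrink the window of possibly free slots if a boundary was taken. */
    if (offset == si->lowest_bit)
        si->lowest_bit++;
    if (offset == si->highest_bit)
        si->highest_bit--;
    si->inuse_pages++;
    if (si->inuse_pages == si->pages) {
        /* Every slot is in use: mark the window as empty. */
        si->lowest_bit = si->max;
        si->highest_bit = 0;
    }
    si->swap_map[offset] = 1;       /* one user of this slot */
    si->cluster_next = offset + 1;
    return offset;
}
```

With three usable slots (indices 1 to 3, index 0 marked as in use), three successive calls hand out offsets 1, 2, and 3; the third call also sets lowest_bit to max and highest_bit to 0, so a fourth call fails with 0.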

(^6) The implementation still includes a few explicit scheduler calls, not reproduced here. They are executed to minimize kernel latency
times when the kernel spends too long searching for a free swap slot.
