Linux Kernel Architecture


Chapter 18: Page Reclaim and Swapping


If the page is not in the swap cache, the kernel must not only cause the page to be read, but must also
initiate a readahead operation to read a few pages in anticipation:

❑ grab_swap_token grabs the swap token as described before.
❑ swapin_readahead performs the readahead. As a result, read requests are issued
not only for the desired page but also for a few pages in the adjacent slots. This requires
relatively little effort but speeds things up considerably because processes very often access
the data they need from memory sequentially. When this happens, the corresponding pages will
have already been read into memory by the readahead mechanism.
❑ read_swap_cache_async is called once more for the presently required page. As the
function name indicates, the read operation is asynchronous. However, the kernel uses a trick
to ensure that the required data have been read in before further work is commenced.
read_swap_cache_async locks the page before a read request is submitted to the block layer.
When the block layer has finished the data transfer, the page is unlocked. Therefore, it is
sufficient to call lock_page in do_swap_page to lock the page: the operation will have to
wait until the block layer unlocks the page. Unlocking the page from the block layer's side
thus confirms that the read request has been completed.
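
The locking trick from the last item can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: toy_page, toy_block_layer, and all toy_* functions are hypothetical names, the model is single-threaded, and "sleeping" in the lock operation is simulated by letting the toy block layer complete its pending reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE   16
#define TOY_MAX_PENDING 8

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct toy_page {
    bool locked;                /* models the page lock */
    bool uptodate;              /* set once the data have arrived */
    char data[TOY_PAGE_SIZE];
};

/* Hypothetical "block layer": a queue of reads that are still in flight. */
struct toy_block_layer {
    struct toy_page *pending[TOY_MAX_PENDING];
    size_t count;
};

/* Lock the page first, then queue the read. Returns without waiting. */
bool toy_read_async(struct toy_block_layer *bl, struct toy_page *page)
{
    assert(bl != NULL && page != NULL);
    if (bl->count == TOY_MAX_PENDING)
        return false;
    page->locked = true;
    page->uptodate = false;
    bl->pending[bl->count++] = page;
    return true;
}

/* Complete the oldest pending read: fill in the data, then unlock the page. */
bool toy_complete_one(struct toy_block_layer *bl)
{
    struct toy_page *page;

    if (bl->count == 0)
        return false;
    page = bl->pending[0];
    memmove(&bl->pending[0], &bl->pending[1],
            (bl->count - 1) * sizeof(bl->pending[0]));
    bl->count--;
    memset(page->data, 'S', TOY_PAGE_SIZE);
    page->uptodate = true;
    page->locked = false;       /* the unlock signals "read finished" */
    return true;
}

/* Model of the lock operation: wait until the page is unlocked, then lock it.
 * Waiting is simulated by letting the toy block layer make progress. */
bool toy_lock_page(struct toy_block_layer *bl, struct toy_page *page)
{
    while (page->locked) {
        if (!toy_complete_one(bl))
            return false;       /* nothing in flight that could unlock the page */
    }
    page->locked = true;
    return true;
}
```

After toy_read_async has queued a page, a successful toy_lock_page on that page implies that its data are present, which mirrors why one lock call is enough to wait for an asynchronous read.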

I take a look at the implementation of these two actions below.

Once the page has been swapped in (if necessary), the following points must be addressed regardless of
whether the page came from the page cache or had to be read from a block device.

The page is first marked with mark_page_accessed so that the kernel regards it as accessed; recall the
state diagram in Figure 18-13 in this context. It is then inserted in the page tables of the process, and the
corresponding caches are flushed if necessary. Thereafter, page_add_anon_rmap is invoked to include
the page in the reverse mapping mechanism discussed in Chapter 4. The familiar swap_free function
then checks whether the slot in the swap area can be freed. This also ensures that the usage counter in
the swap data structure is decremented by 1. If the slot is no longer needed, the routine modifies the
lowest_bit or highest_bit fields of the swap_info instance provided the swap page is at one of its
two ends.

If the page is accessed in Read/Write mode, the kernel must conclude the operation by invoking
do_wp_page. This creates a copy of the page, adds it to the page tables of the process that caused the
fault, and decrements the usage counter on the original page by 1. These are the same steps performed
by the copy-on-write mechanism discussed in Chapter 4.
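
The three steps named above (copy the page, install the copy in the page tables, decrement the usage counter of the original) can be sketched as a userspace model. The names toy_page and toy_cow_copy are hypothetical, a plain pointer stands in for the page table entry, and the sketch covers only the copying case described here, not any other path do_wp_page may take.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 32

/* Hypothetical stand-in for a page with a usage counter. */
struct toy_page {
    int count;                  /* number of users of this page */
    char data[TOY_PAGE_SIZE];
};

/* Give the faulting process a private copy of the page behind *pte.
 * Returns the copy, or NULL if no memory is available (nothing changes then). */
struct toy_page *toy_cow_copy(struct toy_page **pte)
{
    struct toy_page *old, *copy;

    assert(pte != NULL && *pte != NULL);
    old = *pte;

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;

    memcpy(copy->data, old->data, TOY_PAGE_SIZE);   /* 1. create the copy */
    copy->count = 1;
    *pte = copy;                /* 2. the "page table" now maps the copy */
    old->count--;               /* 3. one user fewer on the original */
    return copy;
}
```

Writes through the updated entry change only the copy, and the original page stays intact for its remaining users.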

18.8.2 Reading the Data


Two functions read data from swap space into system RAM. read_swap_cache_async creates the
necessary preconditions and performs additional management tasks, and swap_readpage is responsible
for submitting the actual read request to the block layer. Figure 18-20 shows the code flow diagram for
read_swap_cache_async (assume that no errors occur during page allocation or because of race
conditions when reading in swapped-out pages).

find_get_page is first invoked to check whether the page is in the swap cache. This can be the case
because the readahead operations could have already provided the page. If the page is already
there, matters are simple: The desired page can be returned immediately.
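
The lookup-first structure can be modeled with a toy swap cache in userspace C. This is a sketch under assumptions, not the kernel's code flow: toy_swap_cache and the toy_* functions are hypothetical, an array indexed by slot number replaces the real cache, a fixed pool replaces page allocation, and the miss path (insert the page into the cache, lock it, count a submitted read) is a guess at the general shape based on the description of read_swap_cache_async and swap_readpage above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a page held in the swap cache. */
struct toy_page {
    bool locked;                /* locked while the read is in flight */
    int refs;                   /* references handed out to callers */
};

/* Hypothetical swap cache: one entry per swap slot. */
struct toy_swap_cache {
    struct toy_page *slot[TOY_SLOTS];   /* NULL means "not cached" */
    struct toy_page pool[TOY_SLOTS];    /* replaces real page allocation */
    int reads_submitted;                /* stands in for swap_readpage calls */
};

/* Lookup in the style of find_get_page: take a reference on a hit. */
struct toy_page *toy_find_get_page(struct toy_swap_cache *c, unsigned int offset)
{
    struct toy_page *page;

    assert(c != NULL && offset < TOY_SLOTS);
    page = c->slot[offset];
    if (page != NULL)
        page->refs++;
    return page;
}

/* Return the cached page if present; otherwise set up a page and start a read. */
struct toy_page *toy_read_swap_cache_async(struct toy_swap_cache *c,
                                           unsigned int offset)
{
    struct toy_page *page = toy_find_get_page(c, offset);

    if (page != NULL)
        return page;            /* hit: readahead (or an earlier call) was here */

    /* Miss (assumed flow): take a page, cache it, lock it, submit the read. */
    page = &c->pool[offset];
    page->refs = 1;
    page->locked = true;
    c->slot[offset] = page;
    c->reads_submitted++;
    return page;
}
```

Calling toy_read_swap_cache_async twice for the same slot submits only one read in this model; the second call is served from the cache.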