Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 18: Page Reclaim and Swapping


These lists can become extremely long. The two sample swap files, each of which contains around
16,000 pages, consist of, for example, 37 or even 76 block groups. The second requirement of the
extent mechanism, high search speed, is not always met by doubly linked lists because they may well
comprise hundreds of entries. Scanning through such a list each time the swap area is accessed
would, of course, be extremely time-consuming.

The solution is relatively simple. An additional element called curr_swap_extent in swap_info_struct
is used to hold a pointer to the last element accessed in the extent list. Each new search starts from this
element. As access is often made to consecutive page slots, the searched block is generally found in this
or the next extent element.^2

If the search by the kernel is not immediately successful, the entire extent list must be scanned element-
by-element until the entry for the required block is found.
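
The cached-pointer search described above can be sketched in userspace C. This is only an illustration under simplifying assumptions, not the kernel's code: the types struct extent and struct swap_area and the function map_slot are hypothetical stand-ins, the list is singly linked and circular for brevity, and only the curr field mirrors the role of curr_swap_extent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an extent: a run of nr_pages consecutive
 * page slots that maps to consecutive disk blocks. */
struct extent {
    unsigned long start_page;   /* first page slot covered */
    unsigned long nr_pages;     /* number of slots in this extent */
    unsigned long start_block;  /* disk block of the first slot */
    struct extent *next;        /* circular list, singly linked for brevity */
};

struct swap_area {
    struct extent *curr;        /* plays the role of curr_swap_extent */
};

/* Map a page slot to a block number, starting at the cached extent.
 * Returns 0 and stores the block in *block on success; returns -1 if
 * no extent covers the slot after one full pass around the list. */
int map_slot(struct swap_area *sa, unsigned long slot, unsigned long *block)
{
    struct extent *start = sa->curr;
    struct extent *se = start;

    assert(start != NULL && block != NULL);
    do {
        if (se->start_page <= slot && slot < se->start_page + se->nr_pages) {
            sa->curr = se;      /* remember the hit for the next lookup */
            *block = se->start_block + (slot - se->start_page);
            return 0;
        }
        se = se->next;          /* not here: try the following extent */
    } while (se != start);
    return -1;
}
```

With consecutive slot accesses, the loop body usually matches on the first or second iteration, which is the effect the cached pointer is meant to achieve. A full pass around the list is needed only when the requested slot lies far from the previous one.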

18.3.2 Creating a Swap Area


New swap partitions are not created directly by the kernel itself. This task is delegated to a userspace
tool (mkswap) whose sources are in the util-linux-ng tool collection. Since creating a swap area is a
mandatory step that must be performed before swap memory can be used, let’s briefly analyze the mode
of operation of this utility.

The kernel need not provide any new system calls to support the creation of swap areas — after all,
it also does not provide any system calls to create regular filesystems, and this is clearly not a kernel
problem. The existing call variants for direct communication with block devices (or, in the case of a swap
file, with a file on a block device) are quite sufficient to organize the swap area in accordance with kernel
requirements.

mkswap requires just one argument: the name of the device file of the partition or file in which the swap
area is to be created.^3 The following actions are performed:

❑ The size of the required swap area is divided by the page size of the machine concerned in order
to determine how many page frames can be accommodated.
❑ The blocks of the swap area are checked individually for read or write errors in order to find
defective areas. As the machine’s page size is used as the block size for swap areas, a defective
block always means that the swap area’s capacity is reduced by one page.
❑ A list with the addresses of all defective blocks is written to the first page of the swap area.
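
The first two steps above can be sketched as follows. This is a minimal illustration, not the mkswap implementation: all names (swap_page_count, collect_bad_pages, page_probe_fn, struct bad_set, probe_from_set) are hypothetical, and the probe only consults a fixed list of defective pages, whereas a real tool would perform read or write I/O on the device.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole pages that fit into an area of area_bytes bytes. */
unsigned long swap_page_count(unsigned long long area_bytes,
                              unsigned long page_size)
{
    assert(page_size > 0);
    return (unsigned long)(area_bytes / page_size);
}

/* Hypothetical probe: returns nonzero if page `index` is usable. */
typedef int (*page_probe_fn)(unsigned long index, void *ctx);

/* Demo probe for illustration only: treats the pages listed in a
 * bad_set as defective instead of touching a real device. */
struct bad_set {
    const unsigned long *idx;
    size_t n;
};

int probe_from_set(unsigned long index, void *ctx)
{
    const struct bad_set *s = ctx;

    for (size_t i = 0; i < s->n; i++)
        if (s->idx[i] == index)
            return 0;           /* listed as defective */
    return 1;
}

/* Probe every page and record the indices of defective ones in bad[].
 * Returns the number of defective pages, or -1 if more than max_bad
 * were found (the list would no longer fit into the space reserved). */
long collect_bad_pages(unsigned long nr_pages, page_probe_fn probe, void *ctx,
                       unsigned long *bad, size_t max_bad)
{
    size_t found = 0;

    for (unsigned long i = 0; i < nr_pages; i++) {
        if (probe(i, ctx))
            continue;           /* page is fine */
        if (found == max_bad)
            return -1;
        bad[found++] = i;       /* capacity shrinks by one page */
    }
    return (long)found;
}
```

Because the block size equals the page size, the usable capacity is simply the page count minus the number of entries collected in bad[]. Writing that list into the first page of the swap area is left out of the sketch.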

(^2) A comment in the kernel sources notes that measurements have demonstrated that on average only 0.3 list operations are, in fact,
needed to create a mapping between a page slot and a block number.
(^3) Other parameters such as the explicit size of the swap area or the page size can be specified. However, in most cases, this is
pointless because these data can be calculated automatically and reliably. The authors of mkswap do not have a high opinion of users
who make their own explicit specifications, as the source code shows:
    if (block_count) {
        /* this silly user specified the number of blocks explicitly */
        ...
    }
