Linux Kernel Architecture

Chapter 16: Page and Buffer Cache

Implementation

The nodes of a radix tree are essentially represented by the following data structure:

<lib/radix_tree.c> #define RADIX_TREE_TAGS 2 #define RADIX_TREE_MAP_SHIFT (CONFIG_BASE_SMALL? 4 : 6) #define RADIX_TREE_MAP_SIZE (1UL << RADIX_TREE_MAP_SHIFT) #define RADIX_TREE_TAG_LONGS \ ((RADIX_TREE_MAP_SIZE + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct radix_tree_node { unsigned int height; /* Height from the bottom */ unsigned int count; struct rcu_head rcu_head; void *slots[RADIX_TREE_MAP_SIZE]; unsigned long tags[RADIX_TREE_TAGS][RADIX_TREE_TAG_LONGS]; };

The layout of this data structure is also very simple.slotsis an array ofvoidpointers that — depending on the level in which the node is located — point to either data elements or further nodes.countholds the number of used array entries in the node. The array is filled with entries starting at the top, and unused entries have null pointers.

Each tree node can point to 64 further nodes (or leaves) as indicated in the definition of theslotarray inradix_tree_node. The direct consequence of this definition is that each node may have only an array size that is a power of two. Also, the size ofallradix elements may only be defined at compilation time (of course, the maximum number of elements in a tree can change at run time). This behavior is rewarded by speed gains.

Tagging

The information discussed so far — the address space and the page tree — does not, however, allow the kernel to make a direct distinction between the clean and dirty pages of a mapping. This distinction is essential when, for example, pages are to be written back to store changes permanently on the underlying block device. Earlier kernel versions provided additional lists of dirty and clean pages inaddress_space. In principle, the kernel could, of course, scan the entire tree and filter out the pages with the appropriate state, but this is obviously very time-consuming. For this reason, each node of the radix tree includes additional tagging information that specifies whether each page in the node has the property specified in the tag. For example, the kernel uses a tag to label nodes with dirty pages. Nodes without this tag can therefore be skipped during a scan for dirty pages. This approach is a compromise between simple, unified data structures (no explicit lists are needed to hold pages with different states) and the option of performing a quick search for pages with specific properties. Currently, two tags are supported:

PAGECACHE_TAG_DIRTYspecifies whether a page is dirty.

PAGECACHE_TAG_WRITEBACKindicates that the page is being written back at the moment.

Linux Kernel Architecture

Chapter 16: Page and Buffer Cache

Get our desktop app

Company

Features

Documentation

Resources