Linux Kernel Architecture

Chapter 18: Page Reclaim and Swapping

pagevec_add returns the number of elements that are still free after the new page has been added. __pagevec_lru_add is invoked if 0 is returned, which indicates that the page vector is completely full after addition of the last element. This function places all pages in the page vector on the inactive lists of the zones to which the individual pages belong (the pages may all be associated with different zones). The PG_lru bit is set for each page because the pages are now contained on an LRU list. The contents of the page vector are then deleted to make space for new pages in the cache.

If there are still free elements in the per-CPU list after pagevec_add added a page, the page instance is in the page vector, but not yet on one of the system’s LRU lists.
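The batching behavior described above can be sketched in user space. This is a simplified model, not the kernel's code: the vector size of 14, the flag value SK_PG_LRU, and the helper names sk_flush_to_inactive and sk_lru_cache_add are assumptions made for illustration, and zone handling and locking are left out entirely.

```c
#include <assert.h>

#define PAGEVEC_SIZE 14          /* assumed size; the real value depends on the kernel version */
#define SK_PG_LRU    (1UL << 0)  /* hypothetical stand-in for the PG_lru bit */

struct page {
    unsigned long flags;
};

struct pagevec {
    unsigned int nr;                  /* number of pages currently buffered */
    struct page *pages[PAGEVEC_SIZE];
};

/* Number of slots that are still free in the vector. */
static inline unsigned int pagevec_space(const struct pagevec *pvec)
{
    return PAGEVEC_SIZE - pvec->nr;
}

/* Add a page and report how many slots remain free afterwards. */
static inline unsigned int pagevec_add(struct pagevec *pvec, struct page *page)
{
    assert(pvec->nr < PAGEVEC_SIZE);
    pvec->pages[pvec->nr++] = page;
    return pagevec_space(pvec);
}

/* Hypothetical stand-in for __pagevec_lru_add: mark every buffered page
 * as being on an LRU list, then empty the vector. */
static inline void sk_flush_to_inactive(struct pagevec *pvec)
{
    for (unsigned int i = 0; i < pvec->nr; i++)
        pvec->pages[i]->flags |= SK_PG_LRU;
    pvec->nr = 0;
}

/* Buffer the page; transfer the whole batch only when the vector is full. */
static inline void sk_lru_cache_add(struct pagevec *pvec, struct page *page)
{
    if (pagevec_add(pvec, page) == 0)
        sk_flush_to_inactive(pvec);
}
```

Until the last free slot is used, a buffered page carries no LRU flag in this model, which mirrors the state in which a page is in the page vector but not yet on an LRU list.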

lru_cache_add_active works in exactly the same way as lru_cache_add but is used for active rather than inactive pages. It uses lru_add_pvecs_active as a buffer. When pages are transferred from the buffer to the active list, not just the PG_lru bit, but additionally the PG_active bit, is set.
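The difference between the two transfer paths comes down to which flags are set when a page leaves the buffer. The following minimal sketch shows only that step; the flag values and the helper sk_mark_on_lru are hypothetical and not taken from the kernel sources.

```c
#include <assert.h>

#define SK_PG_LRU    (1UL << 0)  /* hypothetical stand-in for PG_lru */
#define SK_PG_ACTIVE (1UL << 1)  /* hypothetical stand-in for PG_active */

struct sk_page {
    unsigned long flags;
};

/* Mark a page as it moves from a per-CPU buffer to an LRU list.
 * 'active' selects the active list; otherwise the inactive list is meant. */
static inline void sk_mark_on_lru(struct sk_page *page, int active)
{
    assert(!(page->flags & SK_PG_LRU));  /* the page must not be on a list yet */
    page->flags |= SK_PG_LRU;
    if (active)
        page->flags |= SK_PG_ACTIVE;
}
```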

lru_cache_add is required only in add_to_page_cache_lru from mm/filemap.c, which adds a page to both the page cache and the LRU cache. This is, however, the standard function to introduce a new page both into the page cache and the LRU list. Most importantly, it is used by mpage_readpages and do_generic_mapping_read, the standard functions in which the block layer ends up when reading data from a file or mapping.

Usually a page is first regarded as inactive and has to earn its merits to be considered active. However, a selected number of procedures have a high opinion of their pages and invoke lru_cache_add_active to place pages directly on the zone’s active list^9:

❑ read_swap_cache_async from mm/swap_state.c; this reads pages from the swap cache.
❑ The page fault handlers __do_fault, do_anonymous_page, do_wp_page, and do_no_page; these are implemented in mm/memory.c.

Understanding what is required to be promoted from an inactive to an active page is the subject of the next section. This is directly related to operations that move pages from the active to the inactive list and vice versa. Before these operations can be performed, it is necessary that the kernel transfer all pages from the per-CPU LRU caches to the global lists; otherwise, pages could be missed by the page-moving logic. The auxiliary function lru_add_drain is provided for this purpose.
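Why draining matters can be illustrated with a small model in which each CPU holds one buffer for inactive and one for active pages, and a zone only counts the pages on its two lists. All names here (sk_vec, sk_cpu_cache, sk_zone, sk_drain_cpu, sk_drain_all) are hypothetical, and the sketch ignores locking and the actual page pointers.

```c
#include <assert.h>

#define SK_VEC_SIZE 14  /* assumed buffer capacity */

struct sk_vec {
    unsigned int nr;    /* number of buffered pages; the pointers are omitted */
};

struct sk_cpu_cache {
    struct sk_vec inactive;  /* pages waiting for the inactive list */
    struct sk_vec active;    /* pages waiting for the active list */
};

struct sk_zone {
    unsigned int nr_inactive;
    unsigned int nr_active;
};

/* Move every buffered page to the given list counter and empty the buffer. */
static inline void sk_flush(struct sk_vec *vec, unsigned int *list_count)
{
    assert(vec->nr <= SK_VEC_SIZE);
    *list_count += vec->nr;
    vec->nr = 0;
}

/* Drain both buffers of one CPU, even if they are only partially filled. */
static inline void sk_drain_cpu(struct sk_cpu_cache *cache, struct sk_zone *zone)
{
    sk_flush(&cache->inactive, &zone->nr_inactive);
    sk_flush(&cache->active, &zone->nr_active);
}

/* Drain the buffers of every CPU in the model. */
static inline void sk_drain_all(struct sk_cpu_cache *caches, unsigned int ncpus,
                                struct sk_zone *zone)
{
    for (unsigned int i = 0; i < ncpus; i++)
        sk_drain_cpu(&caches[i], zone);
}
```

After a drain, every page that was only buffered is visible on a global list in this model, so code that walks the lists cannot miss it.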

Finally, Figure 18-12 summarizes the movements between the different lists graphically.

18.6.3 Determining Page Activity

The kernel must track not only whether a page is actually used by one or more processes, but also how often it is accessed in order to assess its importance. As only very few architectures support a direct access counter for memory pages, the kernel must resort to other means and has therefore introduced two page flags named referenced and active. The corresponding bit values are PG_referenced and PG_active, and the usual set of macros as discussed in Section 3.2.2 is available to set or retrieve the state. Recall that, for instance, PageReferenced checks the PG_referenced bit, while SetPageActive sets the PG_active bit.
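The flag macros mentioned above follow a simple test, set, and clear pattern. The sketch below imitates that pattern in user space; the bit positions are illustrative assumptions, sk_mask is a hypothetical helper, and the operations are not atomic, whereas the kernel relies on atomic bit operations.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative bit positions; the real values depend on the kernel version. */
enum { PG_referenced = 2, PG_active = 6 };

struct page {
    unsigned long flags;
};

/* Hypothetical helper: build the mask for one flag bit. */
static inline unsigned long sk_mask(unsigned int bit)
{
    assert(bit < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << bit;
}

#define PageReferenced(page)      (((page)->flags & sk_mask(PG_referenced)) != 0)
#define SetPageReferenced(page)   ((void)((page)->flags |= sk_mask(PG_referenced)))
#define ClearPageReferenced(page) ((void)((page)->flags &= ~sk_mask(PG_referenced)))

#define PageActive(page)          (((page)->flags & sk_mask(PG_active)) != 0)
#define SetPageActive(page)       ((void)((page)->flags |= sk_mask(PG_active)))
#define ClearPageActive(page)     ((void)((page)->flags &= ~sk_mask(PG_active)))
```

Because the two bits are independent, a page can be referenced without being active and vice versa, which gives four distinguishable states.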

Why are two flags used for the page state? Suppose that only a single flag were used to determine page activity; PG_active would lend itself to that rather well. When the page is accessed, the flag is

(^9) The page migration code for NUMA systems, which is otherwise not covered in this book, is also a user of the function.
