Correspondingly, for input, the kernel reads data from the disk and stores it in
a kernel buffer. Calls to read() fetch data from this buffer until it is exhausted, at
which point the kernel reads the next segment of the file into the buffer cache.
(This is a simplification; for sequential file access, the kernel typically performs
read-ahead to try to ensure that the next blocks of a file are read into the buffer
cache before the reading process requires them. We say a bit more about read-
ahead in Section 13.5.)
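
The following is a minimal sketch of this behavior; the 4096-byte chunk size
and the byte-counting output are arbitrary choices for the example. Each read()
is satisfied from the buffer cache, so a second run of the program on the same
file is typically much faster, since the file's blocks are then already cached.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(int argc, char *argv[])
    {
        char buf[4096];             /* Arbitrary chunk size for this sketch */
        ssize_t numRead;
        long long total = 0;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s file\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }

        /* Each read() copies data from the kernel buffer cache; the
           kernel refills the cache from disk (often ahead of time). */
        while ((numRead = read(fd, buf, sizeof(buf))) > 0)
            total += numRead;
        if (numRead == -1) {
            perror("read");
            exit(EXIT_FAILURE);
        }

        printf("Read %lld bytes\n", total);
        close(fd);
        exit(EXIT_SUCCESS);
    }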
The aim of this design is to allow read() and write() to be fast, since they don’t
need to wait on a (slow) disk operation. This design is also efficient, since it reduces
the number of disk transfers that the kernel must perform.
The Linux kernel imposes no fixed upper limit on the size of the buffer cache.
The kernel will allocate as many buffer cache pages as are required, limited only by
the amount of available physical memory and the demands for physical memory
for other purposes (e.g., holding the text and data pages required by running
processes). If available memory is scarce, then the kernel flushes some modified buffer
cache pages to disk, in order to free those pages for reuse.
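
The distinction between writing to the buffer cache and writing to the disk can
be made explicit with fsync(), described later in this chapter. The following
sketch (the output file name, out.tmp, is an arbitrary choice) shows the pattern:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int
    main(void)
    {
        int fd = open("out.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }

        const char *msg = "hello, buffer cache\n";

        /* write() returns as soon as the data has been copied into the
           kernel buffer cache; the disk write happens later. */
        if (write(fd, msg, strlen(msg)) == -1) {
            perror("write");
            exit(EXIT_FAILURE);
        }

        /* fsync() doesn't return until the buffered data (and the file's
           metadata) have been flushed to the disk. */
        if (fsync(fd) == -1) {
            perror("fsync");
            exit(EXIT_FAILURE);
        }

        close(fd);
        exit(EXIT_SUCCESS);
    }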

Speaking more precisely, from kernel 2.4 onward, Linux no longer maintains
a separate buffer cache. Instead, file I/O buffers are included in the page
cache, which, for example, also contains pages from memory-mapped files.
Nevertheless, in the discussion in the main text, we use the term buffer cache,
since that term is historically common on UNIX implementations.
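
For example, the pages of a file mapped with mmap() and the pages used to
satisfy read() calls on the same file both live in this single page cache. The
following sketch (a byte-summing example invented for illustration) maps a file
read-only and touches each page of the mapping:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int
    main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "Usage: %s file\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }

        struct stat sb;
        if (fstat(fd, &sb) == -1) {
            perror("fstat");
            exit(EXIT_FAILURE);
        }
        if (sb.st_size == 0)            /* Can't mmap() an empty file */
            exit(EXIT_SUCCESS);

        /* The mapping is backed by page-cache pages for this file */
        char *addr = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (addr == MAP_FAILED) {
            perror("mmap");
            exit(EXIT_FAILURE);
        }

        /* Touching the mapping faults the pages in from the page cache */
        unsigned long sum = 0;
        for (off_t j = 0; j < sb.st_size; j++)
            sum += (unsigned char) addr[j];
        printf("Byte sum: %lu\n", sum);

        munmap(addr, sb.st_size);
        close(fd);
        exit(EXIT_SUCCESS);
    }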

Effect of buffer size on I/O system call performance
The kernel performs the same number of disk accesses, regardless of whether we
perform 1000 writes of a single byte or a single write of 1000 bytes. However, the
latter is preferable, since it requires a single system call, while the former requires
1000. Although much faster than disk operations, system calls nevertheless take an
appreciable amount of time, since the kernel must trap the call, check the validity
of the system call arguments, and transfer data between user space and kernel
space (refer to Section 3.1 for further details).
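
The difference is easy to observe. The following sketch (the output file names
are arbitrary) writes 1000 bytes one byte at a time and then again as a single
block; running it under strace -c makes the thousandfold difference in write()
invocations visible:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define NBYTES 1000

    int
    main(void)
    {
        char buf[NBYTES] = { 0 };

        /* 1000 system calls: one write() per byte */
        int fd = open("bytewise.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }
        for (int j = 0; j < NBYTES; j++)
            if (write(fd, &buf[j], 1) != 1) {
                perror("write");
                exit(EXIT_FAILURE);
            }
        close(fd);

        /* 1 system call: a single write() covering all 1000 bytes */
        fd = open("blockwise.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }
        if (write(fd, buf, NBYTES) != NBYTES) {
            perror("write");
            exit(EXIT_FAILURE);
        }
        close(fd);

        exit(EXIT_SUCCESS);
    }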
The impact of performing file I/O using different buffer sizes can be seen by
running the program in Listing 4-1 (on page 71) with different BUF_SIZE values.
(The BUF_SIZE constant specifies how many bytes are transferred by each call to
read() and write(); a copy loop in the spirit of that listing is sketched after the
list below.) Table 13-1 shows the time that this program requires to copy a file
of 100 million bytes on a Linux ext2 file system using different BUF_SIZE values.
Note the following points concerning the information in this table:


•   The Elapsed and Total CPU time columns have the obvious meanings. The User
    CPU and System CPU columns show a breakdown of the Total CPU time into,
    respectively, the time spent executing code in user mode and the time spent
    executing kernel code (i.e., system calls).
•   The tests shown in the table were performed using a vanilla 2.6.30 kernel on an
    ext2 file system with a block size of 4096 bytes.

When we talk about a vanilla kernel, we mean an unpatched mainline kernel.
This is in contrast to kernels that are supplied by most distributors, which
often include various patches to fix bugs or add features.
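
The following is a minimal copy loop in the spirit of Listing 4-1 (a sketch for
experimentation, not the book's listing). BUF_SIZE can be overridden at compile
time, so that runs with different buffer sizes, in the manner of Table 13-1, can
be compared using time(1):

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #ifndef BUF_SIZE            /* Allow overriding at compile time, e.g.,
                                   cc -DBUF_SIZE=65536 -o copy copy.c */
    #define BUF_SIZE 1024
    #endif

    int
    main(int argc, char *argv[])
    {
        char buf[BUF_SIZE];
        ssize_t numRead;

        if (argc != 3) {
            fprintf(stderr, "Usage: %s src dest\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        int inFd = open(argv[1], O_RDONLY);
        if (inFd == -1) {
            perror("open src");
            exit(EXIT_FAILURE);
        }

        int outFd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (outFd == -1) {
            perror("open dest");
            exit(EXIT_FAILURE);
        }

        /* Each iteration transfers up to BUF_SIZE bytes; a larger
           BUF_SIZE means fewer system calls for the same file. */
        while ((numRead = read(inFd, buf, BUF_SIZE)) > 0)
            if (write(outFd, buf, numRead) != numRead) {
                perror("write");
                exit(EXIT_FAILURE);
            }
        if (numRead == -1) {
            perror("read");
            exit(EXIT_FAILURE);
        }

        if (close(inFd) == -1 || close(outFd) == -1) {
            perror("close");
            exit(EXIT_FAILURE);
        }
        exit(EXIT_SUCCESS);
    }

For example: cc -DBUF_SIZE=65536 -o copy copy.c, then time ./copy src dest.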