Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 8: The Virtual Filesystem


Most filesystems include the do_sync_read^24 and do_sync_write standard routines in the read and
write pointers of their file_operations instance.
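
As a rough illustration of this wiring, the following user-space sketch models an operations table whose read and write pointers refer to shared library routines. All types and names here (model_sync_read, model_sync_write, examplefs_file_operations, model_vfs_read, model_vfs_write) are hypothetical simplifications chosen for the example; they only mimic the kernel's naming and are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* User-space model only: simplified stand-ins, not the kernel's structures. */
struct file;

struct file_operations {
    ssize_t (*read)(struct file *filp, char *buf, size_t len, off_t *ppos);
    ssize_t (*write)(struct file *filp, const char *buf, size_t len, off_t *ppos);
};

struct file {
    const struct file_operations *f_op;
    char data[64];   /* stand-in for the file contents */
    size_t size;     /* number of valid bytes in data */
};

/* Stand-in for a generic read routine shared by many filesystems. */
static ssize_t model_sync_read(struct file *filp, char *buf, size_t len, off_t *ppos)
{
    size_t pos = (size_t)*ppos;

    if (pos >= filp->size)
        return 0;                       /* at or past end of file */
    if (len > filp->size - pos)
        len = filp->size - pos;         /* clamp to the available data */
    memcpy(buf, filp->data + pos, len);
    *ppos += (off_t)len;
    return (ssize_t)len;
}

/* Stand-in for a generic write routine shared by many filesystems. */
static ssize_t model_sync_write(struct file *filp, const char *buf, size_t len, off_t *ppos)
{
    size_t pos = (size_t)*ppos;

    if (pos >= sizeof(filp->data))
        return -1;                      /* no room left in the model file */
    if (len > sizeof(filp->data) - pos)
        len = sizeof(filp->data) - pos; /* clamp to the remaining capacity */
    memcpy(filp->data + pos, buf, len);
    *ppos += (off_t)len;
    if (pos + len > filp->size)
        filp->size = pos + len;
    return (ssize_t)len;
}

/* A hypothetical filesystem reuses the shared routines instead of writing its own. */
static const struct file_operations examplefs_file_operations = {
    .read  = model_sync_read,
    .write = model_sync_write,
};

/* Callers dispatch through the function pointers, never to a routine directly. */
static ssize_t model_vfs_read(struct file *filp, char *buf, size_t len, off_t *ppos)
{
    return filp->f_op->read(filp, buf, len, ppos);
}

static ssize_t model_vfs_write(struct file *filp, const char *buf, size_t len, off_t *ppos)
{
    return filp->f_op->write(filp, buf, len, ppos);
}
```

The point of the indirection is that the caller only ever sees the operations table, so a filesystem can adopt the shared routines by filling in two pointers.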

The routines are closely tied to other kernel subsystems (the block layer and the page cache in
particular) and must also handle many potential flags and special situations. As a result, their
implementation is not always a model of clarity (a comment in the kernel reads: this is really
ugly...). For this reason, I examine slightly simplified versions below; these focus on the main path
usually traversed so that important aspects are not obscured by a wealth of details. Nevertheless, I have
still found it necessary to make many references to routines in other chapters (and in other subsystems).

8.5.1 Generic Read Routine


do_sync_read is the library routine used by almost all filesystems to read data. It reads data
synchronously; in other words, it guarantees that the desired data are in memory when the function returns
to the caller. This is achieved by delegating the actual read operation to an asynchronous routine and
waiting until it ends. Slightly simplified, the function is implemented as follows:

fs/read_write.c
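ssize_t do_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos)
{
        struct iovec iov = { .iov_base = buf, .iov_len = len };
        struct kiocb kiocb;
        ssize_t ret;

        /* Set up a control block for a synchronous request on filp. */
        init_sync_kiocb(&kiocb, filp);
        kiocb.ki_pos = *ppos;
        kiocb.ki_left = len;

        /* Hand the request to the filesystem's asynchronous read routine. */
        ret = filp->f_op->aio_read(&kiocb, &iov, 1, kiocb.ki_pos);

        /* If the request was only queued, wait until it has completed. */
        if (-EIOCBQUEUED == ret)
                ret = wait_on_sync_kiocb(&kiocb);
        *ppos = kiocb.ki_pos;
        return ret;
}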

init_sync_kiocb initializes a kiocb instance that controls the asynchronous I/O operation; its detailed
contents are of little interest here.^25 The real work is delegated to the filesystem-specific asynchronous
read operation that is stored in aio_read of struct file_operations. Usually generic_file_aio_read,
which I discuss shortly, is used. However, the routine performs work asynchronously, so there is no
guarantee that the data have already been read when the routine returns to the caller.
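
The pattern of building a synchronous call on top of an asynchronous one can be sketched in user space. The example below is only a model under stated assumptions: model_kiocb, model_aio_read, model_wait_on_sync_kiocb, model_do_sync_read, and MODEL_EIOCBQUEUED are hypothetical names, the "file" is a plain memory buffer, and the wait step completes the request itself, whereas the kernel would put the caller to sleep until the operation is reported as finished.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Model-only return code meaning "request queued, result not yet available". */
#define MODEL_EIOCBQUEUED 529

/* Hypothetical control block: describes one pending request. */
struct model_kiocb {
    const char *src;   /* backing data standing in for the file */
    size_t src_len;
    off_t pos;         /* current file position */
    char *buf;         /* destination buffer of the pending request */
    size_t len;        /* number of bytes requested */
    int queued;        /* nonzero while the request is pending */
};

/* "Asynchronous" read: records the request and returns without copying data. */
static ssize_t model_aio_read(struct model_kiocb *iocb, char *buf, size_t len)
{
    iocb->buf = buf;
    iocb->len = len;
    iocb->queued = 1;
    return -MODEL_EIOCBQUEUED;
}

/* Completes the pending request and returns the number of bytes read.
 * In this model the work is done here instead of sleeping until a
 * separate context signals completion. */
static ssize_t model_wait_on_sync_kiocb(struct model_kiocb *iocb)
{
    size_t pos = (size_t)iocb->pos;
    size_t n = 0;

    if (pos < iocb->src_len) {
        n = iocb->src_len - pos;
        if (n > iocb->len)
            n = iocb->len;
        memcpy(iocb->buf, iocb->src + pos, n);
    }
    iocb->pos += (off_t)n;
    iocb->queued = 0;
    return (ssize_t)n;
}

/* Synchronous wrapper: submit, wait if the request was only queued,
 * then publish the updated position to the caller. */
static ssize_t model_do_sync_read(const char *src, size_t src_len,
                                  char *buf, size_t len, off_t *ppos)
{
    struct model_kiocb kiocb = { .src = src, .src_len = src_len, .pos = *ppos };
    ssize_t ret;

    ret = model_aio_read(&kiocb, buf, len);
    if (ret == -MODEL_EIOCBQUEUED)
        ret = model_wait_on_sync_kiocb(&kiocb);
    *ppos = kiocb.pos;
    return ret;
}
```

Calling model_do_sync_read repeatedly with the same position variable walks through the buffer in chunks; from the caller's point of view the data are always present in buf when the call returns, even though the underlying read routine on its own gives no such guarantee.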

(^24) In former kernel versions, the standard read and write operations used to be generic_file_read and
generic_file_write, but they have been replaced by the variants I am about to discuss.
(^25) Asynchronous I/O operations are used to submit a read or write request to the kernel. These requests are not satisfied immediately
but are queued in a list. The code flow then returns immediately to the calling function (in contrast to the regular I/O operations
implemented here). In this case, the calling function has the impression that the result is returned immediately because it does not
notice the delay involved in performing the operation. The data can be queried later after the request has been dealt with
asynchronously.
Asynchronous operations are not performed with file handles but with I/O control blocks. Consequently, an instance of the
corresponding data type must first be generated with init_sync_kiocb. Currently, asynchronous I/O is used by only a few
applications (e.g., large databases), so it’s not worth going into the details.
