Linux Kernel Architecture

(Jacob Rumans) #1

Chapter 8: The Virtual Filesystem


This does not seem overly useful at first glance. What is a filesystem good for if it does not export
anything to userland? While files and directories are, indeed, one possible and without doubt useful
representation of the contents of a filesystem, they are not the only one. It is also perfectly valid to think
of a filesystem solely in terms of inodes! Files and directories are only a front end in this picture, and they
can be omitted without any loss of information.

Except visibility to userland. But this does not really concern the kernel. On some occasions, the need can
arise to internally group inodes together, and userland need not know anything about this. The kernel,
however, can benefit from organizing such collections in the form of filesystems because all standard
auxiliary functions that work for regular filesystems will automatically work for such collections as well.

Particular examples of pseudo-filesystems are bdev to manage inodes that represent block devices,
pipefs to handle pipes, and sockfs to deal with sockets. All appear in /proc/filesystems, but cannot
be mounted:

root@meitner # cat /proc/filesystems
...
nodev bdev
...
nodev sockfs
nodev pipefs
...
root@meitner # mount -t bdev bdev /mnt/bdev
mount: wrong fs type, bad option, bad superblock on bdev,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
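
The listing above can also be read programmatically. The following is a minimal userland sketch, not kernel code: the helper names parse_fs_line and list_nodev_filesystems are hypothetical, and the sketch assumes the usual line layout of /proc/filesystems (an optional nodev marker, a tab, then the filesystem name). Note that nodev only says that the filesystem needs no backing block device. It does not by itself mean that the filesystem cannot be mounted, so the output is a superset of the pseudo-filesystems discussed here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/filesystems (hypothetical helper).
 * Expected layout: "nodev\t<name>\n" or "\t<name>\n".
 * Copies the filesystem name into 'name' (truncating if needed).
 * Returns 1 if the line is marked nodev, 0 if it is not,
 * and -1 if no filesystem name could be found.
 */
int parse_fs_line(const char *line, char *name, size_t name_len)
{
    int nodev = 0;
    size_t i = 0;

    assert(line != NULL && name != NULL && name_len > 0);

    /* The marker must be followed by whitespace to count as "nodev". */
    if (strncmp(line, "nodev", 5) == 0 &&
        (line[5] == '\t' || line[5] == ' ')) {
        nodev = 1;
        line += 5;
    }

    /* Skip the separator between the marker column and the name. */
    while (*line == '\t' || *line == ' ')
        line++;

    /* Copy the name up to the next whitespace or end of line. */
    while (line[i] != '\0' && line[i] != '\n' &&
           line[i] != '\t' && line[i] != ' ' && i + 1 < name_len) {
        name[i] = line[i];
        i++;
    }
    name[i] = '\0';

    return i == 0 ? -1 : nodev;
}

/*
 * Print every nodev entry of /proc/filesystems to 'out'.
 * Returns the number of entries printed, or -1 if the file
 * cannot be opened (for example, on a system without procfs).
 */
int list_nodev_filesystems(FILE *out)
{
    char line[256];
    char name[64];
    int count = 0;
    FILE *fp = fopen("/proc/filesystems", "r");

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_fs_line(line, name, sizeof name) == 1) {
            fprintf(out, "%s\n", name);
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

Calling list_nodev_filesystems(stdout) from a small main function prints one name per line. On the system shown above, bdev, sockfs, and pipefs would be among them.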

The kernel provides the mount flag MS_NOUSER to prevent a filesystem from being mounted from userland.
Apart from this, all filesystem mechanisms work as discussed in this chapter. The kernel can mount a
pseudo-filesystem with kern_mount or kern_mount_data. This ends up in vfs_kern_mount, which integrates
the filesystem data into the VFS data structures.

When a filesystem is mounted from userland, do_kern_mount is not sufficient. Integration of the files
and directories into the user-visible representation is afterward handled by graft_tree. The method,
however, refuses to perform its job if the flag MS_NOUSER is set:

fs/namespace.c
static int graft_tree(struct vfsmount *mnt, struct nameidata *nd)
{
...
        if (mnt->mnt_sb->s_flags & MS_NOUSER)
                return -EINVAL;
...
}
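
The interplay between the two steps can be illustrated with a small userland model. This is only a sketch under stated assumptions: the toy_ structures and functions are hypothetical stand-ins that mirror the field names used above, and they are not the kernel's definitions. The model shows that the kernel-internal step succeeds regardless of the flag, while the grafting step rejects any superblock that carries MS_NOUSER.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Toy stand-in for MS_NOUSER. The kernel headers define the real
 * flag as bit 31; the TOY_ prefix marks this as part of the model.
 */
#define TOY_MS_NOUSER (1UL << 31)

/* Hypothetical, heavily reduced versions of the kernel structures. */
struct toy_super_block {
    unsigned long s_flags;
};

struct toy_vfsmount {
    struct toy_super_block *mnt_sb;
    int attached;   /* 1 once the mount is visible in the toy tree */
};

/*
 * Models the kernel-internal mount step: the superblock is tied to
 * the mount structure, and the flag plays no role here.
 */
int toy_kern_mount(struct toy_vfsmount *mnt, struct toy_super_block *sb)
{
    assert(mnt != NULL && sb != NULL);
    mnt->mnt_sb = sb;
    mnt->attached = 0;
    return 0;
}

/*
 * Models the check in graft_tree: a mount whose superblock carries
 * the no-user flag is never attached to the user-visible tree.
 */
int toy_graft_tree(struct toy_vfsmount *mnt)
{
    assert(mnt != NULL && mnt->mnt_sb != NULL);
    if (mnt->mnt_sb->s_flags & TOY_MS_NOUSER)
        return -EINVAL;
    mnt->attached = 1;
    return 0;
}
```

For a superblock with TOY_MS_NOUSER set, toy_kern_mount returns 0 while toy_graft_tree returns -EINVAL and leaves the mount unattached. This mirrors how a pseudo-filesystem stays usable inside the kernel although a mount request from userland fails.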

Nevertheless, the structure and contents of the pseudo-filesystem are available to the kernel. The
filesystem library provides some means to write pseudo-filesystems with little effort, and I will come
back to this in Section 10.2.4.