Linux Kernel Architecture


Chapter 9: The Extended Filesystem Family


This approach permits the flexible storage of large and small files because the size of the area in which
data block pointers are stored can be varied dynamically as a function of the actual size of the file. The
inode itself is always of a fixed size, and additional data blocks needed for purposes of indirection are
allocated dynamically.

Let’s first take a look at the situation with a small file. The pointers stored directly in the inode are
sufficient to identify all data blocks, and the inode structure occupies little hard disk space because it
contains just a few pointers.

Indirection is used if the file is bigger and there aren’t enough primary pointers for all blocks. The
filesystem reserves a data block on the hard disk, not for file data but for additional block pointers. This
block is referred to as a single indirect block and can accept hundreds of additional block pointers (the
actual number varies according to the size of the block; Table 9-1 lists possible values for Ext2). The
inode must include a pointer to the first indirection block so that it can be accessed. Figure 9-4 shows that
in our example this pointer immediately follows the direct block pointers. The size of the inode always
remains constant; the space needed for the additional pointer block is of some consequence with larger
files but represents no additional overhead for small files.
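
The scheme described so far can be sketched in a few lines of C. This is an illustration only, not the
kernel's code: the constants N_DIRECT (12 direct pointers in the inode) and PTR_SIZE (4 bytes per block
pointer) are assumptions made for the example, and the function names are hypothetical.

```c
#define N_DIRECT 12u   /* assumption: number of direct pointers in the inode */
#define PTR_SIZE 4u    /* assumption: each block pointer is 32 bits wide */

/* Number of block pointers that fit into one indirection block. */
static unsigned long ptrs_per_block(unsigned long block_size)
{
    return block_size / PTR_SIZE;
}

/*
 * Indirection level needed to reach logical block n of a file:
 * 0 = direct, 1 = single, 2 = double, 3 = triple, -1 = not addressable.
 */
static int indirection_level(unsigned long long n, unsigned long block_size)
{
    unsigned long long p = ptrs_per_block(block_size);
    unsigned long long span = p;   /* data blocks covered by the current level */
    int level;

    if (n < N_DIRECT)
        return 0;
    n -= N_DIRECT;

    for (level = 1; level <= 3; level++) {
        if (n < span)
            return level;
        n -= span;     /* skip the blocks this level covers */
        span *= p;     /* each further level multiplies the reach by p */
    }
    return -1;
}
```

Under these assumptions a 1,024-byte block holds 256 pointers, so blocks 0 to 11 are reached directly,
blocks 12 to 267 through the single indirect block, and indirection_level(268, 1024) returns 2.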

Table 9-1: Block and File Sizes in the Second Extended Filesystem

Block size (bytes)    Maximum file size
1,024                 16 GiB
2,048                 256 GiB
4,096                 2 TiB
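
As a rough cross-check of the table, the following sketch adds up the data blocks reachable through direct,
single, double, and triple indirection. It again assumes 12 direct pointers and 4-byte block pointers, which
this section does not state; the function name is hypothetical.

```c
#include <stdint.h>

#define N_DIRECT 12u   /* assumption: number of direct pointers in the inode */
#define PTR_SIZE 4u    /* assumption: each block pointer is 32 bits wide */

/* Bytes addressable when all four levels of the pointer scheme are used. */
static uint64_t addressable_bytes(uint64_t block_size)
{
    uint64_t p = block_size / PTR_SIZE;                  /* pointers per block */
    uint64_t blocks = N_DIRECT + p + p * p + p * p * p;  /* direct..triple */

    return blocks * block_size;
}
```

With these assumptions the result is slightly more than 16 GiB for 1,024-byte blocks and slightly more than
256 GiB for 2,048-byte blocks, in line with the table. For 4,096-byte blocks the sketch yields roughly 4 TiB,
so the 2 TiB entry in the table evidently reflects a limit other than the pointer scheme, which this sketch
does not model.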

The further progress of indirection is evident from the illustration. Single indirection, too, eventually
reaches its limits as files get larger and larger. The next logical step is
therefore to use double indirection. Again, a hard disk block is reserved to store pointers to data blocks.
However, the blocks it points to do not store useful data but are themselves arrays that hold pointers to
other data blocks that, in turn, store the useful file data.

Using double indirection dramatically increases manageable space per file. If a data block holds pointers
to 1,000 other data blocks, double indirection enables 1,000×1,000 data blocks to be addressed. Of course,
the method has a downside because access to large files is more costly. The filesystem must first find the
address of the indirection block, read a further indirection entry, look for the relevant block, and find
the pointer to the data block address. There is therefore a trade-off between the ability to handle files of
varying sizes and the associated reduction in speed (the larger the file, the slower the speed).
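
The extra lookups can be made concrete with a small sketch that splits a logical block number into the two
table slots followed under double indirection. As before, 12 direct pointers and 4-byte block pointers are
assumptions, and struct dind_path and dind_lookup are hypothetical names used only for this example.

```c
#include <stdint.h>

#define N_DIRECT 12u   /* assumption: number of direct pointers in the inode */
#define PTR_SIZE 4u    /* assumption: each block pointer is 32 bits wide */

struct dind_path {
    uint64_t outer;   /* slot in the double indirect block */
    uint64_t inner;   /* slot in the single indirect block found there */
};

/*
 * Compute the slots to follow for logical block n.
 * Returns 0 on success, -1 if n is not in the double-indirect range.
 */
static int dind_lookup(uint64_t n, uint64_t block_size, struct dind_path *out)
{
    uint64_t p = block_size / PTR_SIZE;   /* pointers per block */
    uint64_t first = N_DIRECT + p;        /* first block needing double indirection */

    if (n < first || n - first >= p * p)
        return -1;

    n -= first;
    out->outer = n / p;   /* which second-level pointer block */
    out->inner = n % p;   /* which data block pointer inside it */
    return 0;
}
```

Each of the two slots corresponds to one additional block that must be read before the data block itself
can be fetched, which is the cost described above. With 1,024-byte blocks, logical block 527 maps to
outer slot 1 and inner slot 3 under these assumptions.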

As Figure 9-4 shows, double indirection is not the end of the road. The kernel offers triple indirection to
represent really gigantic files. This is an extension of the principle of single and double indirection and is
not discussed here.

Triple indirection takes maximum file size to such heights that other kernel-side problems crop up,
particularly on 32-bit architectures. Because the standard library uses long variables with a length of