Linux Kernel Architecture

Chapter 17: Data Synchronization


A set I_SYNC bit in the state element of the inode data structure indicates that the inode is already
being synchronized by another part of the kernel, and therefore cannot be modified at the moment in
the current path. If a regular writeback is active, this is not much of a problem: The kernel can simply
skip the inode and place it on the s_more_io list, which guarantees that it will be reconsidered some time
later. Before returning to the caller, do_writepages is used to write out some of the data associated with
the inode since this can do no harm.^4


The situation is more involved if a data integrity writeback is performed though. In this case, the kernel
does not skip the inode but sets up a wait queue (see Chapter 14) to wait until the inode is available again,
that is, until the I_SYNC bit is cleared. Notice that it is not sufficient to know that another part of the kernel
is already synchronizing the inode. This could be a regular writeback that does not guarantee that the
dirty data are actually written to disk. This is not what WB_SYNC_ALL is about: When the synchronization
pass completes, the kernel has to guarantee that all data have been synchronized, and waiting on the
inode is therefore essential.
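The two cases above can be condensed into a small decision function. The following is only a user-space sketch, not kernel code: the names toy_inode, TOY_I_SYNC, writeback_decide, and the action values are hypothetical stand-ins for the inode state bit, the sync mode, and the s_more_io list described in the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for WB_SYNC_NONE and WB_SYNC_ALL. */
enum sync_mode { SYNC_NONE, SYNC_ALL };

/* What the caller should do next with the inode. */
enum action { ACT_REQUEUE, ACT_WAIT_THEN_SYNC, ACT_SYNC_NOW };

#define TOY_I_SYNC 0x1u   /* illustrative value, not the kernel's */

struct toy_inode {
    unsigned int state;   /* plays the role of the inode state element */
    bool on_more_io;      /* models membership in the s_more_io list */
};

/* Decide how to treat an inode depending on its state and the sync mode. */
static enum action writeback_decide(struct toy_inode *inode, enum sync_mode mode)
{
    if (inode->state & TOY_I_SYNC) {
        if (mode != SYNC_ALL) {
            /* Regular writeback: skip the busy inode, retry it later. */
            inode->on_more_io = true;
            return ACT_REQUEUE;
        }
        /* Data integrity writeback: must wait until the bit is cleared. */
        return ACT_WAIT_THEN_SYNC;
    }
    /* Nobody else is synchronizing the inode: proceed directly. */
    return ACT_SYNC_NOW;
}
```

Only the regular writeback path touches the retry list; in the data integrity case the inode is never skipped, which mirrors the guarantee that WB_SYNC_ALL has to give.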


Once the inode is available, the job is passed on to __sync_single_inode. This extensive function writes
back the data associated with the inode and also the inode metadata. Figure 17-8 shows the code flow
diagram.


[Figure 17-8: Code flow diagram for __sync_single_inode. Node labels: Lock inode; do_writepages (a_ops->do_writepages or generic_write_pages); write_inode (s_op->write_inode); WB_SYNC_ALL set?; filemap_fdatawait; Unlock inode; Place inode on apt list; wake_up_inode; inode_sync_complete.]
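As a rough illustration of the flow in Figure 17-8, the sketch below replays the steps in user space and records them in a trace string. It is not the kernel implementation: toy_inode, TOY_I_SYNC, step, and toy_sync_single_inode are hypothetical names, the trace entries merely stand for the labelled steps, and the step order is an assumption based on the figure labels.

```c
#include <assert.h>
#include <string.h>

#define TOY_I_SYNC 0x1u   /* illustrative value, not the kernel's */

/* Hypothetical stand-ins for WB_SYNC_NONE and WB_SYNC_ALL. */
enum sync_mode { SYNC_NONE, SYNC_ALL };

struct toy_inode {
    unsigned int state;
    char trace[128];      /* records which steps ran, for illustration only */
};

/* Append a step name to the trace without overflowing the buffer. */
static void step(struct toy_inode *inode, const char *name)
{
    strncat(inode->trace, name,
            sizeof(inode->trace) - strlen(inode->trace) - 1);
}

static void toy_sync_single_inode(struct toy_inode *inode, enum sync_mode mode)
{
    /* The caller has already skipped or waited for a busy inode. */
    assert(!(inode->state & TOY_I_SYNC));

    inode->state |= TOY_I_SYNC;       /* lock the inode */
    step(inode, "writepages;");       /* synchronize the data */
    step(inode, "write_inode;");      /* synchronize the metadata */
    if (mode == SYNC_ALL)
        step(inode, "fdatawait;");    /* integrity sync waits for the data */
    inode->state &= ~TOY_I_SYNC;      /* unlock the inode */
    step(inode, "requeue;");          /* place inode on the appropriate list */
    step(inode, "wake;");             /* wake up anyone waiting on the inode */
}
```

The only difference between the two modes in this sketch is the extra wait step, which is what lets a data integrity writeback guarantee that the data have reached the disk.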


  1. First of all, the inode must be locked by setting the I_SYNC bit in the inode structure status
    field. This prevents other kernel threads from processing the inode.

  2. Synchronization of an inode consists of two parts: Synchronizing the data and synchronizing
    the metadata.


(^4) Actually, the call also does not have any benefit and will be removed in kernel 2.6.25, which was still under development when this
book was written. Since do_writepages is also called in __sync_single_inode, the call is superfluous.
