Linux Kernel Architecture

Chapter 13: System Calls

The processing of system calls is, of course, a classic situation in which the kernel is busy with the synchronous execution of a task assigned to it by an application. There are two reasons why the kernel has to access the address space of user applications:

❑ If a system call requires more than six different arguments, they can be passed only with the help of C structures that reside in process memory space. A pointer to the structures is passed to the system call by means of registers.
❑ Larger amounts of data generated as a side effect of a system call cannot be passed to the user process using the normal return mechanism. Instead, the data must be exchanged in defined memory areas. These must, of course, be located in userspace so that the user application is able to access them.
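The second case can be observed from the application side. The following is a minimal sketch that uses the uname system call, chosen here only as a convenient illustration (the text above does not name it): the caller supplies a pointer to a structure in its own address space, and the kernel fills that structure rather than returning the data through the return value. The helper name query_kernel_release is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Ask the kernel to fill a structure that lives in this process's
 * memory. Only a pointer to it is handed to the system call; the
 * return value just signals success (0) or failure (-1).
 */
static int query_kernel_release(char *out, size_t outlen)
{
    struct utsname info;    /* resides in userspace (on the stack) */

    if (uname(&info) != 0)
        return -1;

    /* The kernel has written its data into 'info' by now. */
    snprintf(out, outlen, "%s %s", info.sysname, info.release);
    return 0;
}
```

Called with a buffer of a few hundred bytes, the function leaves a short string with the system name and release in it; the several fields of struct utsname would not fit into a single return register.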

When the kernel accesses its own memory area, it can always be sure that there is a mapping between the virtual address and a physical memory page. The situation in userspace is different, as described in Chapter 3. Here, pages might be swapped out or not even be allocated.

The kernel therefore may not simply dereference userspace pointers but must instead employ specific functions to ensure that the desired area resides in RAM. To make sure that the kernel complies with this convention, userspace pointers are labeled with the __user attribute to support automated checking by C check tools.^9

Chapter 3 discusses the functions used to copy data between userspace and kernel space. In most cases, these will be copy_to_user and copy_from_user, but more variants are available.
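Because the kernel goes through such checked copy routines instead of dereferencing the pointer directly, a bad userspace pointer makes the system call fail with an error code rather than crashing the kernel. The following sketch shows this from the application side, assuming a Linux system: it hands write a buffer address inside a page the process may not read and reports the resulting errno. The helper name errno_for_bad_user_buffer and the use of a pipe as the write target are illustrative choices, not something taken from the text.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE     /* expose MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the errno produced by write() when its buffer is not
 * accessible, 0 if the write unexpectedly succeeded, and -1 if the
 * test setup itself failed.
 */
static int errno_for_bad_user_buffer(void)
{
    int fds[2];
    long page = sysconf(_SC_PAGESIZE);
    void *bad;
    ssize_t n;
    int err;

    if (page <= 0 || pipe(fds) != 0)
        return -1;

    /* A page with no access rights: every read of it faults. */
    bad = mmap(NULL, (size_t)page, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The kernel must copy from 'bad' and detects that it cannot. */
    errno = 0;
    n = write(fds[1], bad, 16);
    err = (n < 0) ? errno : 0;

    munmap(bad, (size_t)page);
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On Linux the expected result is EFAULT, the error the kernel reports when a copy from or to userspace cannot be completed.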

13.3.3 System Call Tracing

The strace tool, developed to trace the system calls of processes using the ptrace system call, is described in Section 13.1.1.

Implementation of the sys_ptrace handler routine is architecture-specific and is defined in arch/<arch>/kernel/ptrace.c. Fortunately, there are only minor differences between the code of the individual versions. I therefore provide a generalized description of how the routine works without going into architecture-specific details.

Before examining the flow of the system call in detail, it should be noted why this is necessary: ptrace, essentially a tool for reading and modifying values in process address space, cannot be used directly to trace system calls. Only by extracting the desired information at the right places can tracing processes draw conclusions about which system calls have been made. Even debuggers such as gdb rely entirely on ptrace for their implementation. ptrace offers more options than are really needed simply to trace system calls.

ptrace requires four arguments, as the definition in the kernel sources shows:^10

<syscalls.h>
asmlinkage long sys_ptrace(long request, long pid, long addr, long data);
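Seen from userspace, the same four arguments appear in the same order in the C library wrapper, ptrace(request, pid, addr, data). The following is a minimal sketch of how a tracer can use it to notice the system calls of a child process; it is not how strace itself is implemented. The function name count_syscall_stops and the choice of getppid as the traced call are illustrative assumptions, and PTRACE_O_TRACESYSGOOD is used so that system call stops can be told apart from ordinary signal stops.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that performs a system call and count the system call
 * stops (entries and exits) seen by the parent. Returns -1 on error.
 */
static long count_syscall_stops(void)
{
    int status;
    long stops = 0;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* request = PTRACE_TRACEME; pid, addr and data are unused. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);     /* stop until the parent resumes us */
        getppid();          /* a system call for the parent to observe */
        _exit(0);
    }

    /* Wait for the child's initial SIGSTOP. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* data carries the option bits; addr is ignored for this request. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
        return -1;

    for (;;) {
        /* Resume the child until its next system call entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0)
            return -1;
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        /* With TRACESYSGOOD, syscall stops report SIGTRAP | 0x80. */
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;
    }
    return stops;
}
```

Each traced system call normally produces one stop on entry and one on exit, so the count is at least 2 here; the exact number depends on which additional system calls the C library issues before the child terminates.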

(^9) Linus Torvalds himself designed the sparse tool to find direct userspace pointer dereferences in the kernel.
(^10) <syscalls.h> contains the prototypes for all architecture-independent system calls whose arguments are identical on all architectures.
