Linux Kernel Architecture


Chapter 4: Virtual Process Memory


The get_unmapped_area function described in Section 4.5.4 is first invoked to find a suitable area for
the mapping in the virtual address space. Recall that the application may specify a fixed address for the
mapping, suggest an address, or leave the choice of address to the kernel.

calc_vm_prot_bits and calc_vm_flag_bits combine the flags and access permission constants specified
in the system call into a joint flag set that is easier to handle in the subsequent operations (the MAP_ and
PROT_ flags are "translated" into flags with the prefix VM_).

mm/mmap.c
vm_flags = calc_vm_prot_bits(prot) | calc_vm_flag_bits(flags) |
mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
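
As a rough illustration of this translation step, the following user-space sketch mimics the idea behind
calc_vm_prot_bits and calc_vm_flag_bits. The TOY_-prefixed names and their bit values are assumptions made
for this example and are not taken from the kernel headers; only the PROT_ and MAP_ constants come from
<sys/mman.h>. Only a subset of the mapping flags is handled.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes MAP_GROWSDOWN and MAP_LOCKED on glibc */
#endif
#include <sys/mman.h>

/* Hypothetical VM_-style bits; the values are illustrative assumptions. */
#define TOY_VM_READ      0x0001UL
#define TOY_VM_WRITE     0x0002UL
#define TOY_VM_EXEC      0x0004UL
#define TOY_VM_MAYREAD   0x0010UL
#define TOY_VM_MAYWRITE  0x0020UL
#define TOY_VM_MAYEXEC   0x0040UL
#define TOY_VM_GROWSDOWN 0x0100UL
#define TOY_VM_LOCKED    0x2000UL

/* Return `to` if bit `from` is set in x, otherwise 0. */
unsigned long toy_translate_bit(unsigned long x, unsigned long from,
                                unsigned long to)
{
    return (x & from) ? to : 0UL;
}

/* Translate PROT_ access permissions into toy VM_ bits. */
unsigned long toy_calc_vm_prot_bits(unsigned long prot)
{
    return toy_translate_bit(prot, PROT_READ,  TOY_VM_READ)  |
           toy_translate_bit(prot, PROT_WRITE, TOY_VM_WRITE) |
           toy_translate_bit(prot, PROT_EXEC,  TOY_VM_EXEC);
}

/* Translate a subset of the MAP_ flags into toy VM_ bits. */
unsigned long toy_calc_vm_flag_bits(unsigned long flags)
{
    return toy_translate_bit(flags, MAP_GROWSDOWN, TOY_VM_GROWSDOWN) |
           toy_translate_bit(flags, MAP_LOCKED,    TOY_VM_LOCKED);
}

/* Combine everything the way the excerpt from mm/mmap.c does. */
unsigned long toy_vm_flags(unsigned long prot, unsigned long flags,
                           unsigned long def_flags)
{
    return toy_calc_vm_prot_bits(prot) | toy_calc_vm_flag_bits(flags) |
           def_flags | TOY_VM_MAYREAD | TOY_VM_MAYWRITE | TOY_VM_MAYEXEC;
}
```

Calling toy_vm_flags(PROT_READ | PROT_WRITE, MAP_PRIVATE, 0) yields the read and write bits plus the three
"may" bits; passing TOY_VM_LOCKED as def_flags adds the locking bit even though the caller did not ask for it.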

What is most interesting is that the kernel includes the value of def_flags in the flag set after reading it
from the mm_struct instance of the currently running process. def_flags has the value 0 or VM_LOCKED. The
former brings about no change to the resulting flag set, whereas VM_LOCKED means that pages subsequently
mapped in cannot be swapped out (the implementation of swapping is discussed in Chapter 18). To set
the value of def_flags, the process must issue the mlockall system call with the MCL_FUTURE flag, which uses
the mechanism described above to prevent all future mappings from being swapped out, even if this was not
requested explicitly by means of the MAP_LOCKED flag at creation time.
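
From user space, this behavior can be tried out with a few lines of C. The sketch below is only one possible
way to exercise it; the function name map_with_future_locking is made up for this example. It requests locking
of all future mappings with mlockall(MCL_FUTURE), creates an anonymous mapping without any locking flag, and
then undoes both steps. Whether the calls succeed depends on the privileges and the RLIMIT_MEMLOCK limit of the
calling process.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Lock all future mappings, create one anonymous page without any
 * per-mapping locking flag, and clean up again.
 * Returns 0 on success or a negative errno value on failure.
 */
int map_with_future_locking(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -EINVAL;

    /* From now on, new mappings of this process are locked in memory. */
    if (mlockall(MCL_FUTURE) != 0)
        return -errno;          /* e.g. EPERM or ENOMEM under tight limits */

    int rc = 0;
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        rc = -errno;
    else
        munmap(p, (size_t)page);

    /* Drop the "lock future mappings" state so later code is unaffected. */
    munlockall();
    return rc;
}
```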

After the arguments have been checked and all required flags have been set up, the remaining work is
delegated to mmap_region. The find_vma_prepare function with which we are familiar from Section 4.5.3
is invoked to find the vm_area_struct instances of the predecessor and successor areas and the data for
the entry in the red-black tree. If a mapping already exists at the specified mapping point, it is removed
by means of do_munmap (as described in the section below).
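
The removal of an existing mapping can be observed from user space when a fixed address is requested. The
following sketch (the function name is hypothetical) maps one anonymous page, writes a marker byte into it, and
then maps a fresh anonymous page at the same address with MAP_FIXED. If the old mapping was discarded, the
marker is gone and the page reads as zero again.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if a MAP_FIXED mapping replaced the earlier mapping at the
 * same address, 0 if the old contents were still visible, and -1 if one
 * of the system calls failed.
 */
int fixed_mapping_replaces_old(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* First mapping: the kernel chooses the address. */
    unsigned char *old = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (old == MAP_FAILED)
        return -1;
    old[0] = 0xAB;              /* marker that should disappear */

    /* Second mapping: same address, forced with MAP_FIXED. */
    unsigned char *fresh = mmap(old, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                                -1, 0);
    if (fresh == MAP_FAILED) {
        munmap(old, len);
        return -1;
    }

    /* A new anonymous page is zero-filled, so the marker must be gone. */
    int replaced = (fresh == old) && (fresh[0] == 0);
    munmap(fresh, len);
    return replaced;
}
```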

vm_enough_memory is invoked^9 if either the MAP_NORESERVE flag is not set or the value of the kernel
parameter sysctl_overcommit_memory^10 is set to OVERCOMMIT_NEVER, that is, when overcommitting is
not allowed. The function decides whether the memory needed for the operation may be allocated. If it
decides against this, the system call terminates with -ENOMEM.
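
The condition just described can be written down as a small predicate. This is a simplified sketch of the
decision as stated in the text, not the kernel code itself: the toy_ names are invented for the example, and
the enough_memory parameter stands in for the result of the real accounting check.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes MAP_NORESERVE on glibc */
#endif
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Overcommit modes, numbered as in /proc/sys/vm/overcommit_memory. */
enum toy_overcommit {
    TOY_OVERCOMMIT_GUESS  = 0,  /* heuristic overcommitting */
    TOY_OVERCOMMIT_ALWAYS = 1,  /* always allow */
    TOY_OVERCOMMIT_NEVER  = 2   /* strict overcommitting */
};

/* True if the memory check must run for this mapping request. */
bool toy_must_check_memory(unsigned long map_flags, enum toy_overcommit mode)
{
    return !(map_flags & MAP_NORESERVE) || mode == TOY_OVERCOMMIT_NEVER;
}

/*
 * Outcome of the accounting step: 0 to continue with the mapping,
 * -ENOMEM if the check ran and came out negative.
 */
int toy_account_mapping(unsigned long map_flags, enum toy_overcommit mode,
                        bool enough_memory)
{
    if (toy_must_check_memory(map_flags, mode) && !enough_memory)
        return -ENOMEM;
    return 0;
}
```

Note that MAP_NORESERVE only skips the check in the two permissive modes; in strict mode the flag has no
effect on whether the check runs.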

(^9) Using security_vm_enough_memory, which calls __vm_enough_memory over varying paths depending on the security
framework in use.
(^10) sysctl_overcommit_memory can be set via /proc/sys/vm/overcommit_memory. Currently there are
three overcommit options. 1 allows an application to allocate as much memory as it wants, even more than is permitted by the
address space of the system. 0 means that heuristic overcommitting is applied, with the result that the number of usable pages is
determined by adding together the pages in the page cache, the pages in the swap area, and the unused page frames; requests for
allocation of a smaller number of pages are permitted. 2 stands for the strictest mode, known as strict overcommitting, in which the
permitted number of pages that can be allocated is calculated as follows:
allowed = (totalram_pages - hugetlb) * sysctl_overcommit_ratio / 100;
allowed += total_swap_pages;
Here sysctl_overcommit_ratio is a configurable kernel parameter that is usually set to 50. If the total number of pages used
exceeds this value, the kernel refuses to perform further allocations.
Why does it make sense to allow an application to allocate more pages than can ever be handled in principle? This is sometimes
required for scientific applications. Some tend to allocate huge amounts of memory without actually requiring it, but in the opinion
of the application authors it seems good to have it just in case. If the memory will, indeed, never be used, no physical page frames
will ever be allocated, and no problem arises.
Such a programming style is clearly bad practice, but unfortunately this is often no criterion for the value of software. Writing clean
code is usually not rewarding in the scientific community outside computer science. There is only immediate interest that a program
works for a given configuration, while efforts to make programs future-proof or portable do not seem to provide immediate benefits
and are therefore often not valued at all.
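
To make the strict overcommit limit from footnote 10 more concrete, the formula can be evaluated in a small
stand-alone function. The function and parameter names below are chosen for this example; the comparison in
strict_allows is a simplified reading of "the total number of pages used exceeds this value" and is an
assumption about how the limit is applied.

```c
#include <stdbool.h>

/*
 * Number of pages that may be committed in strict overcommit mode.
 * All quantities are page counts except ratio_percent, which
 * corresponds to sysctl_overcommit_ratio.
 */
unsigned long strict_overcommit_limit(unsigned long totalram_pages,
                                      unsigned long hugetlb_pages,
                                      unsigned long total_swap_pages,
                                      unsigned long ratio_percent)
{
    unsigned long allowed;

    allowed = (totalram_pages - hugetlb_pages) * ratio_percent / 100;
    allowed += total_swap_pages;
    return allowed;
}

/* True if a request still fits below the limit (simplified check). */
bool strict_allows(unsigned long committed_pages,
                   unsigned long request_pages,
                   unsigned long allowed)
{
    return committed_pages + request_pages <= allowed;
}
```

For a machine with 4 GiB of RAM and 2 GiB of swap, 4 KiB pages, no huge pages, and the usual ratio of 50, the
limit is 1,048,576 * 50 / 100 + 524,288 = 1,048,576 pages, that is, 4 GiB of committable memory.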
