The Process Address Space
Address Spaces
The process address space consists of the virtual memory addressable by a user-space process and the specific addresses within that virtual memory the process is permitted to use.
- Memory Layout: Each process is given a flat 32- or 64-bit address space, meaning addresses exist in a single continuous linear range.
- Process Isolation: The flat address space is unique to each process. A specific memory address in one process’s address space is completely unrelated to that same address in another process, unless processes intentionally share their address space as threads.
- Memory Areas: The address space is divided into intervals of legal addresses known as memory areas. Valid addresses exist in exactly one area; memory areas do not overlap. A process can dynamically add and remove these areas through the kernel.
- Permissions: Each memory area possesses associated permissions (readable, writable, executable) that the process must respect. Accessing invalid areas or violating permissions triggers a Segmentation Fault.
- Memory Area Contents: Memory areas contain mapped data, including:
- Text section: Executable file’s code.
- Data section: Initialized global variables.
- Bss section: Uninitialized global variables, mapped over the zero page.
- User-space stack: The process’s user-space stack.
- Shared libraries: Text, data, and bss sections for loaded libraries like the C library and dynamic linker.
- Other mappings: Memory mapped files, shared memory segments, and anonymous memory mappings.
To manage these distinct, non-overlapping memory areas for a process, the kernel employs a specific data structure known as the memory descriptor.
The Memory Descriptor (mm_struct)
The kernel represents a process’s address space with the memory descriptor, defined as struct mm_struct in <linux/mm_types.h>.
- Usage and Reference Counters:
  - mm_users: Tracks the number of processes currently using this address space. If two threads share the address space, this value is two.
  - mm_count: The primary reference count for the mm_struct. All mm_users collectively equate to one increment of mm_count. The descriptor is only freed when mm_count reaches zero, which occurs after mm_users reaches zero.
- Data Structures for Memory Areas:
  - mmap: Points to a singly linked list of all memory areas, allowing efficient traversal.
  - mm_rb: Points to a red-black tree of all memory areas, allowing efficient search.
  - Threaded Tree: Overlaying a linked list onto a tree to access the same underlying data is known as a threaded tree.
- Global List: All mm_struct structures are linked in a doubly linked list via the mmlist field, starting with init_mm (the init process) and protected by mmlist_lock.
Allocating and Destroying the Memory Descriptor
- Allocation: The descriptor is stored in the mm field of the process descriptor (task_struct). During fork(), copy_mm() copies the parent’s memory descriptor to the child. The structure is allocated from the mm_cachep slab cache via allocate_mm().
- Shared Address Spaces (Threads): If CLONE_VM is specified during cloning, allocate_mm() is skipped, the mm_users count is incremented, and the new process’s mm field points directly to the parent’s memory descriptor.
- Destruction: When a process exits, exit_mm() is invoked. This function calls mmput() to decrement mm_users. If mm_users reaches zero, mmdrop() is called to decrement mm_count. If mm_count reaches zero, free_mm() returns the structure to the mm_cachep slab cache.
Kernel Threads and the Memory Descriptor
Kernel threads lack a user context and do not access user-space memory; therefore, their mm field is NULL.
- To access necessary kernel memory (like page tables) without wasting cycles switching address spaces or maintaining dedicated memory descriptors, kernel threads use the memory descriptor of the previously scheduled process.
- When a kernel thread is scheduled, the kernel sets the active_mm field of its process descriptor to point to the previous process’s memory descriptor.
The memory descriptor tracks all valid address intervals within the address space through discrete structures representing individual virtual memory areas.
Virtual Memory Areas (VMAs)
The memory area structure, vm_area_struct (defined in <linux/mm_types.h>), describes a single memory area over a contiguous interval in a given address space. The kernel treats each VMA as a unique memory object with specific permissions and operations.
- Address Interval:
  - vm_start: Initial (lowest) address in the interval, inclusive.
  - vm_end: First byte after the final (highest) address in the interval, exclusive.
- Association: The vm_mm field points back to the associated mm_struct. VMAs are unique to their mm_struct, meaning multiple processes mapping the same file will each have a unique vm_area_struct.
- VMA Flags (vm_flags): Specify behavior for the memory area as a whole, managed by the kernel rather than hardware.
  - VM_READ, VM_WRITE, VM_EXEC: Standard read, write, and execute permissions.
  - VM_SHARED: Identifies a shared mapping visible to multiple processes. If unset, the mapping is private.
  - VM_IO: Specifies a mapping of a device’s I/O space.
  - VM_SEQ_READ / VM_RAND_READ: Hints for read-ahead behavior, set via the madvise() system call.
- VMA Operations (vm_ops): Points to a vm_operations_struct containing methods to manipulate the VMA.
  - open: Invoked when the area is added to an address space.
  - close: Invoked when the area is removed from an address space.
  - fault: Invoked by the page fault handler when accessing a page not present in physical memory.
  - page_mkwrite: Invoked when a read-only page is made writable.
- Memory Area Representation in Real Life: VMAs map to the output seen in /proc/&lt;pid&gt;/maps or pmap(1). This output shows exact memory ranges, permissions, offsets, and backing files. Shared libraries are loaded into physical memory only once and mapped into multiple processes, resulting in substantial space savings. Bss sections and other uninitialized data map to the zero page, ensuring regions initialized to all zeroes.
Because a process relies heavily on its allocated virtual memory areas, the kernel provides specific helper functions to locate and manipulate these structures efficiently.
Manipulating Memory Areas
The kernel frequently performs operations on memory areas, necessitating helper functions declared in <linux/mm.h> to search and evaluate VMAs.
- find_vma(): Searches the given address space for the first VMA where vm_end > addr.
  - The result is cached in the mmap_cache field of the memory descriptor to optimize consecutive operations.
  - If the cache misses, the function traverses the red-black tree (mm_rb).
  - It returns the matching vm_area_struct or NULL if no such area exists.
- find_vma_prev(): Functions identically to find_vma(), but additionally returns a pointer to the preceding VMA via a double pointer argument.
- find_vma_intersection(): Returns the first VMA that overlaps a given address interval by wrapping find_vma() and ensuring the returned VMA does not start after the specified interval’s end address.
Beyond querying existing intervals, the kernel requires mechanisms to dynamically allocate new intervals or deallocate existing ones.
Creating and Removing Address Intervals
The kernel expands or reduces a process’s address space by explicitly creating or removing linear address intervals.
- Creating Intervals (do_mmap()):
  - do_mmap() creates a new linear address interval. If the new interval is adjacent to an existing VMA and shares the same permissions, the two are merged. Otherwise, a new vm_area_struct is allocated from the vm_area_cachep slab cache.
  - The new memory area is linked into the address space’s linked list and red-black tree via vma_link(), and the total_vm field is updated.
  - Parameters include addr (initial address), len (length), prot (page protection flags like PROT_READ), and flags (map type flags like MAP_SHARED or MAP_ANONYMOUS).
  - This functionality is exported to user-space via the mmap2() system call (which receives the offset in pages to handle larger files).
- Removing Intervals (do_munmap()):
  - do_munmap() removes a specified address interval starting at start of length len from a process address space.
  - This is exported to user-space via the munmap() system call, which acts as a wrapper that grabs the mmap_sem lock before executing do_munmap().
To utilize these dynamically created virtual memory areas, the kernel must map the virtual addresses to physical memory using a hierarchical indexing system.
Page Tables
While applications operate on virtual memory, processors require physical addresses. Page tables convert virtual memory addresses to physical addresses by splitting the virtual address into chunks used as indices into hierarchical tables.
- Three-Level Architecture: Linux utilizes three levels of page tables to support a sparsely populated address space.
  - Page Global Directory (PGD): The top-level table, consisting of an array of pgd_t types. Entries point to the PMD.
  - Page Middle Directory (PMD): The second-level table, consisting of an array of pmd_t types. Entries point to the PTE.
  - Page Table Entries (PTE): The final level, consisting of pte_t types. Entries point directly to physical pages.
- Management: Each process has distinct page tables. The pgd field of the memory descriptor points to the process’s PGD. Traversing and manipulating the page tables requires acquiring the page_table_lock.
- Translation Lookaside Buffer (TLB): To mitigate the performance overhead of resolving virtual-to-physical mappings, processors implement a hardware cache called the TLB. The processor first checks the TLB for the mapping; on a miss, it walks the page tables to retrieve the physical address.