Memory Management
The kernel virtualizes physical memory, providing each process with a unique, linear, and flat virtual address space. This virtualization isolates processes from hardware specifics and provides the foundation for advanced memory tracking, allocation, and lifecycle management.
The Process Address Space
Virtual memory is divided into pages, the smallest addressable unit managed by the Memory Management Unit (MMU). The page size is determined by the machine architecture: typically 4 KB on 32-bit systems and 4 KB or 8 KB on 64-bit systems.
- Validity and Paging: Pages are either valid (associated with physical RAM or secondary storage) or invalid (unallocated). Accessing invalid pages generates a segmentation violation. Accessing valid pages stored on secondary storage triggers a page fault, prompting the kernel to page the data into physical RAM, potentially paging out older data to swap.
- Copy-on-Write (COW): Multiple virtual pages can map to a single physical page to share data. If a process attempts to write to a shared read-only page, the MMU intercepts the operation and the kernel transparently duplicates the page for the writing process, preserving the original for others.
- Memory Regions: The virtual address space is partitioned into specific regions (segments), including:
  - Text Segment: Read-only program code, string literals, and constants.
  - Stack: Dynamically sized execution stack for local variables and function return data.
  - Data Segment (Heap): Writable, dynamically sized memory for runtime allocations.
  - Bss Segment: Uninitialized global variables. Because these variables start out as zero, their contents need not be stored in the object file; the kernel maps the segment to a zero-filled page using copy-on-write.
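Since every mapping is made of whole pages, it can help to see how a program discovers the page size and relates an address or a length to page boundaries. The following is a minimal sketch, assuming a POSIX system; the helper names `page_size`, `page_base`, and `round_up_to_pages` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Returns the system page size in bytes, as reported by sysconf(). */
size_t page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0);
    return (size_t)sz;
}

/* Rounds an address down to the start of the page that contains it.
 * Relies on the page size being a power of two. */
uintptr_t page_base(const void *addr)
{
    uintptr_t mask = ~((uintptr_t)page_size() - 1);
    return (uintptr_t)addr & mask;
}

/* Rounds a byte count up to a whole number of pages. */
size_t round_up_to_pages(size_t len)
{
    size_t ps = page_size();
    return (len + ps - 1) / ps * ps;
}
```

Querying the size at runtime, rather than hard-coding 4096, keeps the code correct on architectures that use larger pages.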
Virtual memory segments organize process data, but programs require explicit interfaces to actively manage the dynamic Data segment during execution.
Dynamic Memory Allocation
Applications allocate memory at runtime when requirements are unpredictable prior to execution.
- `malloc(size)`: Allocates a specified number of bytes. The returned memory is uninitialized and its contents are undefined.
- `calloc(nr, size)`: Allocates memory for an array of `nr` elements, each `size` bytes long. Unlike `malloc()`, it guarantees the memory is zero-filled.
- `realloc(ptr, size)`: Resizes an existing memory block. It preserves existing data but may allocate a new region, copy the old data, and free the original block if the block cannot be expanded in place.
- `free(ptr)`: Returns dynamically allocated memory to the allocator for reuse.
- Allocation Hazards: Failing to call `free()` causes a memory leak, tying up the memory for the rest of the process's lifetime. Accessing memory after it has been freed is a use-after-free error through a dangling pointer.
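The interplay of these calls can be sketched with a small growable array. This is an illustrative example, not a reference implementation: `struct int_vec`, `int_vec_push`, `int_vec_free`, and `make_zeroed` are hypothetical names. It shows the usual way to call `realloc()` through a temporary pointer, so the original block is not leaked when resizing fails, and it clears the pointer after `free()` to avoid a dangling reference.

```c
#include <assert.h>
#include <stdlib.h>

/* A growable array of ints (hypothetical type, for illustration only). */
struct int_vec {
    int *data;
    size_t len;
    size_t cap;
};

/* Appends value, doubling the buffer with realloc() when it is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int int_vec_push(struct int_vec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        /* Use a temporary: on failure realloc() returns NULL and the
         * old block stays valid, so v->data must not be overwritten. */
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        v->data = tmp;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Releases the buffer and resets the struct so no dangling pointer remains. */
void int_vec_free(struct int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}

/* Returns an array of n zero-initialized ints, or NULL on failure. */
int *make_zeroed(size_t n)
{
    return calloc(n, sizeof(int));
}
```

A caller would start from `struct int_vec v = {0};`, push values, and finish with `int_vec_free(&v)`.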
Proper memory reclamation prevents exhaustion, but the returned pointers must also conform to strict hardware constraints known as alignment to guarantee safe operations.
Data Alignment
Processors and memory subsystems require data to be stored at memory addresses that correspond to multiples of the data’s size.
- Natural Alignment: A type of size 2^n bytes must reside at an address whose n least-significant bits are zero. Standard allocators automatically return addresses suitably aligned for any standard type.
- Structure Padding: Compilers insert invisible padding bytes between structure members so that each member meets its natural alignment requirement.
- Strict Aliasing: An object may only be accessed through a pointer of its actual type or explicitly permitted variants (like `char *`). Dereferencing a cast pointer of an incompatible type violates strict aliasing and can lead to data corruption or, when the access is misaligned, processor exceptions.
- Custom Alignment: `posix_memalign()` allocates memory aligned to a specified boundary, which must be a power of two and a multiple of `sizeof(void *)`. Such alignment is necessary for operations like direct block I/O.
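The alignment rules above can be checked directly in code. The sketch below assumes a POSIX system; `struct padded`, `is_aligned`, and `alloc_aligned` are hypothetical names chosen for illustration. It tests an address against a power-of-two boundary and wraps `posix_memalign()`, which reports failure through its return value rather than through `errno`.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The compiler places padding after tag so that value is aligned for int. */
struct padded {
    char tag;
    int value;
};

/* Returns nonzero if p is a multiple of alignment (a power of two). */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocates size bytes aligned to alignment, or returns NULL on failure.
 * alignment must be a power of two and a multiple of sizeof(void *).
 * The result is released with free(). */
void *alloc_aligned(size_t alignment, size_t size)
{
    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;
}
```

On a given platform, `offsetof(struct padded, value)` shows where the padding ends, and `sizeof(struct padded)` is usually larger than the sum of the member sizes.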
While standard and aligned allocators orchestrate block distribution, they rely on low-level kernel interfaces to continuously expand or contract the boundaries of the underlying data segment.
Managing the Data Segment
The demarcation line dividing the end of the data segment from the unmapped address space is called the break point.
- `brk(end)`: Sets the break point to the absolute address `end`.
- `sbrk(increment)`: Shifts the break point by a relative delta, expanding or shrinking the heap.
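As a rough illustration of how the break point moves, the sketch below queries and adjusts it with `sbrk()`. It assumes Linux with glibc; `current_break` and `grow_heap` are hypothetical helper names. Real programs should leave the break to the allocator, because moving it by hand while `malloc()` is in use can corrupt the heap.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Returns the current break point, or NULL if it cannot be read. */
void *current_break(void)
{
    void *now = sbrk(0);          /* an increment of 0 only queries */
    return now == (void *)-1 ? NULL : now;
}

/* Grows the data segment by bytes and returns the start of the new
 * region (the previous break point), or NULL on failure. */
void *grow_heap(size_t bytes)
{
    assert(bytes <= INTPTR_MAX);
    void *old = sbrk((intptr_t)bytes);
    return old == (void *)-1 ? NULL : old;
}
```

Passing a negative increment to `sbrk()` moves the break back down and returns the memory to the kernel.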
Expanding the data segment directly leads to fragmentation, pushing modern allocators to bypass the heap entirely for large requests.
Anonymous Memory Mappings
Standard heap management relies on a buddy memory allocation scheme, which introduces internal fragmentation (waste from overallocating size to meet power-of-two blocks) and external fragmentation (free memory split into nonadjacent chunks). Furthermore, an active allocation sitting at the break point will pin all adjacent freed memory below it, preventing its return to the kernel.
- Bypassing the Heap: For allocations exceeding a specific threshold (typically 128 KB), `glibc` completely bypasses the data segment.
- `mmap()`: With the `MAP_ANONYMOUS | MAP_PRIVATE` flags, creates a standalone, zero-filled block of memory that is not backed by any file.
- Trade-offs: Anonymous mappings avoid heap fragmentation, and their memory returns to the kernel immediately upon `munmap()`. However, because their size is always a multiple of the system page size, using them for small allocations wastes most of each page as slack space.
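A direct anonymous mapping can be sketched as follows, assuming Linux. The wrappers `anon_alloc` and `anon_free` are hypothetical names for illustration. Note that `mmap()` signals failure with `MAP_FAILED`, not `NULL`, and that the caller must remember the length in order to unmap.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Maps len bytes of private, zero-filled anonymous memory.
 * Returns NULL on failure. The kernel rounds len up to whole pages. */
void *anon_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Unmaps a region obtained from anon_alloc(); the memory goes straight
 * back to the kernel. Returns 0 on success, -1 on failure. */
int anon_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

A request of a single byte still consumes a full page, which is the slack-space cost described above.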
To balance the speed of the heap against the cleanliness of anonymous mappings, the allocator provides tuning APIs to adjust thresholds at runtime.
Advanced Allocation Tuning and Debugging
Linux exposes internals of the allocation subsystem to allow applications to heavily optimize or debug memory operations.
- `mallopt(param, value)`: Modifies core memory management parameters. `M_MMAP_THRESHOLD` alters the byte limit where allocations switch from the heap to anonymous mappings. `M_PERTURB` enables memory poisoning, filling allocated and freed memory with specific bytes so that programs with use-before-initialization or use-after-free errors fail quickly.
- `malloc_usable_size()`: Returns the true allocation size of a block, revealing any extra space introduced by alignment rounding.
- `MALLOC_CHECK_`: An environment variable that enables extra heap-consistency checks and error handling in `glibc` without recompilation, useful during debugging despite its performance overhead.
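The tuning calls can be exercised as in the sketch below. It assumes glibc, since `mallopt()`, `M_MMAP_THRESHOLD`, `M_PERTURB`, and `malloc_usable_size()` are nonstandard extensions declared in `<malloc.h>`. The wrapper names `set_mmap_threshold`, `enable_poisoning`, and `slack_bytes` are hypothetical.

```c
#include <assert.h>
#include <malloc.h>
#include <stdlib.h>

/* Sets the size at which malloc() switches to anonymous mappings.
 * Returns nonzero on success, as mallopt() does. */
int set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}

/* Turns on memory poisoning using the given byte value.
 * Returns nonzero on success. */
int enable_poisoning(unsigned char byte)
{
    return mallopt(M_PERTURB, byte);
}

/* Returns how many bytes beyond the requested size the allocator
 * actually reserved for block p. */
size_t slack_bytes(void *p, size_t requested)
{
    size_t usable = malloc_usable_size(p);
    assert(usable >= requested);
    return usable - requested;
}
```

These settings affect the whole process, so they are normally applied once at startup, before the first large allocation.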
Even optimally tuned heap allocations incur tracking overhead, making extremely fast, untracked local allocations desirable for temporary data.
Stack-Based Allocations
Dynamic memory can be allocated directly on the execution stack instead of the heap or an anonymous mapping.
- `alloca(size)`: Dynamically reserves stack space by adjusting the stack pointer. It executes significantly faster than `malloc()` and automatically reclaims the memory when the invoking function returns.
- `strdupa()` / `strndupa()`: Specialized stack-based string duplicators.
- Variable-Length Arrays (VLAs): A C99 feature that allocates dynamically sized arrays on the stack. Unlike memory from `alloca()`, which persists until the function returns, a VLA is released the moment it falls out of scope. This makes VLAs the better choice inside loops, where repeated `alloca()` calls could exhaust the stack.
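The difference in lifetime between the two mechanisms can be sketched as follows, assuming a compiler that supports both `alloca()` and VLAs (for example GCC or Clang on Linux). The functions `joined_equals` and `count_palindromes` are hypothetical examples. Neither kind of buffer may be returned to the caller, and neither reports failure if the stack overflows.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Builds "dir/file" in a stack buffer and compares it with expected.
 * The alloca() memory lives until this function returns. */
int joined_equals(const char *dir, const char *file, const char *expected)
{
    assert(dir != NULL && file != NULL && expected != NULL);
    size_t len = strlen(dir) + 1 + strlen(file) + 1;
    char *path = alloca(len);
    strcpy(path, dir);
    strcat(path, "/");
    strcat(path, file);
    return strcmp(path, expected) == 0;
}

/* Counts the words that read the same in both directions. The VLA is
 * released at the end of every iteration, so stack use stays bounded
 * by the longest word rather than growing with n. */
size_t count_palindromes(const char *const words[], size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(words[i]);
        char reversed[len + 1];
        for (size_t j = 0; j < len; j++)
            reversed[j] = words[i][len - 1 - j];
        reversed[len] = '\0';
        if (strcmp(reversed, words[i]) == 0)
            count++;
    }
    return count;
}
```

Had the loop called `alloca()` instead, every iteration's buffer would stay on the stack until `count_palindromes()` returned.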
Whether allocated on the heap, via anonymous mappings, or on the stack, applications require low-level functions to inspect and modify the resulting blocks of bytes.
Manipulating Memory
The standard library provides optimized interfaces for operating on raw bytes irrespective of null-termination.
- `memset(s, c, n)`: Sets the `n` contiguous bytes starting at `s` to the value `c`.
- `memcmp(s1, s2, n)`: Compares two memory blocks byte by byte. Warning: `memcmp()` should not be used to compare structures, because uninitialized padding bytes between members (introduced by alignment rules) can make otherwise identical structures compare as different.
- `memmove()` vs. `memcpy()`: Both copy bytes from source to destination. `memmove()` safely handles overlapping memory regions, whereas `memcpy()` has undefined behavior when the regions overlap but can be faster.
- `memchr()` / `memmem()`: Scan memory blocks for occurrences of a specific byte or subblock, respectively.
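Two of the points above lend themselves to a short sketch: comparing structures member by member instead of with `memcmp()`, and using `memmove()` when source and destination overlap. The names `struct record`, `record_equal`, and `remove_at` are hypothetical and serve only as illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Padding typically sits between tag and id. */
struct record {
    char tag;
    int id;
};

/* Compares two records member by member, ignoring padding bytes. */
int record_equal(const struct record *a, const struct record *b)
{
    return a->tag == b->tag && a->id == b->id;
}

/* Removes arr[index] by shifting the tail one slot to the left.
 * Source and destination overlap, so memmove() is required here.
 * Returns the new element count. */
size_t remove_at(int *arr, size_t n, size_t index)
{
    assert(index < n);
    memmove(&arr[index], &arr[index + 1],
            (n - index - 1) * sizeof arr[0]);
    return n - 1;
}
```

Comparing plain arrays of scalars with `memcmp()` remains safe, because such arrays contain no padding.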
After manipulating highly sensitive or time-critical memory blocks, applications must ensure the kernel’s paging mechanisms do not unexpectedly swap this data to disk.
Locking Memory
Because the kernel uses demand paging, pages that are not in active use may be paged out to disk. This introduces non-deterministic execution delays (page fault latency) and serious security risks (for example, cryptographic keys written unencrypted to physical storage).
- `mlock(addr, len)`: Locks a targeted memory range into physical RAM, ensuring it is never paged out.
- `mlockall(flags)`: Locks the entire process address space, with flags governing whether to lock all current pages, all future pages, or both.
- Constraints: To prevent malicious exhaustion of physical RAM, processes are constrained by the `RLIMIT_MEMLOCK` resource limit unless they possess the `CAP_IPC_LOCK` capability.
- `mincore()`: Fills a vector indicating exactly which pages in a given address range currently reside in physical memory.
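The locking interfaces can be combined as in the sketch below, which assumes Linux. The helpers `alloc_locked`, `free_locked`, and `resident_pages` are hypothetical names. The buffer comes from an anonymous mapping so that it is page-aligned, which `mincore()` requires, and locking can fail with `ENOMEM` or `EPERM` when `RLIMIT_MEMLOCK` is too low.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps len bytes and locks them into RAM. Returns NULL if either the
 * mapping or the lock fails (for example, RLIMIT_MEMLOCK exceeded). */
void *alloc_locked(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (mlock(p, len) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Wipes the buffer while it is still locked, then unlocks and unmaps it. */
void free_locked(void *p, size_t len)
{
    memset(p, 0, len);
    munlock(p, len);
    munmap(p, len);
}

/* Counts the resident pages in the page-aligned range [addr, addr + len).
 * Returns -1 if mincore() fails or memory for the vector is unavailable. */
long resident_pages(void *addr, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0 && len > 0);
    size_t npages = (len + (size_t)ps - 1) / (size_t)ps;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;
    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* low bit set means resident */
    }
    free(vec);
    return count;
}
```

Because `mlock()` faults the pages in before returning, a successfully locked buffer should report every page as resident.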
The ability to lock memory physically highlights the fundamental tension of virtual memory: the kernel routinely promises processes more memory than physical hardware can fulfill.
Opportunistic Allocation and Overcommitment
When processes request memory, the kernel logs the commitment but defers assigning physical storage until the precise moment the process writes to the page.
- Overcommitment: This lazy allocation model allows the kernel to overcommit, granting programs more total virtual memory than the machine has physical RAM and swap combined.
- Out of Memory (OOM): If processes actually utilize their granted commitments simultaneously and exhaust physical hardware, the kernel triggers an Out of Memory condition. The OOM killer is dispatched to heuristically terminate processes to free physical space.
- Strict Accounting: Administrators can disable overcommitment by modifying `/proc/sys/vm/overcommit_memory`. Setting this to `2` enforces strict accounting: the kernel refuses to commit more than a fixed limit (swap space plus a configurable percentage of physical RAM), so allocation requests such as `malloc()` fail immediately once that limit is exceeded instead of succeeding and later triggering the OOM killer.
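A program can check which policy is in effect by reading that file. The sketch below assumes Linux with `/proc` mounted; `read_overcommit_mode` and `overcommit_mode_name` are hypothetical helper names. The values 0, 1, and 2 correspond to heuristic overcommit, always overcommit, and strict accounting.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the overcommit policy from the given file, normally
 * /proc/sys/vm/overcommit_memory. Returns -1 if the file cannot be
 * opened or does not contain an integer. */
int read_overcommit_mode(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int mode = -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Maps a policy value to a short description. */
const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit";
    case 1:  return "always overcommit";
    case 2:  return "strict accounting";
    default: return "unknown";
    }
}
```

Under strict accounting, a program should expect `malloc()` to return `NULL` for large requests and handle that case explicitly.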