Page Tables and Virtual Memory
Paging Hardware and Address Translation
- RISC-V instructions manipulate virtual addresses, while the machine’s RAM uses physical addresses.
- The Sv39 RISC-V architecture utilizes only the bottom 39 bits of a 64-bit virtual address, ignoring the top 25 bits.
- The page table structure physically maps these addresses:
- Logically acts as an array of Page Table Entries (PTEs).
- Each PTE translates a virtual address to a physical address at the granularity of a 4096-byte ( bytes) page.
- A PTE contains a 44-bit Physical Page Number (PPN) and hardware control flags.
- The CPU constructs a 56-bit physical address by combining the 44-bit PPN from the PTE with the bottom 12 bits of the original virtual address.
- Three-level tree implementation:
- A page table is stored in physical memory as a three-level tree of 4096-byte pages.
- The root page contains 512 PTEs pointing to intermediate pages, which point to bottom-level pages containing the final physical mappings.
- The 27-bit virtual page number is split into three 9-bit sections to index into each of the three levels.
- This tree structure saves physical memory by omitting entirely unmapped intermediate and bottom-level page directories.
- Hardware integration:
- The Translation Look-aside Buffer (TLB) caches PTEs inside the CPU to eliminate the performance cost of loading PTEs from memory during every address translation.
- The
satpregister holds the physical address of the root page-table page, telling the CPU which page table tree to use for the currently executing thread.
- PTE flags control access permissions:
PTE_V: Indicates the PTE is present and valid.PTE_R,PTE_W,PTE_X: Control read, write, and execute permissions, respectively.PTE_U: Allows access by instructions executing in user mode.
To manage this hardware translation, the operating system constructs dedicated page tables to map distinct regions of memory, starting with its own address space.
Kernel Address Space
- The kernel maintains a single, globally shared page table to access physical memory and memory-mapped hardware registers.
- Direct mapping architecture:
- Most physical memory and device registers are mapped at virtual addresses exactly equal to their physical addresses.
- The kernel binary is located at
KERNBASE(0x80000000) in both virtual and physical memory spaces. - RAM extends linearly from
KERNBASEup to a defined limit calledPHYSTOP.
- Exceptions to direct mapping:
- Trampoline page: Mapped at the very top of the virtual address space, containing code to transition the CPU in and out of the kernel. This physical page is mapped twice: once direct-mapped in the kernel and once at the highest virtual address.
- Kernel stacks: Each process has a private kernel stack mapped high in memory, immediately above an unmapped guard page.
- The guard page has an invalid PTE (
PTE_Vis cleared), triggering a hardware exception if a stack overflow occurs, which prevents the corruption of adjacent kernel memory.
Building these kernel and user mappings requires specific software routines to walk the hardware-defined page table tree and update the entries.
Page Table Management Code
- The central data structure for software page table manipulation is
pagetable_t, a C pointer to a RISC-V root page-table page. - Core virtual memory lookup functions:
walk: Mimics the hardware’s 3-level traversal, using 9 bits at a time to descend the tree and return the address of the lowest-level PTE. It can dynamically allocate intermediate pages if requested during the traversal.mappages: Installs PTEs for a virtual-to-physical address range by callingwalkfor each page interval and configuring the PPN and permission flags.
- Kernel initialization routines:
kvminitcreates the kernel page table during early boot, mapping the kernel instructions, data, physical memory up toPHYSTOP, and device memory.kvminithartwrites the root page table physical address into the CPU’ssatpregister to enable hardware address translation.- The
sfence.vmainstruction is executed immediately aftersatpis modified to flush the CPU’s TLB, preventing stale cached mappings from causing invalid memory accesses.
Backing these virtual mappings requires an allocator to supply and reclaim the actual physical memory pages.
Physical Memory Allocation
- The kernel manages physical memory between the end of the kernel binary and
PHYSTOPas a global pool for run-time allocation. - Memory is allocated and freed strictly in 4096-byte page increments.
- Free pages are tracked using a linked list threaded directly through the available memory pages themselves.
- Allocator implementation:
- Each free page stores a
struct runstructure containing a pointer to the next free page. - The
kfreefunction fills freed memory with the garbage value1to expose dangling references quickly, then prepends the page to the free list. - The
kallocfunction removes and returns the first element from the free list when memory is requested. - The free list structure is protected by a spin lock to handle concurrent allocation requests across multiple CPUs.
- Each free page stores a
By combining physical memory allocation with page table construction, the kernel can construct isolated, dynamic memory environments for each executing program.
Process Address Space and Memory Growth
- Each process possesses an independent page table, dictating a private address space that maps contiguous virtual addresses starting at zero to potentially non-contiguous physical pages.
- Address space layout:
- Grows upwards to
MAXVA, addressing up to 256 Gigabytes of virtual memory. - Ordered sequentially from zero: user instructions, global variables, user stack, and an expandable heap.
- The trampoline page is mapped at the top of the user address space to facilitate kernel transitions.
- An inaccessible guard page (
PTE_Uflag cleared) sits directly below the user stack to catch stack overflows via hardware page-fault exceptions.
- Grows upwards to
- Dynamic memory allocation (
sbrk):- The
sbrksystem call shrinks or grows a process’s memory. growprocinvokesuvmallocto acquire new physical pages viakallocand maps them usingmappages.uvmdeallocremoves memory by callinguvmunmap, which utilizeswalkto locate PTEs and passes the associated physical addresses back tokfree.- The user page table serves as the definitive kernel record of which physical pages are allocated to a process.
- The
The exact layout of a process’s memory is initially populated by parsing compiled binaries from the file system.
Exec and ELF Binary Loading
- The
execsystem call replaces an address space’s existing memory image with a new executable stored in the Executable and Linkable Format (ELF). - Initialization and parsing steps:
- Validates the file via a 4-byte magic number (
0x7F 'E' 'L' 'F'). - Allocates a blank page table via
proc_pagetable. - Parses ELF program section headers (
struct proghdr) to determine memory sizing and block alignments. - Allocates contiguous virtual memory per segment with
uvmallocand populates the pages directly from the file vialoadseg.
- Validates the file via a 4-byte magic number (
- Stack setup:
- Allocates a single stack page and a protective inaccessible guard page.
- Copies command-line argument strings and pointers to the top of the newly allocated stack, preparing
argcandargvfor the program’smainfunction.
- Security and commitment:
- Verifies that segment virtual addresses and sizes do not mathematically overflow a 64-bit integer, preventing malicious binaries from tricking the kernel into mapping data over kernel space.
- Retains the old address space until the entire new image is successfully built. If an error occurs during parsing or allocation, the partial new image is freed and
execreturns an error, safely preserving the original process state.
Would you like me to generate a set of flashcards to help you review and memorize these virtual memory concepts and RISC-V paging structures?