Page Tables
Page tables are the most popular mechanism through which the operating system provides each process with its own private address space and memory.
Paging Hardware
- RISC-V instructions manipulate virtual addresses, while the machine’s RAM uses physical addresses.
- The Sv39 RISC-V architecture utilizes only the bottom 39 bits of a 64-bit virtual address, ignoring the top 25 bits.
- The page table maps virtual addresses to physical addresses:
  - Logically, it acts as an array of Page Table Entries (PTEs).
  - Each PTE translates a virtual address to a physical address at the granularity of a 4096-byte (2^12 bytes) page.
  - A PTE contains a 44-bit Physical Page Number (PPN) and hardware control flags.
  - The CPU constructs a 56-bit physical address by combining the 44-bit PPN from the PTE with the bottom 12 bits of the original virtual address.
RISC-V virtual and physical addresses, with a simplified logical page table:

- Three-level tree implementation:
  - A page table is stored in physical memory as a three-level tree of 4096-byte pages.
  - The root page contains 512 PTEs pointing to intermediate pages, which point to bottom-level pages containing the final physical mappings.
  - The 27-bit virtual page number is split into three 9-bit sections to index into each of the three levels.
  - This tree structure saves physical memory by omitting entirely unmapped intermediate and bottom-level page directories.
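The bit arithmetic described above can be sketched in a few lines of C. This is a minimal illustration, not xv6's code: the helper names `px`, `pte_ppn`, and `make_pa` are hypothetical, and the PTE layout assumed here (10 flag bits below a 44-bit PPN) follows the Sv39 format.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT 12          /* 4096-byte pages: 12 offset bits */
#define PXMASK  0x1FFULL    /* 9 index bits per level */

/* Index into the page-table page at `level` (2 = root, 0 = bottom). */
uint64_t px(int level, uint64_t va) {
    return (va >> (PGSHIFT + 9 * level)) & PXMASK;
}

/* 44-bit physical page number stored above the 10 flag bits of a PTE. */
uint64_t pte_ppn(uint64_t pte) {
    return (pte >> 10) & ((1ULL << 44) - 1);
}

/* 56-bit physical address: PPN from the PTE joined with the page offset. */
uint64_t make_pa(uint64_t pte, uint64_t va) {
    return (pte_ppn(pte) << PGSHIFT) | (va & 0xFFFULL);
}
```

For example, a virtual address built as `(5 << 30) | (6 << 21) | (7 << 12) | 0x123` selects entry 5 in the root page, entry 6 in the intermediate page, and entry 7 in the bottom-level page, with `0x123` carried through as the page offset.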
RISC-V address translation details:

- Hardware integration:
  - The Translation Look-aside Buffer (TLB) caches PTEs inside the CPU to avoid the cost of loading PTEs from memory on every address translation.
  - The `satp` register holds the physical address of the root page-table page, telling the CPU which page-table tree to use for the code it is currently executing.
- PTE flags control access permissions:
  - `PTE_V`: Indicates the PTE is present and valid.
  - `PTE_R`, `PTE_W`, `PTE_X`: Control read, write, and execute permissions, respectively.
  - `PTE_U`: Allows access by instructions executing in user mode.
  - Note: `PTE_U` is per-page, not per-page-table; xv6 maps pages like the trapframe in a user process’s page table so trap entry/return code can access them, but clears `PTE_U` so user-mode code cannot read or modify kernel-owned state.
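A small sketch of how these flags gate an access. The bit positions match the Sv39 PTE layout; `pte_allows` and `enum access` are hypothetical names, and the model is deliberately simplified: it ignores the extra rules the hardware applies when supervisor code touches pages that have `PTE_U` set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_V (1ULL << 0)   /* valid */
#define PTE_R (1ULL << 1)   /* readable */
#define PTE_W (1ULL << 2)   /* writable */
#define PTE_X (1ULL << 3)   /* executable */
#define PTE_U (1ULL << 4)   /* accessible from user mode */

enum access { ACC_READ, ACC_WRITE, ACC_EXEC };

/* Simplified check: would this leaf PTE permit the access? */
bool pte_allows(uint64_t pte, enum access acc, bool user_mode) {
    if (!(pte & PTE_V))
        return false;               /* invalid PTE: always a fault */
    if (user_mode && !(pte & PTE_U))
        return false;               /* kernel-only page, e.g. the trapframe */
    switch (acc) {
    case ACC_READ:  return (pte & PTE_R) != 0;
    case ACC_WRITE: return (pte & PTE_W) != 0;
    case ACC_EXEC:  return (pte & PTE_X) != 0;
    }
    return false;
}
```

A trapframe-style mapping (`PTE_V | PTE_R | PTE_W`, no `PTE_U`) is readable from the kernel side of this model but rejected for user mode, which is the per-page behavior the note above describes.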
Kernel Address Space
- xv6 uses one page table per process for user space and one shared page table for the kernel.
- The kernel page table gives predictable virtual addresses for RAM and memory-mapped devices.
On the left, xv6’s kernel address space. RWX refer to PTE read, write, and execute permissions. On the right, the RISC-V physical address space that xv6 expects to see:

- Direct mapping architecture:
  - Most physical memory and device registers are mapped at virtual addresses exactly equal to their physical addresses.
  - The kernel binary is located at `KERNBASE` (`0x80000000`) in both virtual and physical memory spaces.
  - In QEMU, RAM starts at `0x80000000` and extends at least to `0x88000000` (`PHYSTOP`, which is `KERNBASE` + 128 MB).
  - Memory-mapped device registers sit below `0x80000000` in physical address space.
  - Direct mapping lets the kernel use physical addresses directly, which simplifies operations such as copying pages during `fork`.
- Exceptions to direct mapping:
  - Trampoline page: mapped twice, once via direct mapping and once at the top of the virtual address space.
  - Kernel stacks: each process has a private kernel stack mapped high in memory.
  - An unmapped guard page below each kernel stack catches overflow (`PTE_V` clear).
- Kernel-space permissions:
  - The trampoline page and kernel text are mapped with `PTE_R | PTE_X`.
  - Other kernel memory is mapped with `PTE_R | PTE_W`.
  - Guard pages are invalid.
Page Table Management Code
- The central data structure for software page table manipulation is `pagetable_t`, a C pointer to a RISC-V root page-table page.
- Core virtual memory lookup functions:
  - `walk`: Mimics the hardware’s 3-level traversal, using 9 bits at a time to descend the tree and return the address of the lowest-level PTE. It can dynamically allocate intermediate pages if requested during the traversal.
  - `mappages`: Installs PTEs for a virtual-to-physical address range by calling `walk` for each page interval and configuring the PPN and permission flags.
- Kernel initialization routines:
  - `kvminit` creates the kernel page table during early boot, mapping the kernel instructions, data, physical memory up to `PHYSTOP`, and device memory.
  - `kvminithart` writes the root page table physical address into the CPU’s `satp` register to enable hardware address translation.
  - The `sfence.vma` instruction is executed immediately after `satp` is modified to flush the CPU’s TLB, preventing stale cached mappings from causing invalid memory accesses.
Physical Memory Allocation
- The kernel manages physical memory between the end of the kernel binary and `PHYSTOP` as a global pool for run-time allocation.
- Memory is allocated and freed strictly in 4096-byte page increments.
- Free pages are tracked using a linked list threaded directly through the available memory pages themselves.
- Allocator implementation:
  - Each free page stores a `struct run` structure containing a pointer to the next free page.
  - The `kfree` function fills freed memory with the garbage value `1` to expose dangling references quickly, then prepends the page to the free list.
  - The `kalloc` function removes and returns the first element from the free list when memory is requested.
  - The free list structure is protected by a spin lock to handle concurrent allocation requests across multiple CPUs.
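The free-list idea can be tried out in user space. The sketch below is an assumption-laden stand-in for the kernel allocator: `arena`, `arena_init`, `page_alloc`, and `page_free` are hypothetical names, the "physical memory" is a four-page static array, and the spin lock and range checks are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PGSIZE 4096
#define NPAGES 4

/* A free page stores only a pointer to the next free page. */
struct run { struct run *next; };

/* Stand-in for physical memory; the union keeps each page suitably aligned. */
union page {
    struct run run;
    unsigned char bytes[PGSIZE];
};

union page arena[NPAGES];
struct run *freelist;

/* Push a page onto the free list after filling it with junk (byte value 1). */
void page_free(void *pa) {
    memset(pa, 1, PGSIZE);
    struct run *r = pa;
    r->next = freelist;
    freelist = r;
}

/* Pop the head of the free list; returns NULL when no pages remain. */
void *page_alloc(void) {
    struct run *r = freelist;
    if (r) {
        freelist = r->next;
        memset((void *)r, 5, PGSIZE);   /* junk fill: callers must not assume zeroes */
    }
    return r;
}

/* Seed the free list by "freeing" every page once. */
void arena_init(void) {
    freelist = NULL;
    for (int i = 0; i < NPAGES; i++)
        page_free(&arena[i]);
}
```

Because pages are pushed and popped at the head, the list behaves like a stack: the page freed most recently is the next one handed out.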
Process Address Space
- Each process possesses an independent page table, dictating a private address space that maps contiguous virtual addresses starting at zero to potentially non-contiguous physical pages.
- Address space layout:
  - The address space starts at zero and can grow up to `MAXVA`, allowing up to 256 gigabytes of virtual memory.
  - Ordered sequentially from zero: user instructions, global variables, user stack, and an expandable heap.
  - The trampoline page is mapped at the top of the user address space to facilitate kernel transitions.
  - An inaccessible guard page (`PTE_U` flag cleared) sits directly below the user stack to catch stack overflows via hardware page-fault exceptions.
- Dynamic memory allocation (`sbrk`):
  - The `sbrk` system call shrinks or grows a process’s memory.
  - `growproc` invokes `uvmalloc` to acquire new physical pages via `kalloc` and maps them using `mappages`.
  - `uvmdealloc` removes memory by calling `uvmunmap`, which uses `walk` to locate PTEs and passes the associated physical addresses back to `kfree`.
  - The user page table serves as the definitive kernel record of which physical pages are allocated to a process.
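Because memory is handed out in whole pages, growing or shrinking by a byte count only touches pages when the size crosses a page boundary. A minimal sketch of that arithmetic, assuming the usual round-up-to-page macro; `pages_to_grow` and `pages_to_shrink` are hypothetical helpers, not xv6 functions.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096ULL
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~(PGSIZE - 1))

/* Pages that must be allocated and mapped to grow from oldsz to newsz. */
uint64_t pages_to_grow(uint64_t oldsz, uint64_t newsz) {
    if (newsz <= oldsz)
        return 0;
    return (PGROUNDUP(newsz) - PGROUNDUP(oldsz)) / PGSIZE;
}

/* Pages that can be unmapped and freed when shrinking from oldsz to newsz. */
uint64_t pages_to_shrink(uint64_t oldsz, uint64_t newsz) {
    if (newsz >= oldsz)
        return 0;
    return (PGROUNDUP(oldsz) - PGROUNDUP(newsz)) / PGSIZE;
}
```

Growing from 100 to 200 bytes needs no new page because both sizes fit in the first page, while growing from 4096 to 4097 bytes needs one.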
A process’s user address space, with its initial stack:

User virtual memory looks contiguous, but physical pages may be scattered anywhere in RAM.
| Virtual region | Physical backing |
|---|---|
| text | Physical RAM pages allocated from the free-memory pool, then filled from the ELF file. |
| data | Physical RAM pages allocated from the free-memory pool. |
| heap | Physical RAM pages allocated from the free-memory pool as the heap grows. |
| stack | Physical RAM pages allocated from the free-memory pool. |
| guard page | One physical page allocated along with the stack; `PTE_U` is cleared so user code cannot access it. |
| trapframe | One per-process physical page allocated from the free-memory pool. |
| trampoline | One shared kernel code page, mapped into every process. |
ELF Binary Loading
- The `exec` system call replaces an address space’s existing memory image with a new executable stored in the Executable and Linkable Format (ELF).
- Initialization and parsing steps:
  - Validates the file via a 4-byte magic number (`0x7F 'E' 'L' 'F'`).
  - Allocates a new page table with no user memory mapped via `proc_pagetable`.
  - Parses ELF program section headers (`struct proghdr`) to determine memory sizing and block alignments.
  - Allocates contiguous virtual memory per segment with `uvmalloc` and populates the pages directly from the file via `loadseg`.
- Stack setup:
  - Allocates a single stack page and a protective inaccessible guard page.
  - Copies command-line argument strings and pointers to the top of the newly allocated stack, preparing `argc` and `argv` for the program’s `main` function.
- Security and commitment:
  - Verifies that segment virtual addresses and sizes do not overflow a 64-bit integer when added, preventing malicious binaries from tricking the kernel into mapping data over kernel space.
  - Retains the old address space until the entire new image is successfully built. If an error occurs during parsing or allocation, the partial new image is freed and `exec` returns an error, safely preserving the original process state.
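The validation steps can be sketched as two small predicates. The struct below is a trimmed, hypothetical stand-in for a program header (only the three fields the checks need), and `elf_magic_ok` and `seg_ok` are illustrative names; the wrap-around test relies on unsigned 64-bit addition wrapping modulo 2^64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PGSIZE 4096ULL

/* True if the first four bytes of the file are 0x7F 'E' 'L' 'F'. */
bool elf_magic_ok(const unsigned char *hdr) {
    static const unsigned char magic[4] = {0x7F, 'E', 'L', 'F'};
    return memcmp(hdr, magic, 4) == 0;
}

/* Hypothetical subset of a program header. */
struct seg {
    uint64_t vaddr;    /* where the segment is loaded */
    uint64_t filesz;   /* bytes read from the file */
    uint64_t memsz;    /* bytes occupied in memory */
};

/* Reject segments whose sizes are inconsistent or whose end wraps around. */
bool seg_ok(const struct seg *s) {
    if (s->memsz < s->filesz)
        return false;                   /* file data would not fit in memory */
    if (s->vaddr + s->memsz < s->vaddr)
        return false;                   /* 64-bit sum wrapped past zero */
    if (s->vaddr % PGSIZE != 0)
        return false;                   /* segment must start on a page boundary */
    return true;
}
```

A segment with `vaddr = 0xFFFFFFFFFFFFF000` and `memsz = 0x2000` is rejected: the sum wraps to `0x1000`, which is smaller than `vaddr`.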
kinit
Sets up the physical page allocator.
- `kinit` initializes `kmem.lock`, the spinlock for the allocator freelist
- `freerange(end, PHYSTOP)` adds usable physical pages to the freelist
- `end` marks the first address after the kernel image in memory
- `PHYSTOP` marks the top of physical memory xv6 is allowed to use
- `freerange` rounds the start address up to the next page boundary
- Each 4096-byte page in the range is passed to `kfree`
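A rough sketch of the range walk, counting pages instead of freeing them. `freerange_pages` is a hypothetical helper and the addresses in the example are made up; only whole pages that fit entirely below the end address are counted.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096ULL
#define PGROUNDUP(a) (((a) + PGSIZE - 1) & ~(PGSIZE - 1))

/* Count the whole pages in [start, stop) after rounding start up to a page. */
uint64_t freerange_pages(uint64_t start, uint64_t stop) {
    uint64_t n = 0;
    for (uint64_t p = PGROUNDUP(start); p + PGSIZE <= stop; p += PGSIZE)
        n++;    /* the kernel would hand this page to kfree */
    return n;
}
```

With a start of `0x80021234` and an end of `0x80024000`, the start rounds up to `0x80022000` and two pages are counted; the partial page holding the end of the kernel image is skipped.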
`kmem` is the allocator’s global state.
- `kmem.lock` protects allocator state from concurrent CPU access
- `kmem.freelist` points to the linked list of currently free physical pages
- Every `kalloc` and `kfree` call updates `kmem.freelist` while holding `kmem.lock`
The allocator stores free pages in a linked list.

`kfree`:
- Returns one physical page to the allocator.
- Requires a page-aligned address.
- Rejects addresses below `end` or at/above `PHYSTOP`.
- Invalid pages cause `panic("kfree")`.
- Fills the page with the byte value `1` to expose dangling references.
- Reuses the page itself as a `struct run` freelist node.
- Pushes the page onto `kmem.freelist` while holding `kmem.lock`.

`kalloc`:
- Removes one physical page from the allocator.
- Pops the current `kmem.freelist` head while holding `kmem.lock`.
- Returns `0` if the freelist is empty.
- Fills the allocated page with the byte value `5` before returning it.
- Returns a kernel-usable pointer to the page.
- Callers must not assume allocated pages are zeroed.
After this, the kernel can allocate whole physical pages for page tables, kernel stacks, user memory, pipe buffers, and disk structures.
kvminit
Builds the kernel page table, but does not turn paging on yet.
- `kvminit` sets global `kernel_pagetable` to the root page from `kvmmake()`
- All harts use `kernel_pagetable` while running kernel code
- The root page-table page is cleared with `memset`
- `kvmmap` calls `mappages` to install page-aligned mappings
- `mappages` gets each level-0 PTE with `walk(..., alloc=1)` and writes `PA2PTE(pa) | perm | PTE_V`
- `mappages` panics on remap and returns `-1` on allocation failure
- RISC-V Sv39 page tables have three levels, so a single virtual address may require multiple page-table pages
Most mappings are direct: virtual address equals physical address.
| Virtual | Physical | Pages | Perm | Purpose |
|---|---|---|---|---|
| `UART0` | `UART0` | 1 | R W | Console UART registers |
| `VIRTIO0` | `VIRTIO0` | 1 | R W | Disk MMIO registers |
| `PLIC` range | Same | `0x4000000 / PGSIZE` | R W | Interrupt controller |
| `KERNBASE` to `etext` | Same | `(etext - KERNBASE) / PGSIZE` | R X | Kernel text |
| `etext` to `PHYSTOP` | Same | `(PHYSTOP - etext) / PGSIZE` | R W | Kernel data and RAM |
| `TRAMPOLINE` | `trampoline.S` page | 1 | R X | Trap entry and return |
| `KSTACK(p)` | New page | 1 per process slot | R W | Process kernel stack |
| Stack guard page | Unmapped | 1 per process slot | None | Overflow trap |
`TRAMPOLINE`
- Is not direct-mapped.
- Is a high virtual address that maps to the physical page containing the trampoline code.
- xv6 uses the same high virtual address in both kernel and user page tables, so trap entry and trap return can run while switching page tables.

`KSTACK(p)`
- Gives each process slot a kernel stack.
- Is used when that process is running in the kernel after a syscall, interrupt, or exception.
- A guard page is left unmapped next to each stack, so stack overflow causes a fault instead of overwriting another stack.
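The stack and guard addresses can be worked out with plain arithmetic. The macros below follow the definitions in xv6's `memlayout.h` and `riscv.h` as best recalled, so treat them as an assumption to check against the source; `kstack_guard` is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096ULL
#define MAXVA (1ULL << (9 + 9 + 9 + 12 - 1))   /* one bit below the Sv39 limit */
#define TRAMPOLINE (MAXVA - PGSIZE)             /* highest page of the address space */

/* Each process slot gets two pages below the trampoline: a stack page and a gap. */
#define KSTACK(p) (TRAMPOLINE - ((uint64_t)(p) + 1) * 2 * PGSIZE)

/* The unmapped page just below slot p's stack, where an overflow would land. */
uint64_t kstack_guard(int p) {
    return KSTACK(p) - PGSIZE;
}
```

For slot 0 the stack page starts at `0x3FFFFFD000`, the page below it (`0x3FFFFFC000`) is left unmapped, and slot 1's stack starts one page lower at `0x3FFFFFB000`.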
Page-table mechanics:

`walk`
- Follows the Sv39 page-table tree for one virtual address.
- Returns the level-0 PTE for that address.
- Allocates missing intermediate page-table pages when `alloc` is set.

`mappages`
- Creates PTEs for a virtual-to-physical range.
- Uses `walk` to find each level-0 PTE.
- Writes physical address, permission bits, and `PTE_V` into each PTE.

`walkaddr`
- Translates a user virtual address to a physical address.
- Requires the page to be valid and user-accessible.
- Returns `0` if the address is invalid or unmapped.

`uvmunmap`
- Removes mappings from a virtual range.
- Can also free the mapped physical pages.
- Leaves page-table pages themselves for `freewalk`.

`freewalk`
- Recursively frees page-table pages.
- Assumes all leaf mappings are already removed.
- Panics if it finds a still-mapped leaf PTE.

`ismapped`
- Checks whether a virtual address has a valid PTE.
- Used before allocating lazy pages.
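The traversal that `walk` performs can be imitated in user space. This is a sketch under stated assumptions rather than the kernel's code: host pointers from `aligned_alloc` stand in for physical addresses, `new_table` is a hypothetical helper, and the `MAXVA` bounds check is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PGSIZE 4096
#define PTE_V 1ULL
#define PX(level, va) ((((uint64_t)(va)) >> (12 + 9 * (level))) & 0x1FF)
#define PA2PTE(pa) ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte) (((pte) >> 10) << 12)

typedef uint64_t pte_t;
typedef pte_t *pagetable_t;   /* one page holding 512 PTEs */

/* Allocate one zeroed, page-aligned page-table page. */
pagetable_t new_table(void) {
    void *p = aligned_alloc(PGSIZE, PGSIZE);
    if (p)
        memset(p, 0, PGSIZE);
    return p;
}

/* Return the level-0 PTE slot for va, creating lower tables when alloc != 0. */
pte_t *walk(pagetable_t pt, uint64_t va, int alloc) {
    for (int level = 2; level > 0; level--) {
        pte_t *pte = &pt[PX(level, va)];
        if (*pte & PTE_V) {
            pt = (pagetable_t)(uintptr_t)PTE2PA(*pte);   /* descend one level */
        } else {
            if (!alloc || (pt = new_table()) == NULL)
                return NULL;                             /* hole, or out of memory */
            *pte = PA2PTE((uintptr_t)pt) | PTE_V;        /* link the new table in */
        }
    }
    return &pt[PX(0, va)];
}
```

Mapping a page then amounts to calling `walk` with `alloc` set and storing `PA2PTE(pa) | perm | PTE_V` in the returned slot; a later lookup with `alloc` clear finds the same slot without allocating.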
User address spaces:
- `uvmcreate`: allocates an empty root page table for a process.
- `uvmalloc`: allocates physical pages and maps them as user memory grows.
- `uvmdealloc`: shrinks user memory by unmapping pages above the new size.
- `uvmfree`: frees mapped user memory, then frees the page-table tree.
- `uvmcopy`: copies mapped pages from parent to child during `fork`.
- `uvmclear`: clears `PTE_U` on one page so user code cannot access it.

User/kernel boundary:
- `copyin`: copies bytes from user virtual memory into kernel memory.
- `copyout`: copies bytes from kernel memory into user virtual memory.
- `copyinstr`: copies a null-terminated user string into kernel memory.
- `vmfault`: allocates and maps a missing lazy-allocation page on demand.
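Since each user page may sit on a different physical page, the copy helpers have to translate and copy one page at a time. The sketch below shows only that chunking step; `copy_chunks` is a hypothetical function that records chunk sizes instead of moving bytes, and it assumes the per-page loop works the way described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096ULL
#define PGROUNDDOWN(a) ((a) & ~(PGSIZE - 1))

/* Split [va, va + len) at page boundaries; store up to max chunk sizes. */
size_t copy_chunks(uint64_t va, uint64_t len, uint64_t *sizes, size_t max) {
    size_t count = 0;
    while (len > 0 && count < max) {
        uint64_t va0 = PGROUNDDOWN(va);      /* page holding the current address */
        uint64_t n = PGSIZE - (va - va0);    /* bytes left in that page */
        if (n > len)
            n = len;
        sizes[count++] = n;                  /* a real copy would translate va0 here */
        len -= n;
        va = va0 + PGSIZE;                   /* continue at the next page */
    }
    return count;
}
```

A 32-byte copy starting at `0x1FF0` is split into two 16-byte chunks because it straddles the boundary at `0x2000`, so two separate translations are needed.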
After this, kernel_pagetable describes the kernel virtual address space. The CPU still uses physical addresses until kvminithart writes this page table into satp.
kvminithart
Turns on paging for the current hart.
- `kvminit` builds `kernel_pagetable` once
- CPU 0 calls `kvminithart` after `kvminit`
- Secondary harts call `kvminithart` after CPU 0 finishes global initialization
- Every hart has its own `satp` register, so every hart must run this setup
- The current hart runs `sfence_vma` before changing page tables
- `MAKE_SATP(kernel_pagetable)` prepares the `satp` value
- `w_satp(...)` writes that value into `satp`
- The current hart runs `sfence_vma` again after the switch
`satp` is the supervisor page-table register.
- `satp` tells this hart which page table to use for address translation.
- `MAKE_SATP(pagetable)` selects Sv39 mode and stores the root page-table physical page number.
- `w_satp(...)` writes the prepared value into the hart’s `satp` register.
- After this write, addresses are translated through `kernel_pagetable`.

`sfence_vma` flushes address-translation state.
- Flushes stale TLB entries on the current hart.
- `sfence.vma zero, zero` means flush all address translations.
- The first call orders earlier page-table writes before the switch.
- The second call removes translations cached before the switch.
After this, the hart uses the kernel virtual address space built by kvminit. Direct-mapped kernel addresses keep working because most kernel virtual addresses equal their physical addresses.
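The value written to `satp` can be computed without touching the register. The sketch below assumes the Sv39 encoding (mode value 8 in the top four bits, root page number in the low 44 bits); `make_satp`, `satp_mode`, and `satp_root_pa` are hypothetical helpers, and the actual `w_satp` and `sfence.vma` steps are privileged instructions that cannot run in an ordinary user program.

```c
#include <assert.h>
#include <stdint.h>

#define SATP_SV39 (8ULL << 60)   /* MODE field = 8 selects Sv39 */

/* Combine the Sv39 mode with the root page table's physical page number. */
uint64_t make_satp(uint64_t root_pa) {
    return SATP_SV39 | (root_pa >> 12);
}

/* Top four bits: translation mode. */
uint64_t satp_mode(uint64_t satp) {
    return satp >> 60;
}

/* Low 44 bits: root page number, shifted back into a physical address. */
uint64_t satp_root_pa(uint64_t satp) {
    return (satp & ((1ULL << 44) - 1)) << 12;
}
```

For a root page at physical address `0x87FFF000` (a made-up example), the register value is `0x8000000000087FFF`.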
Memory Subsystem
Overall Memory Architecture
--- config: layout: dagre --- flowchart LR BOOT["boot / kernel VM setup<br><br>kinit<br>kvminit<br>proc_mapstacks<br>kvminithart"] -- initializes --> PMA["physical page allocator<br><br>kalloc<br>kfree<br>kmem.freelist"] BOOT -- builds and enables --> KVM["kernel virtual memory<br><br>kernel_pagetable<br>direct map<br>kernel stacks<br>trampoline"] PMA -- "supplies 4096-byte pages to" --> KVM & PROCVM["process virtual memory<br><br>p->pagetable<br>kexec<br>kfork<br>sys_sbrk / growproc<br>p->sz"] PROCVM -- uses --> VMAPI["VM helper functions<br><br>walk<br>mappages<br>uvmalloc<br>uvmunmap<br>copyin / copyout<br>vmfault"] VMAPI -- creates / edits mappings to --> PHYS["physical memory + MMIO<br><br>RAM<br>page-table pages<br>user pages<br>trapframe page<br>trampoline page<br>UART / VIRTIO / PLIC"] HW["RISC-V translation backend<br><br>satp<br>TLB<br>Sv39 hardware walker<br>page fault"] -- uses active page table to access --> PHYS KVM -- selected in kernel mode by satp --> HW PROCVM -- selected in user mode by satp --> HW BOOT:::iface PMA:::iface KVM:::source PROCVM:::source VMAPI:::iface HW:::backend PHYS:::source classDef process fill:#F3EFE2,stroke:#111,stroke-width:2px,color:#111 classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111 classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111 classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111 classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111
Physical Page Allocator
--- config: layout: dagre --- flowchart LR subgraph INIT["allocator initialization"] direction TB KINIT["kinit<br><br>starts physical allocator setup"] FREERANGE["freerange<br><br>adds pages from end to PHYSTOP"] end subgraph STATE["allocator state"] direction TB KMEM["kmem<br><br>spinlock lock<br>freelist"] RUN["struct run<br><br>next pointer stored inside free page"] FREEPAGES["free 4096-byte physical pages<br><br>user pages<br>page-table pages<br>kernel stacks<br>trapframes<br>pipe buffers"] end subgraph API["allocator functions"] direction TB KFREE["kfree<br><br>returns a physical page to freelist"] KALLOC["kalloc<br><br>removes one physical page from freelist"] end KINIT --> FREERANGE FREERANGE -- calls repeatedly --> KFREE KFREE -- protects and updates --> KMEM KMEM -- freelist nodes are --> RUN RUN -- each node represents one --> FREEPAGES KALLOC -- protects and updates --> KMEM KMEM -- hands out --> KALLOC KALLOC -- returns page to --> USERS["main users of kalloc<br><br>kernel page table<br>user memory<br>page-table pages<br>kernel stacks<br>trapframes<br>pipe buffers"] KINIT:::iface FREERANGE:::iface KMEM:::source RUN:::file FREEPAGES:::source KFREE:::iface KALLOC:::iface USERS:::process classDef process fill:#F3EFE2,stroke:#111,stroke-width:2px,color:#111 classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111 classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111 classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111 classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111
Kernel Virtual Memory
--- config: layout: dagre --- flowchart LR classDef process fill:#F3EFE2,stroke:#111,stroke-width:2px,color:#111 classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111 classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111 classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111 classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111 subgraph BUILD["kernel VM construction: software builds mappings"] direction TB KVMINIT["kvminit<br/><br/>starts kernel VM setup"]:::iface KVMMAKE["kvmmake<br/><br/>allocates root page table<br/>creates kernel mappings"]:::iface KVMMAP["kvmmap<br/><br/>kernel wrapper for mapping ranges"]:::iface MAPPAGES["mappages<br/><br/>install VA to PA mappings"]:::iface WALK["walk<br/><br/>software page-table walk<br/>finds or creates lower-level tables"]:::iface MAPSTACKS["proc_mapstacks<br/><br/>maps per-process kernel stacks<br/>leaves guard pages invalid"]:::iface KVMHART["kvminithart<br/><br/>sfence.vma<br/>w_satp(MAKE_SATP(kernel_pagetable))<br/>sfence.vma"]:::iface end subgraph KSTATE["kernel page-table state"] direction TB KPAGETABLE["kernel_pagetable<br/><br/>shared kernel address space"]:::source KPTPAGE["kernel page-table pages<br/><br/>4096-byte pages<br/>512 PTEs each"]:::file KPTE["kernel PTEs<br/><br/>PTE_V<br/>PTE_R / PTE_W / PTE_X<br/>normally no PTE_U"]:::file end subgraph VAMAP["virtual to physical address map"] direction TB subgraph DIRECT_PAIR["direct map"] direction LR DIRECTMAP["direct map region<br/><br/>kernel VA = physical address<br/>RAM + device MMIO<br/>permissions split by region"]:::source RAM["physical RAM<br/><br/>KERNBASE = 0x80000000<br/>PHYSTOP = KERNBASE + 128MB"]:::source MMIO["memory-mapped device registers<br/><br/>UART0<br/>VIRTIO0<br/>PLIC"]:::backend end subgraph TEXT_PAIR["kernel text map"] direction LR KTEXT["kernel text<br/><br/>readable + executable<br/>PTE_R | PTE_X"]:::source KERNELIMG["kernel image in RAM<br/><br/>entry.S<br/>kernel text<br/>kernel 
data<br/>end symbol"]:::source end subgraph DATA_PAIR["kernel data map"] direction LR KDATA["kernel data + usable RAM<br/><br/>readable + writable<br/>PTE_R | PTE_W"]:::source RAMDATA["physical RAM<br/><br/>usable RAM region"]:::source end subgraph STACK_PAIR["kernel stack map"] direction LR KSTACKS["kernel stacks<br/><br/>one stack per process<br/>mapped high<br/>invalid guard page below"]:::source STACKPAGES["kernel stack physical pages<br/><br/>allocated by kalloc"]:::source end subgraph TRAMP_PAIR["trampoline map"] direction LR KTRAMP["TRAMPOLINE<br/><br/>trap entry / return code<br/>same VA in kernel and user page tables"]:::source TRAMPPAGE["trampoline physical page<br/><br/>trampoline.S code"]:::file end PGTBLPAGES["kernel page-table physical pages<br/><br/>allocated by kalloc"]:::source end subgraph ALLOC["physical page allocator functions"] direction TB KALLOC["kalloc / kfree<br/><br/>allocate or release 4096-byte physical pages"]:::iface end subgraph RUNTIME["kernel-mode runtime: hardware uses mappings"] direction TB KERNELCODE["kernel code<br/><br/>load / store / fetch<br/>using kernel virtual addresses"]:::process CPU["RISC-V CPU"]:::process SATP["satp CSR<br/><br/>active root page table"]:::source TLB["TLB<br/><br/>cached VA to PA translations"]:::backend HWALKER["Sv39 hardware page-table walker<br/><br/>walks kernel_pagetable on TLB miss"]:::backend ACCESS["physical memory / MMIO access"]:::source FAULT["kernel page fault<br/><br/>invalid mapping or bad permission"]:::backend end KVMINIT --> KVMMAKE KVMMAKE -->|"allocates root using"| KALLOC KVMMAKE --> KVMMAP KVMMAKE --> MAPSTACKS KVMMAP --> MAPPAGES MAPPAGES --> WALK WALK -->|"writes / finds PTEs in"| KPAGETABLE WALK -->|"may allocate lower-level tables via"| KALLOC KALLOC -->|"returns"| PGTBLPAGES MAPSTACKS -->|"allocates stack pages via"| KALLOC MAPSTACKS -->|"uses"| KVMMAP KVMHART -->|"loads root into"| SATP SATP -->|"selects"| KPAGETABLE KPAGETABLE -->|"contains"| KPTPAGE KPTPAGE -->|"contains"| 
KPTE KPAGETABLE -->|"describes"| DIRECTMAP KPAGETABLE -->|"describes"| KTEXT KPAGETABLE -->|"describes"| KDATA KPAGETABLE -->|"describes"| KSTACKS KPAGETABLE -->|"describes"| KTRAMP DIRECTMAP -->|"maps RAM"| RAM DIRECTMAP -->|"maps MMIO"| MMIO KTEXT -->|"maps"| KERNELIMG KDATA -->|"maps"| RAMDATA KSTACKS -->|"maps"| STACKPAGES KTRAMP -->|"maps"| TRAMPPAGE KERNELCODE --> CPU CPU -->|"uses"| SATP CPU -->|"checks first"| TLB TLB -->|"on miss"| HWALKER HWALKER -->|"reads"| KPTPAGE HWALKER -->|"valid PTE"| ACCESS HWALKER -->|"invalid / bad permission"| FAULT
Process Virtual Memory Construction
--- config: layout: dagre --- flowchart LR classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111 classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111 classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111 subgraph CREATE["process VM creation / initialization entry points"] direction TB PROCPGTBL["proc_pagetable<br/><br/>creates per-process user page table"]:::iface EXEC["kexec<br/><br/>builds fresh program image<br/>from ELF file"]:::iface FORK["kfork<br/><br/>creates child process<br/>copies parent address space"]:::iface end subgraph VMAPI["page-table construction"] direction TB UVMCREATE["uvmcreate<br/><br/>create empty user page table"]:::iface UVMALLOC["uvmalloc<br/><br/>allocate and map user memory"]:::iface UVMCOPY["uvmcopy<br/><br/>copy mapped parent pages<br/>skip absent lazy holes"]:::iface LOADSEG["loadseg<br/><br/>load ELF segment bytes<br/>into allocated physical pages"]:::iface FLAGS2PERM["flags2perm<br/><br/>convert ELF flags<br/>into extra PTE_X / PTE_W permissions"]:::iface UVMCLEAR["uvmclear<br/><br/>clear PTE_U<br/>creates inaccessible guard page"]:::iface MAPPAGES["mappages<br/><br/>install VA to PA mappings"]:::iface WALK["walk<br/><br/>software page-table walk<br/>finds or creates PTE location"]:::iface end subgraph USTATE["new process VM state"] direction TB UPAGETABLE["p->pagetable<br/><br/>per-process user address space"]:::source PROCSZ["p->sz<br/><br/>declared user memory size<br/>initialized by exec<br/>copied by fork<br/>later changed by sbrk"]:::source UPTPAGE["user page-table pages<br/><br/>root / intermediate / leaf levels"]:::file UPTE["user PTEs<br/><br/>PTE_V<br/>PTE_R / PTE_W / PTE_X<br/>PTE_U when user-accessible"]:::file end subgraph MAPBOX["virtual to physical address map"] direction TB VAMAP["initial user VA to PA map<br/><br/>text, data, bss, stack, guard page,<br/>TRAPFRAME, and TRAMPOLINE"]:::source end subgraph ALLOC["physical page allocator functions"] direction TB KALLOC["kalloc 
/ kfree<br/><br/>allocate or release 4096-byte physical pages"]:::iface end PROCPGTBL -->|"uses"| UVMCREATE UVMCREATE -->|"allocates root page-table page via"| KALLOC UVMCREATE -->|"creates"| UPAGETABLE PROCPGTBL -->|"describes TRAMPOLINE mapping to trampoline.S page"| VAMAP PROCPGTBL -->|"describes TRAPFRAME mapping to p->trapframe page"| VAMAP PROCPGTBL -->|"installs special mappings using"| MAPPAGES EXEC -->|"creates fresh page table with"| PROCPGTBL EXEC -->|"allocates text, data, bss, and stack pages with"| UVMALLOC EXEC -->|"loads ELF bytes into mapped physical pages with"| LOADSEG EXEC -->|"sets text/data permissions with"| FLAGS2PERM EXEC -->|"describes initial text, data, bss, and stack mappings"| VAMAP EXEC -->|"sets final process size"| PROCSZ EXEC -->|"creates stack guard page with"| UVMCLEAR FORK -->|"creates child page table with"| PROCPGTBL FORK -->|"copies parent mapped pages through"| UVMCOPY FORK -->|"copies parent logical size into child"| PROCSZ UVMCOPY -->|"allocates child physical pages via"| KALLOC UVMCOPY -->|"copies parent text, data, heap, and stack mappings into child"| VAMAP UVMCOPY -->|"leaves absent lazy heap holes absent; p->sz keeps them logically valid"| PROCSZ UVMCOPY -->|"maps copied pages with"| MAPPAGES UVMALLOC -->|"gets physical pages from"| KALLOC UVMALLOC -->|"adds text, data, bss, or stack mappings to"| VAMAP UVMALLOC -->|"maps allocated pages with"| MAPPAGES LOADSEG -->|"fills physical pages behind text and data mappings"| VAMAP FLAGS2PERM -->|"adds ELF-derived PTE_X / PTE_W bits; uvmalloc adds PTE_R and PTE_U"| VAMAP UVMCLEAR -->|"describes stack guard page by clearing PTE_U"| VAMAP MAPPAGES -->|"uses"| WALK WALK -->|"writes or finds PTEs in"| UPAGETABLE WALK -->|"may allocate page-table pages via"| KALLOC WALK -->|"creates PTE path for this VA-to-PA map"| VAMAP UPAGETABLE -->|"contains"| UPTPAGE UPTPAGE -->|"contains"| UPTE UPTE -->|"leaf PTEs describe user VA to physical page mappings"| VAMAP UPTE -->|"non-leaf PTEs 
point to lower-level page-table pages"| UPTPAGE UPAGETABLE -->|"describes complete initial user virtual-to-physical map"| VAMAP KALLOC -->|"allocates physical backing pages for mapped user pages"| VAMAP KALLOC -->|"allocates user page-table pages"| UPTPAGE
User Virtual Address Translation
---
config:
  layout: dagre
---
flowchart LR
  classDef process fill:#F3EFE2,stroke:#111,stroke-width:2px,color:#111
  classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111
  classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111
  classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111
  classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111

  subgraph USTATE["running user VM state"]
    direction TB
    UPAGETABLE["p->pagetable<br/><br/>active user address space"]:::source
    PROCSZ["p->sz<br/><br/>declared valid user memory size<br/>used by vmfault"]:::source
    UPTPAGE["user page-table pages<br/><br/>root / intermediate / leaf levels"]:::file
    UPTE["user PTEs<br/><br/>PTE_V<br/>PTE_R / PTE_W / PTE_X<br/>PTE_U"]:::file
  end

  subgraph MAPBOX["virtual to physical address map"]
    direction TB
    VAMAP["runtime user VA to PA map<br/><br/>text, data, heap, stack,<br/>valid mappings, and lazy holes"]:::source
  end

  subgraph HWPATH["normal user-mode hardware translation"]
    direction TB
    USERACCESS["user instruction<br/><br/>load / store / fetch<br/>uses virtual address"]:::process
    CPU["RISC-V CPU"]:::process
    SATPUSER["satp CSR<br/><br/>points to p->pagetable<br/>while running user code"]:::source
    TLB["TLB<br/><br/>cached user VA to PA translations"]:::backend
    HWALKER["Sv39 hardware page-table walker<br/><br/>walks p->pagetable<br/>on TLB miss"]:::backend
    PTECHECK["PTE check<br/><br/>valid bit<br/>permission bits<br/>user-access bit"]:::backend
    ACCESSOK["physical page access<br/><br/>read / write / execute succeeds"]:::source
    PAGEFAULT["user page fault<br/><br/>invalid PTE<br/>bad permission<br/>missing lazy page"]:::backend
  end

  subgraph FAULTPATH["fault resolution / lazy allocation path"]
    direction TB
    USERTRAP["usertrap<br/><br/>handles user-mode exception<br/>load/store fault scause 13 or 15"]:::process
    VMFAULT["vmfault<br/><br/>accepts VA below p->sz<br/>rounds down<br/>requires currently unmapped page"]:::iface
    KALLOC["kalloc<br/><br/>allocate one 4096-byte physical page"]:::iface
    ZERO["zero-filled physical page<br/><br/>new backing page for lazy VA"]:::source
    MAPPAGES["mappages<br/><br/>install new VA to PA mapping"]:::iface
    WALK["walk<br/><br/>software page-table walk<br/>finds or creates PTE location"]:::iface
    RESUME["resume user process<br/><br/>faulting instruction can be retried"]:::process
    VMFAULTFAIL["vmfault returns 0<br/><br/>VA >= p->sz<br/>already mapped page<br/>allocation or mapping failure"]:::backend
    KILL["kill process<br/><br/>usertrap handles failed user fault"]:::process
  end

  subgraph COPYPATH["kernel access to user buffers"]
    direction TB
    SYSCALL["system call path<br/><br/>user passes pointer to kernel"]:::process
    COPY["copyin / copyout<br/><br/>safe kernel-user copy helpers<br/>may allocate lazy pages"]:::iface
    COPYINSTR["copyinstr<br/><br/>copy null-terminated user string<br/>does not allocate lazy pages"]:::iface
    WALKADDR["walkaddr<br/><br/>copyin / copyout translation<br/>requires valid PTE and PTE_U"]:::iface
    WALKADDRSTR["walkaddr<br/><br/>copyinstr translation<br/>requires valid PTE and PTE_U"]:::iface
    WRITECHK["copyout write check<br/><br/>requires PTE_W<br/>rejects read-only user text"]:::backend
    MEMMOVE["copy bytes after VA becomes PA<br/><br/>copyin / copyout use memmove<br/>copyinstr uses byte loop"]:::process
    COPYFAIL["copy helper returns -1<br/><br/>missing page<br/>copyout non-writable PTE<br/>invalid address"]:::backend
  end

  HEAPPOLICY["lazy heap holes are created by sbrk policy<br/><br/>detailed in heap allocation and growth"]:::iface

  UPAGETABLE -->|"contains"| UPTPAGE
  UPTPAGE -->|"contains"| UPTE
  UPTE -->|"leaf PTEs describe valid text, data, heap, and stack mappings"| VAMAP
  UPTE -->|"missing or invalid leaf PTEs mark lazy holes that may fault"| VAMAP
  UPAGETABLE -->|"describes complete runtime user VA-to-PA map"| VAMAP
  HEAPPOLICY -->|"creates logical heap range with missing PTEs"| VAMAP
  HEAPPOLICY -->|"updates valid boundary"| PROCSZ

  USERACCESS --> CPU
  CPU -->|"uses active root from"| SATPUSER
  SATPUSER -->|"selects"| UPAGETABLE
  CPU -->|"checks cached translation in"| TLB
  TLB -->|"cache hit gives PA from"| VAMAP
  TLB -->|"on TLB miss"| HWALKER
  HWALKER -->|"reads page-table pages from"| UPTPAGE
  HWALKER --> PTECHECK
  PTECHECK -->|"valid and permitted PTE translates VA through"| VAMAP
  PTECHECK -->|"access allowed"| ACCESSOK
  ACCESSOK -->|"accesses physical backing described by"| VAMAP
  PTECHECK -->|"invalid PTE or bad permission"| PAGEFAULT

  PAGEFAULT --> USERTRAP
  USERTRAP -->|"load/store fault calls"| VMFAULT
  VMFAULT -->|"checks faulting VA against"| PROCSZ
  VMFAULT -->|"VA below p->sz and unmapped"| KALLOC
  VMFAULT -->|"VA >= p->sz or already mapped"| VMFAULTFAIL
  KALLOC -->|"allocation fails"| VMFAULTFAIL
  KALLOC -->|"allocation succeeds"| ZERO
  ZERO -->|"new physical backing page for"| VAMAP
  ZERO -->|"mapped by"| MAPPAGES
  MAPPAGES -->|"uses"| WALK
  WALK -->|"writes or finds PTE in"| UPAGETABLE
  WALK -->|"updates user PTEs"| UPTE
  MAPPAGES -->|"installs new valid VA-to-PA mapping into"| VAMAP
  MAPPAGES -->|"mapping failure"| VMFAULTFAIL
  MAPPAGES -->|"usertrap path"| RESUME
  MAPPAGES -->|"vmfault returns new PA to copyin path"| MEMMOVE
  MAPPAGES -->|"vmfault returns new PA before copyout PTE check"| WRITECHK
  VMFAULTFAIL -->|"usertrap path"| KILL
  VMFAULTFAIL -->|"copyin / copyout path"| COPYFAIL

  SYSCALL --> COPY
  SYSCALL --> COPYINSTR
  COPY -->|"translates user VA using"| WALKADDR
  COPYINSTR -->|"translates user VA using"| WALKADDRSTR
  WALKADDR -->|"walks"| UPAGETABLE
  WALKADDR -->|"reads"| UPTE
  WALKADDR -->|"valid mapping returns physical address from"| VAMAP
  WALKADDR -->|"missing page returns 0; copyin / copyout then call"| VMFAULT
  WALKADDRSTR -->|"walks"| UPAGETABLE
  WALKADDRSTR -->|"reads"| UPTE
  WALKADDRSTR -->|"valid mapping returns physical address from"| VAMAP
  WALKADDRSTR -->|"missing page makes copyinstr fail"| COPYFAIL
  COPY -->|"copyout verifies destination PTE"| WRITECHK
  WRITECHK -->|"writable mapping"| MEMMOVE
  WRITECHK -->|"not writable"| COPYFAIL
  COPY -->|"copyin copies user bytes through"| MEMMOVE
  COPYINSTR -->|"valid mapped string page copies through"| MEMMOVE
  MEMMOVE -->|"reads or writes physical backing described by"| VAMAP
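The walk and lazy-fault steps in the diagram can be sketched as a small user-space model. This is not the xv6 source: host pointers stand in for physical addresses, `alloc_zero_page` is a hypothetical stand-in for `kalloc` followed by zeroing, and `vmfault_model` takes `sz` as a parameter instead of reading `p->sz`. The index math follows Sv39 (three levels, 9 index bits each, 12 offset bits).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PGSIZE 4096u
#define PTE_V (1u << 0)
#define PTE_R (1u << 1)
#define PTE_W (1u << 2)
#define PTE_U (1u << 4)
/* Sv39: 9-bit table index for the given level of a virtual address. */
#define PX(level, va) ((((uint64_t)(va)) >> (12 + 9 * (level))) & 0x1FF)
#define PGROUNDDOWN(a) ((a) & ~(uint64_t)(PGSIZE - 1))

typedef uint64_t pte_t;

/* Model only: host pointers play the role of physical addresses. */
static pte_t pa2pte(void *pa) { return ((uint64_t)(uintptr_t)pa >> 12) << 10; }
static void *pte2pa(pte_t pte) { return (void *)(uintptr_t)((pte >> 10) << 12); }

/* Hypothetical stand-in for kalloc plus memset: one zeroed, aligned page. */
static void *alloc_zero_page(void) {
  void *p = aligned_alloc(PGSIZE, PGSIZE);
  if (p != NULL)
    memset(p, 0, PGSIZE);
  return p;
}

/* Software walk: return the level-0 PTE slot for va.
   With alloc set, missing lower-level tables are created on the way down. */
static pte_t *walk(pte_t *root, uint64_t va, int alloc) {
  pte_t *table = root;
  for (int level = 2; level > 0; level--) {
    pte_t *pte = &table[PX(level, va)];
    if (*pte & PTE_V) {
      table = pte2pa(*pte);
    } else {
      if (!alloc || (table = alloc_zero_page()) == NULL)
        return NULL;
      *pte = pa2pte(table) | PTE_V;
    }
  }
  return &table[PX(0, va)];
}

/* Lazy-fault policy: returns the new backing page, or NULL if the fault is fatal. */
static void *vmfault_model(pte_t *root, uint64_t sz, uint64_t va) {
  if (va >= sz)
    return NULL;                      /* outside declared user memory */
  va = PGROUNDDOWN(va);
  pte_t *pte = walk(root, va, 0);
  if (pte != NULL && (*pte & PTE_V))
    return NULL;                      /* already mapped: not a lazy hole */
  void *mem = alloc_zero_page();
  if (mem == NULL)
    return NULL;                      /* allocation failure */
  pte = walk(root, va, 1);
  if (pte == NULL) {                  /* mapping failure: give the page back */
    free(mem);
    return NULL;
  }
  assert((*pte & PTE_V) == 0);
  *pte = pa2pte(mem) | PTE_R | PTE_W | PTE_U | PTE_V;
  return mem;
}
```

In this model, a first call such as `vmfault_model(root, 0x4000, 0x2abc)` installs a zeroed page at VA `0x2000`; a second fault on the same page returns NULL, which corresponds to the "already mapped page" failure branch.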
Heap Allocation and Growth
flowchart LR
  classDef process fill:#F3EFE2,stroke:#111,stroke-width:2px,color:#111
  classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111
  classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111
  classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111
  classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111

  subgraph HEAPSTATE["heap-related process VM state"]
    direction TB
    UPAGETABLE["p->pagetable<br/><br/>user page table edited by heap operations"]:::source
    PROCSZ["p->sz<br/><br/>declared user memory size<br/>heap validity boundary"]:::source
    UPTE["heap mapping state<br/><br/>valid PTE = mapped heap page<br/>missing or invalid PTE = lazy heap hole"]:::file
  end

  subgraph MAPBOX["virtual to physical address map"]
    direction TB
    VAMAP["heap VA-to-PA map<br/><br/>old heap mappings,<br/>new eager mappings,<br/>lazy unmapped heap range,<br/>and removed mappings after shrink"]:::source
  end

  subgraph SBRKPATH["heap growth / shrink request"]
    direction TB
    SBRK["sys_sbrk<br/><br/>user asks to grow or shrink heap"]:::iface
    GROWPROC["growproc<br/><br/>handles eager growth<br/>and all shrinking"]:::iface
    EAGER["eager positive growth<br/><br/>allocate and map pages immediately"]:::iface
    LAZY["lazy non-negative growth<br/><br/>sys_sbrk increases p->sz only<br/>do not allocate pages yet"]:::iface
    SHRINK["negative growth<br/><br/>remove mappings above new size"]:::iface
    NOCHANGE["zero eager change<br/><br/>growproc leaves p->sz unchanged"]:::iface
  end

  subgraph SWMAP["software mapping / unmapping helpers"]
    direction TB
    UVMALLOC["uvmalloc<br/><br/>allocate and map heap pages"]:::iface
    UVMDEALLOC["uvmdealloc<br/><br/>reduce user memory range"]:::iface
    UVMUNMAP["uvmunmap<br/><br/>remove heap mappings<br/>optionally free physical pages"]:::iface
    MAPPAGES["mappages<br/><br/>install heap VA to PA mapping"]:::iface
    WALK["walk<br/><br/>software page-table walk<br/>finds or creates PTE location"]:::iface
  end

  subgraph ALLOC["physical page allocator functions"]
    direction TB
    KALLOC["kalloc<br/><br/>allocate 4096-byte physical page"]:::iface
    KFREE["kfree<br/><br/>return physical page to allocator"]:::iface
    HEAPPAGE["heap physical page<br/><br/>backing page for eager heap mapping"]:::source
    PGTBLPAGE["page-table physical page<br/><br/>allocated if walk needs lower-level table"]:::source
  end

  FAULTREF["first access to lazy heap hole<br/><br/>fault resolution continues in user VA translation"]:::backend

  UPAGETABLE -->|"contains heap-related PTEs"| UPTE
  UPTE -->|"describes mapped heap pages; missing or invalid PTEs mark lazy holes"| VAMAP

  SBRK -->|"t == SBRK_EAGER or n < 0"| GROWPROC
  SBRK -->|"t != SBRK_EAGER and n >= 0"| LAZY
  GROWPROC -->|"n > 0"| EAGER
  GROWPROC -->|"n < 0"| SHRINK
  GROWPROC -->|"n == 0"| NOCHANGE

  EAGER --> UVMALLOC
  UVMALLOC -->|"requests physical heap pages from"| KALLOC
  KALLOC -->|"returns"| HEAPPAGE
  UVMALLOC -->|"maps new heap pages with"| MAPPAGES
  MAPPAGES -->|"uses"| WALK
  WALK -->|"writes or finds heap PTEs in"| UPAGETABLE
  WALK -->|"updates"| UPTE
  WALK -->|"may allocate lower-level page table via"| KALLOC
  KALLOC -->|"may return"| PGTBLPAGE
  MAPPAGES -->|"adds immediate heap VA-to-PA mapping into"| VAMAP
  HEAPPAGE -->|"becomes physical backing for eager heap VA"| VAMAP

  LAZY -->|"after overflow and TRAPFRAME checks, adjusts"| PROCSZ
  LAZY -->|"adds logical heap VA range without physical backing"| VAMAP
  LAZY -->|"leaves PTE invalid or absent"| UPTE
  VAMAP -->|"later user load/store to lazy hole causes"| FAULTREF

  SHRINK --> UVMDEALLOC
  UVMDEALLOC -->|"uses"| UVMUNMAP
  UVMUNMAP -->|"uses"| WALK
  WALK -->|"finds existing heap PTEs in"| UPAGETABLE
  UVMUNMAP -->|"removes heap VA-to-PA mappings from"| VAMAP
  UVMUNMAP -->|"clears valid heap PTEs; skips missing or invalid PTEs"| UPTE
  UVMUNMAP -->|"frees mapped heap pages through"| KFREE
  KFREE -->|"receives old heap physical pages from"| VAMAP
  SHRINK -->|"sets p->sz to uvmdealloc result"| PROCSZ
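The sbrk policy in the diagram reduces to one branch on `t` and the sign of `n`. The sketch below models that branch in user space only: `struct proc_model`, `sbrk_model`, `count_mapped`, and `LIMIT` are hypothetical names, a byte-per-page array stands in for the page table, `LIMIT` stands in for the `TRAPFRAME` bound, and the numeric values of `SBRK_EAGER` and `SBRK_LAZY` are assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096u
#define PGROUNDUP(sz) ((((uint64_t)(sz)) + PGSIZE - 1) & ~(uint64_t)(PGSIZE - 1))
#define SBRK_EAGER 1              /* assumed value, for illustration only */
#define SBRK_LAZY 2               /* assumed value, for illustration only */
#define NPAGES 64
#define LIMIT ((uint64_t)NPAGES * PGSIZE)   /* stands in for TRAPFRAME */

struct proc_model {
  uint64_t sz;                    /* declared user memory size, like p->sz */
  unsigned char mapped[NPAGES];   /* 1 = page has physical backing */
};

static int count_mapped(const struct proc_model *p) {
  int n = 0;
  for (int i = 0; i < NPAGES; i++)
    n += p->mapped[i];
  return n;
}

/* Returns the old size on success, -1 on failure. */
static int64_t sbrk_model(struct proc_model *p, int n, int t) {
  uint64_t old = p->sz;
  uint64_t new_sz = old + (uint64_t)(int64_t)n;   /* wraps if n < -old */

  if (t == SBRK_EAGER || n < 0) {
    /* growproc path: eager growth and all shrinking. */
    if (n > 0) {
      if (new_sz > LIMIT)
        return -1;
      /* uvmalloc: back every page from the old end up to the new size. */
      for (uint64_t a = PGROUNDUP(old); a < new_sz; a += PGSIZE)
        p->mapped[a / PGSIZE] = 1;
      p->sz = new_sz;
    } else if (n < 0 && new_sz < old) {
      /* uvmdealloc: drop whole pages above the new size; holes are skipped. */
      for (uint64_t a = PGROUNDUP(new_sz); a < PGROUNDUP(old); a += PGSIZE)
        p->mapped[a / PGSIZE] = 0;
      p->sz = new_sz;
    }
    /* n == 0, or a shrink that would wrap below zero, leaves sz unchanged. */
    return (int64_t)old;
  }

  /* Lazy path: only the declared size moves; pages arrive on first fault. */
  if (new_sz < old || new_sz > LIMIT)
    return -1;
  p->sz = new_sz;
  return (int64_t)old;
}
```

Growing lazily by two pages and then eagerly by one leaves `sz` at three pages with only the third one backed; the first two stay holes until a fault or a copy helper touches them.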
Process VM Cleanup
---
config:
  layout: dagre
---
flowchart LR
  classDef process fill:#F3EFE2,stroke:#111,stroke-width:2px,color:#111
  classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111
  classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111
  classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111
  classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111

  subgraph EXITENTRY["process VM cleanup entry"]
    direction TB
    EXIT["exec replaces old image<br/>or freeproc destroys process"]:::process
    FREEPROC["freeproc<br/><br/>process destruction only<br/>frees p->trapframe separately"]:::iface
    FREEPGTBL["proc_freepagetable<br/><br/>unmap special mappings<br/>free user memory<br/>free page-table pages"]:::iface
  end

  subgraph OLDSTATE["old process VM state"]
    direction TB
    UPAGETABLE["old p->pagetable<br/><br/>address space being destroyed"]:::source
    PROCSZ["old p->sz<br/><br/>size used to know user memory range"]:::source
    UPTPAGE["old user page-table pages<br/><br/>root / intermediate / leaf levels"]:::file
    UPTE["old user PTEs<br/><br/>leaf PTEs map physical pages<br/>non-leaf PTEs point to lower tables"]:::file
  end

  subgraph MAPBOX["virtual to physical address map"]
    direction TB
    VAMAP["old user VA to PA map<br/><br/>text, data, heap, stack,<br/>TRAPFRAME, TRAMPOLINE,<br/>and mapped physical pages"]:::source
  end

  subgraph FREECHAIN["software freeing chain"]
    direction TB
    UVMFREE["uvmfree<br/><br/>free user memory<br/>then free page-table pages"]:::iface
    UVMUNMAP["uvmunmap<br/><br/>remove leaf mappings<br/>optionally free physical pages"]:::iface
    FREEWALK["freewalk<br/><br/>recursively free page-table pages<br/>after leaf mappings are gone"]:::iface
    WALK["walk<br/><br/>find PTEs during unmapping"]:::iface
  end

  subgraph SPECIAL["special mapping cleanup"]
    direction TB
    UNMAPTRAMP["unmap TRAMPOLINE<br/><br/>remove special mapping<br/>without freeing shared trampoline code"]:::iface
    UNMAPTRAPFRAME["unmap TRAPFRAME<br/><br/>remove special mapping only<br/>do not free trapframe page here"]:::iface
  end

  subgraph ALLOC["physical page allocator functions"]
    direction TB
    KFREE["kfree<br/><br/>return 4096-byte physical page<br/>to allocator freelist"]:::iface
    USERPAGES["old user physical pages<br/><br/>text / data / heap / stack"]:::source
    PGTBLPAGES["old page-table physical pages<br/><br/>root / intermediate / leaf table pages"]:::source
    TRAPFRAMEPAGE["old trapframe physical page<br/><br/>per-process page<br/>freed by freeproc, not exec cleanup"]:::file
    TRAMPPAGE["trampoline physical page<br/><br/>shared trampoline.S code<br/>not freed here"]:::file
  end

  EXIT -->|"process destruction only"| FREEPROC
  EXIT -->|"old address-space cleanup"| FREEPGTBL
  FREEPROC -->|"then calls proc_freepagetable if p->pagetable exists"| FREEPGTBL
  FREEPGTBL -->|"uses old size"| PROCSZ
  FREEPGTBL -->|"starts cleanup of"| UPAGETABLE

  UPAGETABLE -->|"contains"| UPTPAGE
  UPTPAGE -->|"contains"| UPTE
  UPTE -->|"leaf PTEs describe old user VA-to-PA mappings"| VAMAP
  UPAGETABLE -->|"describes old complete map"| VAMAP

  FREEPGTBL -->|"unmaps special trampoline mapping"| UNMAPTRAMP
  FREEPGTBL -->|"unmaps trapframe mapping without freeing page"| UNMAPTRAPFRAME
  FREEPGTBL --> UVMFREE
  UVMFREE -->|"free mapped user memory first"| UVMUNMAP
  UVMUNMAP -->|"uses"| WALK
  WALK -->|"finds leaf PTEs in"| UPAGETABLE
  UVMUNMAP -->|"removes text/data/heap/stack mappings from"| VAMAP
  UVMUNMAP -->|"returns mapped physical pages through"| KFREE
  KFREE -->|"receives"| USERPAGES

  FREEPROC -->|"returns p->trapframe page through"| KFREE
  KFREE -->|"receives"| TRAPFRAMEPAGE
  UNMAPTRAPFRAME -->|"removes mapping to per-process trapframe page"| VAMAP
  UNMAPTRAMP -->|"removes mapping to shared trampoline page"| VAMAP
  UNMAPTRAMP -->|"does not free shared page"| TRAMPPAGE

  UVMFREE -->|"after leaf mappings are gone"| FREEWALK
  FREEWALK -->|"recursively releases page-table pages from"| UPTPAGE
  FREEWALK -->|"returns page-table pages through"| KFREE
  KFREE -->|"receives"| PGTBLPAGES
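The ordering constraint in the freeing chain (leaf mappings first, page-table pages second) can be illustrated with a user-space model. This is a sketch, not the xv6 source: host pointers stand in for physical addresses, `page_alloc` and `page_free` are hypothetical stand-ins for `kalloc` and `kfree`, `unmap_page` models `uvmunmap` for a single page, and `freewalk_model` returns -1 where xv6's `freewalk` would panic on a surviving leaf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PGSIZE 4096u
#define PTE_V (1u << 0)
#define PTE_R (1u << 1)
#define PTE_W (1u << 2)
#define PTE_X (1u << 3)
#define PX(level, va) ((((uint64_t)(va)) >> (12 + 9 * (level))) & 0x1FF)

typedef uint64_t pte_t;
static int pages_freed;           /* counts pages handed back to the allocator */

/* Model only: host pointers play the role of physical addresses. */
static pte_t pa2pte(void *pa) { return ((uint64_t)(uintptr_t)pa >> 12) << 10; }
static void *pte2pa(pte_t pte) { return (void *)(uintptr_t)((pte >> 10) << 12); }

static void *page_alloc(void) {
  void *p = aligned_alloc(PGSIZE, PGSIZE);
  if (p != NULL)
    memset(p, 0, PGSIZE);
  return p;
}

static void page_free(void *p) {
  free(p);
  pages_freed++;
}

/* Return the level-0 PTE slot for va, creating lower tables when alloc is set. */
static pte_t *walk(pte_t *root, uint64_t va, int alloc) {
  pte_t *table = root;
  for (int level = 2; level > 0; level--) {
    pte_t *pte = &table[PX(level, va)];
    if (*pte & PTE_V) {
      table = pte2pa(*pte);
    } else {
      if (!alloc || (table = page_alloc()) == NULL)
        return NULL;
      *pte = pa2pte(table) | PTE_V;
    }
  }
  return &table[PX(0, va)];
}

/* One-page uvmunmap model: clear the leaf and optionally free its page. */
static void unmap_page(pte_t *root, uint64_t va, int do_free) {
  pte_t *pte = walk(root, va, 0);
  if (pte == NULL || (*pte & PTE_V) == 0)
    return;                       /* missing or invalid PTE: nothing to remove */
  if (do_free)
    page_free(pte2pa(*pte));
  *pte = 0;
}

/* Recursively free page-table pages. Every leaf must already be unmapped. */
static int freewalk_model(pte_t *table) {
  for (int i = 0; i < 512; i++) {
    pte_t pte = table[i];
    if ((pte & PTE_V) && (pte & (PTE_R | PTE_W | PTE_X)) == 0) {
      /* Valid with no R/W/X bits: this entry points to a lower-level table. */
      if (freewalk_model(pte2pa(pte)) != 0)
        return -1;
      table[i] = 0;
    } else if (pte & PTE_V) {
      return -1;                  /* a leaf survived; xv6 panics in this case */
    }
  }
  page_free(table);
  return 0;
}
```

Passing `do_free = 0` to `unmap_page` corresponds to the TRAMPOLINE and TRAPFRAME cases in the diagram: the mapping goes away but the backing page is left for its real owner.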