Traps

Three distinct events force a CPU to suspend ordinary instruction execution and transfer control to specialized handler code:

  • system calls initiated by the ecall instruction,
  • exceptions triggered by illegal operations (such as division by zero or invalid virtual addresses), and
  • device interrupts signaling hardware needs.

These events, collectively referred to as traps, must be handled transparently so the interrupted code can resume without disruption. Complete isolation is maintained by handling all traps exclusively in kernel space. The trap handling lifecycle consists of four stages:

  1. hardware actions by the RISC-V CPU,
  2. assembly instructions to save state,
  3. a C function to determine the trap’s cause,
  4. and the specific service routine.

RISC-V Trap Machinery

The RISC-V hardware dictates trap behavior through supervisor-mode control registers, which are inaccessible to user mode:

  • stvec: Stores the memory address of the kernel’s trap handler (virtual address).
  • sepc: Captures the program counter at the exact moment the trap occurs. The sret instruction later copies this value back to the program counter to resume execution.
  • scause: Stores a numeric code indicating the reason for the trap.
  • sscratch: Provides temporary storage crucial for the very first instructions of the trap handler.
  • sstatus: Contains the SIE bit, which controls whether device interrupts are enabled (while it is clear, the hardware defers them until the kernel sets it), and the SPP bit, which records whether the trap originated in user or supervisor mode.

These registers govern traps handled in supervisor mode. There is a similar set of control registers for traps handled in machine mode; xv6 uses them only for the special case of timer interrupts.

Each CPU on a multi-core chip has its own set of these registers, and more than one CPU may be handling a trap at any given time.

Timer interrupts are a special case: in xv6 they first enter machine mode and are then forwarded to supervisor mode as a software interrupt. For every other trap, the hardware executes a strict sequence of operations:

  1. If the trap is a device interrupt and the SIE bit is clear, does none of the following; the interrupt stays pending.
  2. Disables further interrupts by clearing SIE.
  3. Copies the current program counter to sepc.
  4. Saves the current execution mode into the SPP bit.
  5. Writes the trap cause into scause.
  6. Elevates the execution mode to supervisor mode.
  7. Copies the handler address from stvec to the program counter.
  8. Resumes execution at the new instruction address.
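
The eight steps above can be modeled in a few lines of C. This is a simplified software sketch of the listed sequence, not a hardware description: struct hart, take_trap, and the mode enum are hypothetical names, and only the SIE and SPP bit positions are taken from the RISC-V sstatus layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define SSTATUS_SIE (UINT64_C(1) << 1) /* supervisor interrupt enable */
#define SSTATUS_SPP (UINT64_C(1) << 8) /* previous mode: 1 = supervisor, 0 = user */

enum mode { MODE_USER = 0, MODE_SUPERVISOR = 1 };

/* Hypothetical model of the per-hart state involved in trap entry. */
struct hart {
  uint64_t pc, stvec, sepc, scause, sstatus;
  enum mode mode;
};

/* Returns true if the trap was taken, false if a device interrupt
 * was left pending because SIE is clear (step 1). */
bool take_trap(struct hart *h, uint64_t cause, bool is_device_interrupt)
{
  if (is_device_interrupt && !(h->sstatus & SSTATUS_SIE))
    return false;                 /* step 1: do nothing for now */
  h->sstatus &= ~SSTATUS_SIE;     /* step 2: disable interrupts */
  h->sepc = h->pc;                /* step 3: remember where to resume */
  if (h->mode == MODE_SUPERVISOR) /* step 4: record the previous mode */
    h->sstatus |= SSTATUS_SPP;
  else
    h->sstatus &= ~SSTATUS_SPP;
  h->scause = cause;              /* step 5: record the reason */
  h->mode = MODE_SUPERVISOR;      /* step 6: raise privilege */
  h->pc = h->stvec;               /* steps 7 and 8: continue at the handler */
  return true;
}
```

Note what the model leaves untouched: there is no page table, stack pointer, or general-purpose register in struct hart, because the hardware changes none of them.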

The CPU intentionally minimizes its hardware operations; it does not switch page tables, switch to a kernel stack, or save general-purpose registers. Leaving that work to kernel software preserves flexibility. The steps the hardware does perform are the ones needed for security: for example, because the program counter is loaded from stvec, a malicious application cannot choose the kernel entry point.

Traps from User Space

Xv6 handles traps differently depending on whether they come from user space or kernel space.

From user space, a trap may be caused by:

  • ecall,
  • an exception,
  • or a device interrupt.

The path is uservec -> usertrap -> usertrapret -> userret.

When a trap occurs in user space, the active page table is still the user page table, since RISC-V does not switch page tables on trap entry. Thus:

  • the trap handler address in stvec must have a valid mapping in the user page table,
  • xv6’s trap handling code needs to switch to the kernel page table, and
  • in order to continue executing after that switch, the kernel page table must also have a mapping for the handler pointed to by stvec.

xv6 satisfies these requirements using a trampoline page:

  • xv6 sets stvec to uservec on the trampoline page, mapped at TRAMPOLINE in both the user and kernel page tables.
  • The trampoline mapping is identical in both page tables, so trap handling can continue after switching satp.
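
As a rough illustration of where these pages sit, the sketch below computes the TRAMPOLINE and TRAPFRAME addresses. The constants follow the Sv39 layout as xv6's headers define it to the best of my knowledge; stvec_for_uservec is a hypothetical helper, and the offset passed to it stands in for the distance of uservec from the start of the trampoline page.

```c
#include <stdint.h>

#define PGSIZE     UINT64_C(4096)
/* One bit less than the full Sv39 range, so addresses never need sign extension. */
#define MAXVA      (UINT64_C(1) << (9 + 9 + 9 + 12 - 1))
#define TRAMPOLINE (MAXVA - PGSIZE)      /* highest page of every address space */
#define TRAPFRAME  (TRAMPOLINE - PGSIZE) /* the page just below the trampoline */

/* Hypothetical helper: the value to install in stvec, given the byte
 * offset of uservec inside the trampoline page. Because the page is
 * mapped at the same virtual address in the user and kernel page
 * tables, this address stays valid across the satp switch. */
uint64_t stvec_for_uservec(uint64_t uservec_offset)
{
  return TRAMPOLINE + uservec_offset;
}
```

With these definitions TRAMPOLINE is 0x3ffffff000 and TRAPFRAME is 0x3fffffe000.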

The user space trap sequence flows through four primary stages:

  • Assembly Entry (uservec):
    • Before returning to user space, the kernel stores the process’s TRAPFRAME address in sscratch.
    • Because all 32 general-purpose registers belong to the interrupted user code, uservec starts by executing csrrw to swap a0 with sscratch.
    • a0 now holds a pointer to the process’s trapframe, mapped at TRAPFRAME just below TRAMPOLINE.
    • uservec saves all 32 user registers into the trapframe, which has space reserved for them.
    • The kernel also keeps a physical pointer to the same page in p->trapframe.
    • uservec then loads the kernel stack pointer, hartid, usertrap function address, and kernel page table address from the trapframe.
    • Finally, uservec updates satp to the kernel page table and jumps to the usertrap C function.
  • C Handler (usertrap):
    • Updates stvec to point to kernelvec, ensuring that any traps occurring during kernel execution are routed correctly.
    • Trap entry clears SIE, but xv6 later re-enables interrupts in selected kernel paths, especially before running syscall code, so device and timer interrupts can still be handled while the kernel is executing.
    • Saves sepc into the trapframe, since usertrap may yield and another process may run before this one resumes.
    • Identifies the trap cause and routes it:
      • invokes syscall for system calls,
      • devintr for device interrupts, or
      • kills the process for illegal exceptions (in basic xv6, all user page faults are treated as illegal exceptions).
    • If handling a system call, it increments the saved sepc by 4, ensuring the process resumes at the instruction immediately following the ecall.
    • On the way out, usertrap checks whether the process has been killed and, if the trap was a timer interrupt, yields the CPU.
  • C Return Preparation (usertrapret):
    • Prepares the control registers for a future user trap by pointing stvec back to uservec.
    • Populates the trapframe fields required by uservec and sets sepc to the saved user program counter.
    • Calls userret on the trampoline page, passing the TRAPFRAME address and the user page table pointer.
  • Assembly Exit (userret):
    • Switches satp back to the user page table.
    • After that switch, it can rely only on registers and the trapframe, since ordinary kernel mappings are gone.
    • Restores the 32 user registers from the trapframe, performs a final swap of a0 and sscratch to restore the user’s a0, and executes sret to re-enter user mode.
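
The routing decisions made in the C handler stage can be sketched as follows. This is a hedged model, not xv6's usertrap: usertrap_model, proc_model, and the outcome enum are hypothetical, and the real function also switches stvec, calls devintr to recognize interrupts, and finishes with the return path.

```c
#include <stdint.h>

#define SCAUSE_INTERRUPT (UINT64_C(1) << 63) /* top bit set: interrupt, not exception */
#define SCAUSE_ECALL_U   8                   /* environment call from U-mode */

enum outcome { DID_SYSCALL, DID_DEVINTR, DID_KILL };

struct trapframe_model { uint64_t epc; };
struct proc_model { struct trapframe_model tf; int killed; };

/* Hypothetical model of how a user trap is routed. */
enum outcome usertrap_model(struct proc_model *p, uint64_t scause, uint64_t sepc)
{
  /* Save sepc first: another process may trap and overwrite it
   * before this one resumes. */
  p->tf.epc = sepc;

  if (scause == SCAUSE_ECALL_U) {
    p->tf.epc += 4; /* resume after the 4-byte ecall instruction */
    return DID_SYSCALL;
  }
  if (scause & SCAUSE_INTERRUPT)
    return DID_DEVINTR; /* device or timer interrupt */

  p->killed = 1; /* any other exception is treated as fatal */
  return DID_KILL;
}
```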

The most common deliberate trap from user space is a system call, which uses the trapframe infrastructure to pass the call number and arguments securely to the kernel.

System Call Mechanisms

User programs initiate system calls by:

  • placing arguments into specific registers (e.g., a0, a1),
  • placing the system call number into a7, and
  • executing ecall.

Argument placement follows the RISC-V calling convention, so syscall arguments begin in registers.

Once the trap mechanism hands control to the syscall function, the kernel uses the saved a7 value to index into the syscalls array, which acts as a dispatch table mapping numbers to implementation functions.

Upon completion, the system call’s return value is written to p->trapframe->a0, overwriting the first argument so the user code receives the result. By convention, negative numbers indicate errors, while zero or positive numbers indicate success.
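
A dispatch table of this kind might look like the sketch below. The two demo system calls, their numbers, and the tf structure are invented for illustration; only the overall shape (index by a7, store the result in a0, report an error for unknown numbers) follows the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the saved user registers. */
struct tf { uint64_t a0, a1, a7; };

typedef uint64_t (*syscall_fn)(struct tf *);

/* Demo handlers; real handlers would fetch arguments with argint and friends. */
static uint64_t sys_getpid_demo(struct tf *tf) { (void)tf; return 42; }
static uint64_t sys_add_demo(struct tf *tf) { return tf->a0 + tf->a1; }

enum { SYS_getpid_demo = 1, SYS_add_demo = 2 };

/* Dispatch table: system call number -> implementation. */
static syscall_fn syscalls_demo[] = {
  [SYS_getpid_demo] = sys_getpid_demo,
  [SYS_add_demo]    = sys_add_demo,
};

void syscall_demo(struct tf *tf)
{
  uint64_t num = tf->a7;
  size_t n = sizeof(syscalls_demo) / sizeof(syscalls_demo[0]);

  if (num > 0 && num < n && syscalls_demo[num])
    tf->a0 = syscalls_demo[num](tf); /* result overwrites the first argument */
  else
    tf->a0 = (uint64_t)-1;           /* unknown number: negative means error */
}
```

Validating the number before indexing matters because a7 is entirely under the user program's control.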

System calls must frequently access arguments and memory provided by the user process:

  • The functions argint, argaddr, and argfd extract integers, pointers, and file descriptors from the saved registers in the trapframe; they use argraw to read the raw saved register.
  • Pointer arguments create two problems: they may be invalid or malicious, and they refer to user virtual addresses, not kernel mappings.
  • The kernel uses functions like fetchstr and copyinstr to safely read string data from user space.
  • copyinstr walks the target process’s page table, which is not the current page table.
  • walkaddr translates the user virtual address to a physical address and checks that it belongs to user memory.
  • After translation, direct mapping lets the kernel copy bytes using the corresponding kernel virtual address.
  • copyout performs the reverse direction, copying data from kernel space to a user address.
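
The translate-then-copy pattern can be sketched with a toy, single-level page table. Everything here is a simplified stand-in: pagetable_model, walkaddr_model, and copyin_model are hypothetical, and a real RISC-V page table is a three-level tree rather than a flat array. The PTE_V and PTE_U values match the RISC-V PTE bit positions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PGSIZE 4096u
#define NPAGES 4u
#define PTE_V  0x01u /* mapping is valid */
#define PTE_U  0x10u /* user mode may access the page */

struct pte_model { unsigned flags; unsigned char *pa; };
struct pagetable_model { struct pte_model pte[NPAGES]; };

/* Translate a page-aligned user address; NULL if unmapped or not a user page. */
static unsigned char *walkaddr_model(struct pagetable_model *pt, uint64_t va)
{
  uint64_t vpn = va / PGSIZE;
  if (vpn >= NPAGES)
    return NULL;
  struct pte_model *pte = &pt->pte[vpn];
  if (!(pte->flags & PTE_V) || !(pte->flags & PTE_U))
    return NULL;
  return pte->pa;
}

/* Copy len bytes from user address srcva into dst, one page at a time.
 * Returns 0 on success, -1 on a bad address (dst may be partly written). */
int copyin_model(struct pagetable_model *pt, void *dst, uint64_t srcva, size_t len)
{
  unsigned char *out = dst;
  while (len > 0) {
    uint64_t va0 = srcva - (srcva % PGSIZE); /* start of the containing page */
    unsigned char *pa0 = walkaddr_model(pt, va0);
    if (pa0 == NULL)
      return -1;
    size_t off = (size_t)(srcva - va0);
    size_t n = PGSIZE - off; /* bytes left in this page */
    if (n > len)
      n = len;
    memcpy(out, pa0 + off, n);
    len -= n;
    out += n;
    srcva = va0 + PGSIZE;
  }
  return 0;
}
```

Checking every page separately is what lets the copy cross a page boundary safely: the next virtual page may map to unrelated physical memory, or to nothing at all.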

While the complex trampoline mechanism safely handles transitions from user space, traps that occur while already executing inside the kernel require a much simpler control flow.

Traps from Kernel Space

When the CPU is executing kernel code, stvec points directly to the kernelvec assembly code. Because the trap originates in supervisor mode, the satp register is already pointing to the kernel page table, and the stack pointer is already set to a valid kernel stack.

  • kernelvec pushes all 32 registers directly onto the current kernel stack, safely preserving the state of the interrupted kernel thread.
  • Execution jumps to the kerneltrap C function.
  • kerneltrap handles two kinds of traps:
    • device interrupts, which it passes to devintr, and
    • exceptions, which trigger a kernel panic because kernel exceptions are always fatal errors.
  • If the trap is a timer interrupt and a process thread is active, kerneltrap invokes yield to allow other threads CPU time.
  • Because yield may switch to another thread whose own traps overwrite sepc and sstatus, kerneltrap saves both registers in local variables on entry and restores them before returning.
  • Control returns to kernelvec, which pops the registers off the stack and executes sret to resume the interrupted kernel code.
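
The save-and-restore step is easy to overlook, so here is a minimal sketch of it. The csrs structure stands in for the hardware registers, and yield_model deliberately scribbles on them to imitate another thread taking traps in the meantime; all names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for the two hardware registers that a nested trap can overwrite. */
struct csrs { uint64_t sepc, sstatus; };

/* Imitates yield: while this thread is switched out, other threads
 * take traps and leave unrelated values in the registers. */
static void yield_model(struct csrs *c)
{
  c->sepc = 0xdeadbeef;
  c->sstatus ^= 0x100;
}

void kerneltrap_model(struct csrs *c, int is_timer, int have_proc)
{
  /* Locals live on this thread's kernel stack, so they survive the switch. */
  uint64_t sepc = c->sepc;
  uint64_t sstatus = c->sstatus;

  if (is_timer && have_proc)
    yield_model(c);

  /* Put back the values that sret needs to resume the interrupted code. */
  c->sepc = sepc;
  c->sstatus = sstatus;
}
```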

Page-Fault Exceptions

xv6’s default response is simple: a user-space exception kills the process, while a kernel-space exception panics the kernel.

Page faults occur when:

  • an access uses a virtual address that has no mapping in the page table,
  • PTE_V is clear, or
  • the access violates permissions such as PTE_R, PTE_W, PTE_X, or PTE_U.

RISC-V distinguishes instruction, load, and store page faults. scause records the type, and stval records the faulting virtual address.

Real kernels use page faults more aggressively.

Copy-on-write (COW) fork: parent and child initially share physical pages as read-only. A write causes a store page fault; the kernel allocates a new page, copies the old contents, updates the faulting PTE to a private writable page, and resumes. Reference counting decides when shared pages can be freed and avoids copying when a page is no longer shared. This makes fork much cheaper, especially for fork followed by exec.
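
A store-fault handler for COW could be sketched as below. Basic xv6 does not implement COW, so every name here is an assumption: cow_fault, the page and pte structures, and in particular PTE_COW, a hypothetical marker placed in one of the PTE bits reserved for software. Pages come from malloc purely to keep the sketch self-contained.

```c
#include <stdlib.h>
#include <string.h>

#define PGSIZE  4096
#define PTE_W   0x004u /* writable */
#define PTE_COW 0x100u /* hypothetical software bit: shared copy-on-write page */

struct page { int refcount; unsigned char data[PGSIZE]; };
struct pte  { unsigned flags; struct page *pg; };

/* Handle a store fault on pte. Returns 0 if the write may be retried,
 * -1 if the fault is a genuine error. */
int cow_fault(struct pte *pte)
{
  if (!(pte->flags & PTE_COW))
    return -1; /* not a COW page: a real permission violation */

  struct page *old = pte->pg;
  if (old->refcount > 1) {
    /* Still shared: give this mapping a private copy. */
    struct page *copy = malloc(sizeof *copy);
    if (copy == NULL)
      return -1;
    memcpy(copy->data, old->data, PGSIZE);
    copy->refcount = 1;
    old->refcount--;
    pte->pg = copy;
  }
  /* Sole owner (or fresh copy): just make the mapping writable. */
  pte->flags = (pte->flags | PTE_W) & ~PTE_COW;
  return 0;
}
```

The refcount check is what avoids a pointless copy for the last remaining owner of a formerly shared page.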

Lazy allocation: sbrk grows the process size without immediately allocating pages or PTEs. The first access faults, and the kernel allocates and maps the page then. This avoids work for unused pages and spreads allocation cost over time.
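
The two halves of lazy allocation, growing the size and allocating on fault, can be sketched like this. lazy_proc, sbrk_lazy, and lazy_fault are hypothetical names, and a flat array of page pointers stands in for the page table.

```c
#include <stdint.h>
#include <stdlib.h>

#define PGSIZE   4096u
#define MAXPAGES 16u

/* Hypothetical process: a size and one slot per virtual page. */
struct lazy_proc { uint64_t sz; unsigned char *page[MAXPAGES]; };

/* Grow the address space by n bytes without allocating anything. */
int sbrk_lazy(struct lazy_proc *p, uint64_t n)
{
  if (p->sz + n > (uint64_t)MAXPAGES * PGSIZE)
    return -1;
  p->sz += n;
  return 0;
}

/* Page-fault path: allocate and map the page on first touch.
 * Returns 0 if the access can be retried, -1 if it is invalid. */
int lazy_fault(struct lazy_proc *p, uint64_t va)
{
  if (va >= p->sz)
    return -1; /* beyond what sbrk granted: a real fault */
  uint64_t vpn = va / PGSIZE;
  if (p->page[vpn] != NULL)
    return 0;  /* already present */
  p->page[vpn] = calloc(1, PGSIZE); /* zero-filled, like fresh memory */
  return p->page[vpn] ? 0 : -1;
}
```

A process that asks for a large heap and touches only part of it pays only for the pages it actually uses.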

Demand paging: exec can install invalid PTEs first and load code/data from disk only on fault. This reduces startup latency for large programs.

Paging to disk: when RAM is scarce, the kernel can evict pages to disk, mark their PTEs invalid, and page them back in on fault. If RAM is full, paging in one page may require evicting another. Paging works best when programs have good locality of reference.

Other page-fault uses include automatic stack growth and memory-mapped files.

Real-World Context

  • The trampoline and trapframe exist because RISC-V does very little on trap entry: it does not switch page tables, save general registers, or identify the current process for the kernel.
  • Thus the first trap-entry instructions must run in supervisor mode but still under the user page table, with user register contents still live.
  • xv6 relies on two protected handoff mechanisms:
    • sscratch to stash the trapframe pointer
    • user-page-table mappings to kernel-owned memory without PTE_U, so user code cannot access them
  • A faster alternative is to map kernel memory into every user page table. That removes the trampoline requirement, avoids switching page tables on user traps, and lets kernel code directly dereference user pointers.
  • Many real systems use that style for efficiency, but xv6 avoids it to reduce security risk from accidental user-pointer use and to avoid extra address-space-overlap complexity.
  • Real kernels also implement COW fork, lazy allocation, demand paging, paging to disk, memory-mapped files, and try to keep nearly all physical memory in use for applications or caches.
  • xv6 is intentionally simpler: if memory runs out, it usually returns an error or kills a process instead of reclaiming memory by evicting another page.

Trap Catalogue

From the ISA perspective:

User syscall

Number: Exception 8
Cause: Environment call from U-mode.

A user program executes ecall, RISC-V records scause = 8, and xv6’s usertrap() recognizes it as a syscall. xv6 then calls syscall() and advances the saved sepc by 4 so the process resumes after the ecall instruction.

User-mode mistakes

Numbers: Exceptions 0–7, 12, 13, 15
Causes:

  • misaligned instruction or data address,
  • access fault,
  • illegal instruction,
  • breakpoint,
  • instruction page fault,
  • load page fault,
  • store page fault.

These are traps caused by bad or unsupported user behavior. xv6 does not try to recover from these in the basic kernel. usertrap() prints diagnostic information and marks the process as killed.

Kernel-mode mistakes

Numbers: Exceptions 0–7, 12, 13, 15 (and 9)
Causes:

  • same kinds of exceptions while xv6 is already running in supervisor mode.
  • exception 9 is ecall from S-mode.

These are treated much more seriously. xv6 considers them kernel bugs, so kerneltrap() panics instead of killing only one process.

Supervisor external interrupts

Number: Interrupt 9
Cause: Supervisor external interrupt.

These are device interrupts delivered to the S-mode kernel, usually through the PLIC, from devices such as the UART and the virtio disk. usertrap() or kerneltrap() calls devintr(), and devintr() identifies and handles the device.

Machine timer interrupt (and Supervisor software interrupt)

Numbers: Interrupt 7, then interrupt 1
Causes:

  • machine timer interrupt, then
  • supervisor software interrupt.

This is xv6’s special timer path. The physical timer first causes a machine timer interrupt. xv6’s machine-mode timer code programs the next timer event and forwards the event into supervisor mode using a supervisor software interrupt. Then devintr() handles it as a clock interrupt.

Supervisor timer interrupt

Number: Interrupt 5
Cause: Supervisor timer interrupt.

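This exists architecturally in RISC-V. Conceptually, it is a timer interrupt intended directly for S-mode. The machine timer path described above does not use it as the primary event; that path takes the machine timer interrupt and forwards it to S-mode. The sequence diagrams below, which program stimecmp, show the other arrangement: the timer arrives directly as this supervisor timer interrupt.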

Machine-level interrupts

Numbers: Interrupts 3, 11
Causes:

  • machine software interrupt
  • machine external interrupt.

These are real RISC-V interrupt causes, but they target machine mode. Normal xv6 kernel trap code runs in supervisor mode, so usertrap(), kerneltrap(), and devintr() do not normally receive these as ordinary xv6 traps. They belong to M-mode firmware or machine-mode runtime code.

Custom interrupts

Numbers: Interrupts 13, 16+
Causes:

  • Counter-overflow interrupt,
  • platform or custom interrupts.

These are for performance counters or platform-specific interrupt sources. Stock xv6 does not use them. A more advanced OS could use counter overflow for profiling, but xv6 keeps interrupt handling minimal.

Custom exceptions

Numbers: Exceptions 10, 14, 16–19, 20+
Causes:

  • reserved,
  • custom,
  • double trap,
  • software check,
  • hardware error.

These are not part of the normal xv6 teaching path. If one somehow occurs from user mode, xv6 would generally treat it as an unexpected user exception and kill the process. If it occurs in kernel mode, xv6 would panic.
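
The catalogue numbers share one encoding: the top bit of scause distinguishes interrupts from exceptions, and the remaining bits hold the number. A small decoder, with hypothetical function names and covering only the causes named in this catalogue, might look like this:

```c
#include <stdbool.h>
#include <stdint.h>

#define SCAUSE_INTERRUPT (UINT64_C(1) << 63) /* set for interrupts, clear for exceptions */

bool scause_is_interrupt(uint64_t scause)
{
  return (scause & SCAUSE_INTERRUPT) != 0;
}

uint64_t scause_code(uint64_t scause)
{
  return scause & ~SCAUSE_INTERRUPT;
}

/* Name the causes that S-mode code is most likely to see. */
const char *scause_name(uint64_t scause)
{
  uint64_t code = scause_code(scause);

  if (scause_is_interrupt(scause)) {
    switch (code) {
    case 1:  return "supervisor software interrupt";
    case 5:  return "supervisor timer interrupt";
    case 9:  return "supervisor external interrupt";
    default: return "other interrupt";
    }
  }
  switch (code) {
  case 8:  return "environment call from U-mode";
  case 9:  return "environment call from S-mode";
  case 12: return "instruction page fault";
  case 13: return "load page fault";
  case 15: return "store page fault";
  default: return "other exception";
  }
}
```

The same number can mean different things on each side of the interrupt bit: 9 is an external interrupt when the bit is set and an ecall from S-mode when it is clear.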

Trap from User Space Sequence

sequenceDiagram
  autonumber

  actor U as User code
  participant CPU as RISC-V hardware / CSRs
  participant UV as uservec<br/>(trampoline.S)
  participant TF as p->trapframe
  participant UT as usertrap()<br/>(trap.c)
  participant H as handler logic<br/>syscall / devintr / vmfault
  participant PR as prepare_return()<br/>(trap.c)
  participant UR as userret<br/>(trampoline.S)

  Note over U,CPU: Mode = U<br/>satp = user page table<br/>stvec = TRAMPOLINE + uservec

  U->>CPU: ecall / exception / interrupt

  CPU->>CPU: save trap state<br/>sepc = faulting/interrupted PC<br/>scause = cause<br/>stval = fault address if any
  CPU->>CPU: switch privilege<br/>U mode -> S mode
  CPU->>UV: set PC = stvec<br/>enter uservec

  Note over UV,CPU: Mode = S<br/>satp still = user page table

  UV->>CPU: csrw sscratch, a0<br/>save original user a0 in CSR
  UV->>UV: li a0, TRAPFRAME
  UV->>TF: save user registers except a0
  UV->>CPU: csrr t0, sscratch
  UV->>TF: save original user a0<br/>trapframe->a0 = t0

  UV->>TF: load kernel_sp
  UV->>TF: load kernel_hartid
  UV->>TF: load kernel_trap = usertrap
  UV->>TF: load kernel_satp

  UV->>CPU: sfence.vma
  UV->>CPU: write satp = kernel_satp
  UV->>CPU: sfence.vma
  UV->>UT: jump to usertrap()

  Note over UT,CPU: Mode = S<br/>satp = kernel page table

  UT->>CPU: set stvec = kernelvec
  UT->>CPU: read scause, sepc, stval
  UT->>TF: trapframe->epc = sepc

  alt scause == 8: system call exception
    UT->>TF: trapframe->epc += 4<br/>skip ecall on return
    UT->>CPU: intr_on()
    UT->>H: syscall()

    H->>TF: read syscall number from a7
    H->>TF: read syscall arguments from a0-a5
    H->>H: validate syscall number
    H->>H: lookup syscalls[num]
    H->>H: call selected sys_* handler
    H->>TF: write return value to a0

  else interrupt recognized by devintr()
    UT->>H: devintr()
    H->>CPU: read scause

    alt supervisor external interrupt
      H->>H: plic_claim()
      H->>H: if UART0_IRQ: uartintr()
      H->>H: if VIRTIO0_IRQ: virtio_disk_intr()
      H->>H: plic_complete(irq)
      H-->>UT: return 1<br/>ordinary device interrupt handled
      UT->>UT: continue return path<br/>no yield

    else supervisor timer interrupt
      H->>H: clockintr()
      H->>H: ticks++
      H->>H: wakeup(&ticks)
      H->>CPU: set next stimecmp
      H-->>UT: return 2<br/>timer interrupt handled
      UT->>UT: yield()<br/>later resume in usertrap()

    else not recognized
      H-->>UT: return 0
    end

  else scause == 13 or 15: page fault
    UT->>CPU: read stval<br/>faulting virtual address
    UT->>H: vmfault(pagetable, stval, read/write)

    alt vmfault succeeds
      H->>H: allocate missing lazy page
      H->>H: map page into user pagetable
      H-->>UT: same instruction can retry

    else vmfault fails
      H-->>UT: fatal user page fault<br/>return path omitted
    end

  else unexpected user exception
    UT->>UT: fatal user exception<br/>return path omitted
  end

  UT->>PR: prepare_return()

  PR->>CPU: intr_off()
  PR->>CPU: set stvec = TRAMPOLINE + uservec
  PR->>TF: fill kernel_satp
  PR->>TF: fill kernel_sp
  PR->>TF: fill kernel_trap = usertrap
  PR->>TF: fill kernel_hartid
  PR->>CPU: set sstatus.SPP = user
  PR->>CPU: set sstatus.SPIE = enabled
  PR->>CPU: set sepc = trapframe->epc

  PR-->>UT: return
  UT-->>UR: return user satp in a0

  Note over UR,CPU: Mode = S<br/>satp still = kernel page table

  UR->>CPU: sfence.vma
  UR->>CPU: write satp = user page table
  UR->>CPU: sfence.vma
  UR->>UR: li a0, TRAPFRAME
  UR->>TF: restore user registers except a0
  UR->>TF: restore user a0 from trapframe
  UR->>CPU: sret

  CPU->>U: resume at sepc

  Note over U,CPU: Mode = U<br/>satp = user page table

Trap from Kernel Space Sequence

sequenceDiagram
  autonumber

  actor K as Kernel code
  participant CPU as RISC-V hardware / CSRs
  participant KV as kernelvec<br/>(kernelvec.S)
  participant KT as kerneltrap()<br/>(trap.c)
  participant DI as devintr()<br/>(trap.c)
  participant B as device / timer backend
  participant P as panic / yield

  Note over K,CPU: Mode = S<br/>satp = kernel page table<br/>stvec = kernelvec

  K->>CPU: device interrupt / timer interrupt / kernel exception

  CPU->>CPU: save trap state<br/>sepc = interrupted kernel PC<br/>scause = cause<br/>stval = fault address if any
  CPU->>CPU: stay in supervisor mode
  CPU->>KV: set PC = stvec<br/>enter kernelvec

  Note over KV: Mode = S<br/>satp already = kernel page table<br/>sp already = current kernel stack

  KV->>KV: make stack frame on current kernel stack
  KV->>KV: save caller-saved registers<br/>ra gp tp t0-t2 a0-a7 t3-t6
  KV->>KT: call kerneltrap()

  Note over KT: kerneltrap runs on interrupted kernel stack

  KT->>CPU: read sepc
  KT->>CPU: read sstatus
  KT->>CPU: read scause

  KT->>KT: check SPP == supervisor
  alt trap did not come from supervisor mode
    KT->>P: panic("kerneltrap: not from supervisor mode")
  else came from supervisor mode
    KT->>KT: continue
  end

  KT->>KT: check interrupts are disabled
  alt interrupts are enabled
    KT->>P: panic("kerneltrap: interrupts enabled")
  else interrupts disabled
    KT->>DI: devintr()
  end

  alt supervisor external interrupt
    DI->>B: plic_claim()
    alt UART interrupt
      B->>B: uartintr()
    else virtio disk interrupt
      B->>B: virtio_disk_intr()
    else unknown external irq
      B->>B: print unexpected irq
    end
    B->>B: plic_complete(irq)
    DI-->>KT: return 1

    KT->>KT: ordinary device interrupt<br/>no yield required

  else timer interrupt
    DI->>B: clockintr()
    B->>B: ticks++ on CPU 0<br/>wakeup(&ticks)<br/>set next stimecmp
    DI-->>KT: return 2

    alt myproc() != 0
      KT->>P: yield()
      P-->>KT: later resumes inside kerneltrap()
    else no current process
      KT->>KT: do not yield
    end

  else not a recognized interrupt
    DI-->>KT: return 0
    KT->>KT: print scause, sepc, stval
    KT->>P: panic("kerneltrap")
  end

  KT->>CPU: restore sepc
  KT->>CPU: restore sstatus
  KT-->>KV: return to kernelvec

  KV->>KV: restore caller-saved registers<br/>tp is not restored
  KV->>KV: pop stack frame
  KV->>CPU: sret

  CPU->>K: resume interrupted kernel code

  Note over K,CPU: Mode = S<br/>satp = kernel page table

trapinit initializes the global timer tick lock.

  • ticks is the global count of timer ticks since boot
  • tickslock protects the ticks counter
  • trapinit only initializes tickslock
  • The trap vector itself is not installed here
  • Per-hart trap-vector setup happens in trapinithart

tickslock matters because timer interrupts and system calls both touch ticks.

  • clockintr increments ticks on CPU 0 during timer interrupts
  • clockintr wakes processes sleeping on &ticks
  • sys_pause reads ticks, then sleeps until enough ticks pass
  • sys_uptime reads ticks to report time since boot
  • The lock prevents races between timer interrupt updates and process reads or sleeps
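
The locking pattern around ticks can be sketched with a C11 atomic_flag spinlock. This is only an approximation under stated assumptions: the _model names are hypothetical, and xv6's own acquire additionally disables interrupts on the current CPU, which a user-space sketch cannot reproduce.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for tickslock and ticks. */
static atomic_flag tickslock_model = ATOMIC_FLAG_INIT;
static uint64_t ticks_model;

static void acquire_model(atomic_flag *l)
{
  /* Spin until the flag was previously clear. */
  while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
    ;
}

static void release_model(atomic_flag *l)
{
  atomic_flag_clear_explicit(l, memory_order_release);
}

/* Timer side: bump the counter under the lock. */
void clockintr_model(void)
{
  acquire_model(&tickslock_model);
  ticks_model++;
  release_model(&tickslock_model);
}

/* System call side: read a consistent value under the same lock. */
uint64_t uptime_model(void)
{
  acquire_model(&tickslock_model);
  uint64_t now = ticks_model;
  release_model(&tickslock_model);
  return now;
}
```

Holding the same lock on both sides is what makes the read-then-sleep sequence in a call such as sys_pause safe against a tick arriving in between.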

After trapinit runs, the global tick counter has a lock, but traps are not yet routed to a handler.

kernelvec is the assembly entry path for traps that happen while the CPU is already in the kernel.

  • The current stack is already a kernel stack
  • kernelvec makes space on that stack
  • Caller-saved registers are saved to the stack
  • kerneltrap is called in C
  • Registers are restored after kerneltrap returns
  • sret returns to the interrupted kernel code

Once stvec points at kernelvec, kernel-mode interrupts and exceptions on this hart have a valid entry path.