Physical memory functions as a cache for secondary storage, transferring pages between memory hierarchy levels.
Translation Lookaside Buffers (TLBs) act as caches for the page table, eliminating the need to perform a main memory access for every address translation.
Virtual memory provides separation and privacy between processes that share physical memory by granting each process a distinct virtual address space.
Process Protection Architecture
A process consists of a running program together with the state needed to continue running it.
Hardware architectures and operating systems must jointly provide four mechanisms to allow processes to share hardware safely:
Execution Modes: The architecture must provide at least two modes to differentiate between a user process and a supervisor (kernel) process.
Protected State: A user process must be restricted from modifying specific processor states, including the user/supervisor mode bit, exception enable/disable bits, and memory protection boundaries.
Mode Transitions: Processors require special system call instructions that save the Program Counter (PC), transfer control to a dedicated supervisor address, and elevate privilege to supervisor mode.
Memory Isolation: The architecture must restrict a user process's memory accesses so it cannot read or modify another process's state, without requiring processes to be swapped to disk on every context switch.
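The first three mechanisms can be illustrated with a toy simulation. All names here (register fields, the trap vector address) are hypothetical; real hardware implements this logic in silicon as part of the instruction set:

```python
# Toy sketch of protected state and mode transitions; not a real ISA.
USER, SUPERVISOR = 0, 1

class CPU:
    def __init__(self):
        self.mode = USER          # protected state: user code cannot set this bit
        self.pc = 0x1000          # user program counter
        self.epc = None           # PC saved when a trap occurs
        self.trap_vector = 0x8000 # fixed supervisor entry point

    def syscall(self):
        """Model of a system-call instruction: save the PC, jump to the
        supervisor entry point, and raise the privilege level atomically."""
        self.epc = self.pc
        self.pc = self.trap_vector
        self.mode = SUPERVISOR

    def return_from_trap(self):
        """Supervisor-only instruction: restore the PC and drop privilege."""
        assert self.mode == SUPERVISOR
        self.pc = self.epc
        self.mode = USER

cpu = CPU()
cpu.syscall()           # enters supervisor mode at the trap vector
cpu.return_from_trap()  # resumes the user program at the saved PC
```

The key design point is that saving the PC, changing the PC, and elevating privilege happen as one indivisible step, so user code can never run with supervisor privilege at an address of its own choosing.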
Page-Based Protection and TLBs
Memory protection is heavily reliant on mapping fixed-sized virtual memory pages (e.g., 4 KiB or 16 KiB) to physical addresses using a page table.
Each Page Table Entry (PTE) contains protection restrictions determining if a user process can read, write, or execute the given page.
Total access protection is guaranteed because a process cannot access a page missing from the page table, and only the operating system is permitted to update page table entries.
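A minimal sketch of this enforcement, assuming a hypothetical single-level page table with 4 KiB pages (the PTE layout and sample mappings below are illustrative, not any real architecture's format):

```python
# Minimal page-table sketch: each PTE records a physical page number
# plus read/write/execute permission bits. Layout is hypothetical.
PAGE_SIZE = 4096  # 4 KiB pages

page_table = {
    # virtual page number -> (physical page number, permissions)
    0x4: (0x1A2, {"r", "x"}),   # code page: read + execute only
    0x8: (0x2B3, {"r", "w"}),   # data page: read + write
}

def translate(vaddr, access):
    """Translate a virtual address, enforcing the PTE's protection bits.
    Raises on a missing mapping or a permission violation."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise MemoryError("page fault: no PTE for this page")
    ppn, perms = page_table[vpn]
    if access not in perms:
        raise PermissionError(f"protection fault: no '{access}' permission")
    return ppn * PAGE_SIZE + offset

print(hex(translate(0x4010, "x")))  # fetching from the code page succeeds
try:
    translate(0x4010, "w")          # writing the code page is rejected
except PermissionError as e:
    print(e)
```

Because every access funnels through `translate`, a page absent from the table is simply unreachable, which is the basis of the total-protection guarantee above.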
Because paged virtual memory intrinsically requires two memory accesses (one for translation, one for data), TLBs cache address translations to preserve performance.
A TLB entry stores a tag containing virtual address bits, alongside data fields holding the physical page address, protection restrictions, a valid bit, a use bit, and a dirty bit.
When the operating system alters page restrictions, it modifies the page table and invalidates the corresponding TLB entry.
Microarchitecture Side-Channel Attacks
Functional isolation enforced by page tables and operating systems can be circumvented via hardware side-channels.
Side-channel attacks perturb shared hardware components not explicitly protected by virtual memory (e.g., caches) and observe the effects using processor timers or performance counters.
Prime and Probe Attack:
Given the victim's target logic: if (x ≤ 0) { access P } else { access Q }.
Prime: The attacker overwrites cache lines corresponding to locations P and Q with its own data.
Execute: The victim executes the target logic, generating a cache miss and loading either P or Q into the shared cache.
Probe: The attacker accesses P and Q while measuring access times.
A cache miss on P during the probe indicates that the victim's execution evicted the attacker's data at P, leaking that x ≤ 0.
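The three steps above can be sketched against a toy direct-mapped cache. Everything here (the set count, the addresses P and Q) is hypothetical, and a real attack infers ownership by timing accesses with a cycle counter rather than inspecting cache state directly:

```python
# Toy prime-and-probe on a simulated direct-mapped cache.
NUM_SETS = 8
cache = {}                 # set index -> owner of the resident line

def access(addr, owner):
    cache[addr % NUM_SETS] = owner   # whoever touches a set owns its line

P, Q = 3, 5                # victim addresses mapping to distinct sets

def victim(x):
    """The secret-dependent branch from the target logic above."""
    access(P if x <= 0 else Q, "victim")

# Prime: the attacker fills the sets for P and Q with its own lines.
access(P, "attacker")
access(Q, "attacker")
# Execute: the victim runs with its secret x.
victim(x=-1)
# Probe: a set no longer holding attacker data means the victim touched it
# (in a real attack, this shows up as a slow, missing access).
leaked_x_nonpositive = cache[P % NUM_SETS] != "attacker"
print(leaked_x_nonpositive)   # True: the branch taken leaks x <= 0
```

Note that the attacker never reads the victim's data; it only observes which of its own lines were evicted, which is exactly why page-table protections do not stop the leak.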
Attack Requirements and Amplifiers:
The attacker must know the precise memory locations of the victim’s code.
The victim must execute between the prime and probe steps without intervening actions polluting the cache.
Shared last-level caches (LLCs) across multi-core processors, and hardware multithreading within a single core, drastically increase the bandwidth and effectiveness of side-channel attacks.
Hardware and Software Mitigations
Randomized Page Allocation: Allocating pages randomly complicates the attacker’s ability to map victim code segments to specific physical cache locations.
Timing Obfuscation: Inserting random, short computational delays into critical code segments masks the timing signals relied upon by attackers.
Cache Flushing: Flushing caches entirely upon context switches eliminates implicit sharing but incurs severe performance penalties on large LLCs.
Resource Partitioning: Dividing the LLC into discrete segments for simultaneously active processes stops cache state leakage, though it reduces performance for memory-heavy applications.
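The resource-partitioning idea can be sketched as way partitioning of a single cache set. The sizes and partition assignment below are assumptions for illustration; real LLCs apply this per set across the whole cache:

```python
# Sketch of LLC way partitioning: each process owns a private subset of
# the ways in a set, so one process's fills can never evict another's
# lines, closing the prime-and-probe channel for that set.
WAYS = 4
PARTITION = {"attacker": range(0, 2), "victim": range(2, 4)}

cache_set = [None] * WAYS      # one cache set with WAYS lines

def fill(owner, tag):
    """Place a line, evicting only within the owner's own partition."""
    ways = PARTITION[owner]
    for w in ways:                             # use a free way if any...
        if cache_set[w] is None:
            cache_set[w] = (owner, tag)
            return
    cache_set[list(ways)[0]] = (owner, tag)    # ...else evict within it

fill("attacker", "A0"); fill("attacker", "A1")   # prime
fill("victim", "secret")                         # victim fill is confined
# Probe: both attacker lines survive, so no information leaks.
print(all(cache_set[w] == ("attacker", f"A{w}") for w in range(2)))  # True
```

The performance cost mentioned above follows directly from this sketch: a memory-heavy victim is limited to its two ways even when the attacker's partition sits idle.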