Virtual memory divides physical memory into blocks, allocating them to distinct processes to enable memory sharing and process protection.
Relocation: The hardware dynamically maps virtual addresses to physical addresses, allowing a program to be loaded anywhere in physical main memory or secondary storage.
Page Fault / Address Fault: Occurs when a referenced item is not present in main memory. The missing block is moved from disk to memory.
Page faults are managed entirely by the operating system (OS) in software. The processor performs a context switch to another task while the high-latency disk access completes.
Address Space Structuring: Paging vs. Segmentation
Paging: Divides the address space into fixed-size blocks called pages (typically 4096–8192 bytes).
Addresses consist of a virtual page number and a page offset.
Causes internal fragmentation (unused memory within an allocated page).
Segmentation: Divides the address space into variable-size blocks called segments (ranging from 1 byte to 2^32 bytes).
Addresses require two distinct words: a segment number and an offset within the segment.
Causes external fragmentation and makes block replacement difficult due to contiguous memory requirements.
Hybrid Approaches:
Paged segments: Segments are composed of an integral number of pages, simplifying replacement.
Multiple page sizes: Supports base pages plus larger sizes that are power-of-2 multiples of the base page size.
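The paging address split above can be sketched briefly. This is a minimal illustration assuming 4 KiB pages (a 12-bit offset); the function name is hypothetical:

```python
PAGE_SIZE = 4096                          # 4 KiB pages => 12-bit page offset
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # log2(4096) = 12

def split_virtual_address(va):
    """Split a virtual address into (virtual page number, page offset)."""
    return va >> OFFSET_BITS, va & (PAGE_SIZE - 1)

vpn, offset = split_virtual_address(0x12345678)
# vpn = 0x12345, offset = 0x678
```

Because pages are a power of 2 in size, the split is pure bit selection; segmentation, by contrast, needs a genuine (segment, offset) pair since segment sizes vary.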
The Four Memory Hierarchy Questions Applied to Virtual Memory
Placement (Fully Associative): Because disk miss penalties are exorbitant, operating systems minimize miss rates by placing pages anywhere in main memory.
Identification (Page Tables): The OS locates blocks using a data structure indexed by the virtual page number, yielding the physical page address.
Paging: The offset is directly concatenated to the physical page address.
Segmentation: The offset is added to the segment’s physical base address.
Inverted Page Table: Applies a hashing function to the virtual address, reducing the page table size to the number of physical pages rather than virtual pages.
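The page-table identification step for paging can be sketched as follows. This is a simplified model, not a real OS structure: the table is a hypothetical dictionary from virtual page number to physical frame number, and a missing entry stands in for a page fault.

```python
PAGE_SIZE = 4096
OFFSET_BITS = 12

# Hypothetical per-process page table: virtual page number -> physical frame number.
page_table = {0x12345: 0xABC}

def translate(va):
    vpn = va >> OFFSET_BITS
    offset = va & (PAGE_SIZE - 1)
    if vpn not in page_table:
        # Page fault: the OS would fetch the page from disk and retry.
        raise RuntimeError("page fault")
    frame = page_table[vpn]
    # Paging: the offset is concatenated to the physical page address, not added.
    return (frame << OFFSET_BITS) | offset

translate(0x12345678)  # -> 0xABC678
```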
Replacement (Least Recently Used - LRU): The OS approximates LRU replacement to minimize page faults.
Processors provide a use bit or reference bit that is set upon access.
The OS periodically clears and records these bits to identify unreferenced pages for replacement.
Write Strategy (Write-Back): Writes update main memory and are only written to disk upon replacement.
A dirty bit tracks if the page has been altered, ensuring only modified blocks are written to the high-latency disk.
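The use-bit and dirty-bit mechanics above are commonly combined in a clock ("second chance") sweep, a standard LRU approximation. A minimal sketch, with hypothetical class and function names; in a real system the use and dirty bits are set by hardware on access and write:

```python
class Page:
    """Per-page state; hardware sets use on access and dirty on write."""
    def __init__(self, vpn):
        self.vpn = vpn
        self.use = False
        self.dirty = False

def clock_victim(pages, hand):
    """Sweep circularly, clearing use bits, until an unreferenced page is found."""
    while True:
        page = pages[hand]
        if not page.use:
            if page.dirty:
                # Write-back policy: flush the modified page to disk before reuse.
                page.dirty = False
            return hand
        page.use = False                  # second chance: clear the bit, move on
        hand = (hand + 1) % len(pages)
```

A clean (not dirty) victim can simply be discarded, which is exactly why the dirty bit avoids needless high-latency disk writes.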
Translation Lookaside Buffer (TLB)
Page tables reside in main memory. Address translation directly through a page table doubles memory access time (one access for translation, one for data).
A TLB is a dedicated cache for address translations that eliminates the second memory access (the page-table lookup) for the majority of references.
TLB entries contain: a tag (virtual address bits), the physical page frame number, protection fields, a valid bit, a use bit, and a dirty bit.
A dirty bit in the TLB indicates that the corresponding memory page has been modified, not that the TLB entry itself is modified.
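The TLB entry format above can be modeled with a small fully associative lookup. This is an illustrative sketch with hypothetical names, not any particular processor's TLB:

```python
class TLBEntry:
    """One translation: tag (virtual page number), frame, and status bits."""
    def __init__(self, tag, frame, valid=True, dirty=False):
        self.tag, self.frame = tag, frame
        self.valid, self.dirty = valid, dirty
        self.use = False

def tlb_lookup(tlb, vpn):
    for e in tlb:                # fully associative: compare against every entry
        if e.valid and e.tag == vpn:
            e.use = True         # reference bit for the replacement policy
            return e.frame       # hit: no page-table access needed
    return None                  # miss: walk the page table in main memory
```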
Cache Optimization with Virtual Memory
Virtual Caches
Virtual caches use virtual addresses for both the index and the tag comparison, eliminating address translation time on a cache hit.
Challenges of Virtual Caches:
Context Switches: Virtual addresses map to different physical addresses across processes, typically requiring cache flushes.
Aliasing/Synonyms: Multiple virtual addresses mapping to the same physical address can result in inconsistent data copies within the cache.
Hardware & Software Solutions:
Process-Identifier Tags (PID): Appending a PID to the cache tag allows the cache to retain data across process switches without flushing.
Antialiasing Hardware: Guarantees a unique physical address for every cache block.
Page Coloring: Software restricts aliases to share identical lower address bits, effectively increasing the page offset.
Virtually indexed, physically tagged (VIPT) caches use the page offset (which is identical in both virtual and physical addresses) to index the cache.
The virtual portion of the address is translated by the TLB in parallel with the cache read. The resulting physical address is used for the tag comparison.
Sizing Constraint: A direct-mapped VIPT cache cannot exceed the page size. To support a larger cache without translating the index, associativity must be increased according to the index formula: 2^Index = Cache size / (Block size × Set associativity).
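The sizing constraint can be turned into a quick calculation. A small sketch (the function name is hypothetical): requiring the index and block-offset bits to fit inside the page offset reduces, via the index formula, to Cache size / associativity ≤ Page size, with the block size cancelling out.

```python
import math

def min_vipt_associativity(cache_size, page_size):
    """Smallest set associativity letting the page offset alone index the cache."""
    # From 2^Index = Cache size / (Block size x Set associativity), the
    # constraint index bits + block-offset bits <= page-offset bits becomes
    # cache_size / associativity <= page_size.
    return max(1, math.ceil(cache_size / page_size))

min_vipt_associativity(16 * 1024, 4 * 1024)  # 16 KiB cache, 4 KiB pages -> 4-way
```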
Selecting a Page Size
Advantages of Larger Page Sizes:
Decreases page table size.
Permits larger virtually indexed caches.
Increases secondary storage transfer efficiency.
Reduces TLB misses by mapping more memory per entry.
Advantages of Smaller Page Sizes:
Reduces internal fragmentation (wasted memory within a page).
Decreases process start-up time.
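The page-table-size advantage of larger pages is easy to quantify for a flat (single-level) table, a simplifying assumption used only for this sketch:

```python
def page_table_entries(va_bits, page_size):
    """Entries in a flat page table covering a va_bits-wide address space."""
    offset_bits = page_size.bit_length() - 1   # log2 of the page size
    return 1 << (va_bits - offset_bits)

page_table_entries(32, 4 * 1024)    # 4 KiB pages: 2**20 entries
page_table_entries(32, 64 * 1024)   # 64 KiB pages: 2**16 entries, 16x smaller
```

A 16× larger page shrinks the flat table 16×, at the cost of more internal fragmentation per page.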
Process Protection and Sharing
Process: A running program and its requisite execution state. Multi-programming requires rapid context switches between processes.
Protection Boundaries: Dedicated page tables per process prevent unauthorized interference.
Protection Rings: Concentric privilege levels. Code in inner (kernel) rings can access data in all outer rings; code in outer (user-level) rings has highly restricted access.
Capabilities: Unforgeable keys explicitly passed between programs to securely grant access rights.
Architectures for Virtual Memory
IA-32 (Intel Pentium): Segmented Virtual Memory
Protection Levels: Implements four levels of protection rings, utilizing separate stacks for each level to prevent security breaches.
Descriptor Tables: Segment registers hold an index to a descriptor table rather than a base address. Half the address space is shared (Global Descriptor Table) and half is private (Local Descriptor Table).
Segment Descriptors (PTE equivalent): Contain a Present bit, Base field, Access bit, Attributes, and a Limit field that enforces strict upper-bound offset checks.
Call Gates: Special segment descriptors that define strictly controlled entry points for executing more privileged code.
Call gates safely transfer parameters across privilege boundary stacks based on a descriptor’s word count field.
The hardware records the caller's privilege level in a requested privilege level (RPL) field so that the OS cannot be tricked into using its trusted access rights on behalf of untrusted parameters.
AMD64 (Opteron): Paged Virtual Memory
Address Space: Utilizes a flat, 64-bit address space with segment bases set to zero. Implements 48-bit virtual addresses mapped to 40-bit physical addresses.
Canonical Form: Hardware requires the upper 16 bits of a virtual address to be the sign extension of the lower 48 bits.
Page Tables: Employs a 4-level hierarchical page table structure. Each table fits exactly within a 4 KiB page.
Page Table Entry (PTE): 64-bit entries containing a 52-bit physical page frame and 12 bits of state.
State fields include: Presence, Read/Write, User/Supervisor, Dirty, Accessed, Page Size, No Execute, Cache Disable, and Write-Through.
TLB Hierarchy: Uses four TLBs (two L1 and two larger L2 structures) separated for instructions and data to minimize the translation penalty incurred by 4-level page walks.
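The canonical-form rule above amounts to a single bit-pattern check: bits 63..47 of a 64-bit address must be all zeros or all ones. A minimal sketch (hypothetical function name):

```python
def is_canonical(va, va_bits=48):
    """AMD64 canonical form: the upper bits must sign-extend bit va_bits-1."""
    top = va >> (va_bits - 1)            # bit 47 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

is_canonical(0x00007FFFFFFFFFFF)  # highest canonical user address -> True
is_canonical(0x0000800000000000)  # non-canonical "hole" -> False
```

Any access to a non-canonical address faults, which keeps the unused upper address bits from being silently repurposed by software.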