Fundamentals of Virtual Memory

  • Virtual memory divides physical memory into blocks, allocating them to distinct processes to enable memory sharing and process protection.
  • Relocation: The hardware dynamically maps virtual addresses to physical addresses, allowing a program to be loaded anywhere in physical main memory or secondary storage.
  • Page Fault / Address Fault: Occurs when a referenced item is not present in main memory. The missing block is moved from disk to memory.
  • Page faults are managed entirely by the operating system (OS) in software. The processor performs a context switch to another task while the high-latency disk access completes.

Address Space Structuring: Paging vs. Segmentation

  • Paging: Divides the address space into fixed-size blocks called pages (typically 4096–8192 bytes).
    • Addresses consist of a virtual page number and a page offset.
    • Causes internal fragmentation (unused memory within an allocated page).
  • Segmentation: Divides the address space into variable-size blocks called segments (from 1 byte up to, e.g., 2^32 bytes).
    • Addresses require two distinct words: a segment number and an offset within the segment.
    • Causes external fragmentation and makes block replacement difficult due to contiguous memory requirements.
  • Hybrid Approaches:
    • Paged segments: Segments are composed of an integral number of pages, simplifying replacement.
    • Multiple page sizes: Supports a base page size plus larger pages whose sizes are power-of-2 multiples of the base size.
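The address split described above can be sketched in a few lines. This is an illustrative example, not any real ISA's layout: the 4 KiB page size and the function name `split_virtual_address` are assumptions.

```python
# Sketch: splitting a virtual address under paging (hypothetical 4 KiB pages).
PAGE_SIZE = 4096                            # 2**12 bytes
OFFSET_BITS = PAGE_SIZE.bit_length() - 1    # 12 offset bits

def split_virtual_address(va: int) -> tuple[int, int]:
    """Return (virtual page number, page offset) for a virtual address."""
    vpn = va >> OFFSET_BITS          # high bits select the page
    offset = va & (PAGE_SIZE - 1)    # low bits select the byte within it
    return vpn, offset
```

Because the page size is a power of 2, the split is a pure shift-and-mask with no arithmetic; this is one reason paging addresses need only one word while segmented addresses need two.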

The Four Memory Hierarchy Questions Applied to Virtual Memory

  • Placement (Fully Associative): Because disk miss penalties are exorbitant, operating systems minimize miss rates by placing pages anywhere in main memory.
  • Identification (Page Tables): The OS locates blocks using a data structure indexed by the virtual page number, yielding the physical page address.
    • Paging: The offset is directly concatenated to the physical page address.
    • Segmentation: The offset is added to the segment’s physical base address.
    • Inverted Page Table: Applies a hashing function to the virtual address, reducing the page table size to the number of physical pages rather than virtual pages.
  • Replacement (Least Recently Used - LRU): The OS approximates LRU replacement to minimize page faults.
    • Processors provide a use bit or reference bit that is set upon access.
    • The OS periodically clears and records these bits to identify unreferenced pages for replacement.
  • Write Strategy (Write-Back): Writes update main memory and are only written to disk upon replacement.
    • A dirty bit tracks if the page has been altered, ensuring only modified blocks are written to the high-latency disk.
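The use-bit scheme above is commonly implemented as a "clock" (second-chance) sweep. The sketch below is a minimal illustration; the function name `clock_victim` and the list-of-booleans representation are assumptions, not any particular OS's code.

```python
# Sketch of the "clock" scheme an OS can use to approximate LRU with use bits.
# Hardware sets a frame's use (reference) bit on access; the OS sweeps the
# frames, clearing bits and evicting the first page found unreferenced.
def clock_victim(use_bits: list[bool], hand: int) -> tuple[int, int]:
    """Return (victim frame index, new hand position)."""
    n = len(use_bits)
    while True:
        if use_bits[hand]:
            use_bits[hand] = False   # referenced: clear bit, give a second chance
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n   # unreferenced since last sweep: evict
```

A recently used page survives one full sweep, which is why this approximates (rather than implements) true LRU.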

Translation Lookaside Buffer (TLB)

  • Page tables reside in main memory, so translating every address through the page table would double memory access time (one access for the translation, one for the data).
  • A TLB is a dedicated cache for address translations; on a hit it eliminates the page-table memory access, which covers the majority of references.
  • TLB entries contain: a tag (virtual address bits), the physical page frame number, protection fields, a valid bit, a use bit, and a dirty bit.
  • A dirty bit in the TLB indicates that the corresponding memory page has been modified, not that the TLB entry itself is modified.
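The hit/miss flow can be sketched with the TLB modeled as a small map in front of the page table. All names here (`translate`, `tlb`, `page_table`) are illustrative assumptions; real TLBs are finite, set-associative hardware structures with tags, valid bits, and protection fields as listed above.

```python
# Sketch: a TLB modeled as a dict in front of the page table (4 KiB pages assumed).
OFFSET_BITS = 12

def translate(va: int, tlb: dict, page_table: dict) -> int:
    vpn, offset = va >> OFFSET_BITS, va & 0xFFF
    if vpn in tlb:                  # TLB hit: no extra memory access
        pfn = tlb[vpn]
    else:                           # TLB miss: consult the in-memory page table
        pfn = page_table[vpn]       # a KeyError here would model a page fault
        tlb[vpn] = pfn              # refill the TLB for future references
    return (pfn << OFFSET_BITS) | offset
```

Note that the offset bits pass through untranslated; only the virtual page number is looked up.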

Cache Optimization with Virtual Memory

Virtual Caches

  • Virtual caches use virtual addresses for both the index and the tag comparison, eliminating address translation time on a cache hit.
  • Challenges of Virtual Caches:
    • Context Switches: Virtual addresses map to different physical addresses across processes, typically requiring cache flushes.
    • Aliasing/Synonyms: Multiple virtual addresses mapping to the same physical address can result in inconsistent data copies within the cache.
  • Hardware & Software Solutions:
    • Process-Identifier Tags (PID): Appending a PID to the cache tag allows the cache to retain data across process switches without flushing.
    • Antialiasing Hardware: Guarantees a unique physical address for every cache block.
    • Page Coloring: Software restricts aliases to share identical lower address bits, effectively increasing the page offset.
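Page coloring can be pictured as binning physical pages by the index bits that lie above the page offset. The cache and page sizes below are assumptions chosen for illustration, as is the function name `color`.

```python
# Sketch: page coloring for a hypothetical 32 KiB direct-mapped cache with
# 4 KiB pages. The cache index then uses 3 bits above the page offset, giving
# 8 "colors"; the OS assigns frames so all aliases of a page share a color.
CACHE_SIZE, PAGE_SIZE = 32 * 1024, 4096
COLORS = CACHE_SIZE // PAGE_SIZE     # 8 colors here

def color(addr: int) -> int:
    return (addr // PAGE_SIZE) % COLORS

# Two addresses can map to the same cache set only if their colors match, so
# same-colored aliases always collide onto one copy instead of two.
```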

Virtually Indexed, Physically Tagged (VIPT) Caches

  • VIPT caches use the page offset (which is identical in both virtual and physical addresses) to index the cache.
  • The virtual portion of the address is translated by the TLB in parallel with the cache read. The resulting physical address is used for the tag comparison.
  • Sizing Constraint: A direct-mapped VIPT cache cannot exceed the page size. To support a larger cache without translating the index, associativity must be increased so the index still fits within the page offset: 2^Index = Cache size / (Block size × Set associativity).
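The constraint reduces to requiring cache size ÷ associativity ≤ page size. A small sketch of the arithmetic (the function name `min_associativity` and the example sizes are assumptions):

```python
# Sketch: the VIPT sizing constraint. The index plus block-offset bits must fit
# inside the page offset, so (cache_size / associativity) must not exceed the
# page size; growing the cache beyond that forces higher associativity.
def min_associativity(cache_size: int, page_size: int) -> int:
    """Smallest power-of-2 associativity letting a VIPT cache of cache_size
    be indexed entirely by untranslated page-offset bits."""
    assoc = 1
    while cache_size // assoc > page_size:
        assoc *= 2
    return assoc

# e.g., a 64 KiB VIPT cache with 4 KiB pages needs at least 16-way associativity.
```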

Selecting a Page Size

  • Advantages of Larger Page Sizes:
    • Decreases page table size.
    • Permits larger virtually indexed caches.
    • Increases secondary storage transfer efficiency.
    • Reduces TLB misses by mapping more memory per entry.
  • Advantages of Smaller Page Sizes:
    • Reduces internal fragmentation (wasted memory within a page).
    • Decreases process start-up time.
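The first advantage of larger pages is easy to quantify for a flat table. This single-level model and the name `num_ptes` are simplifying assumptions (real tables are hierarchical, as the AMD64 section below shows):

```python
# Sketch: page size vs. page-table size, using a flat one-level table.
def num_ptes(va_bits: int, page_size: int) -> int:
    """Entries needed to map a va_bits-wide address space."""
    return 2 ** va_bits // page_size

# Doubling the page size halves the entry count:
# 32-bit space with 4 KiB pages -> 2**20 entries; with 8 KiB pages -> 2**19.
```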

Process Protection and Sharing

  • Process: A running program and its requisite execution state. Multi-programming requires rapid context switches between processes.
  • Protection Boundaries: Dedicated page tables per process prevent unauthorized interference.
  • Protection Rings: Concentric privilege levels. Inner (kernel) rings can access data in all outer rings; outer (user) rings have highly restricted access.
  • Capabilities: Unforgeable keys explicitly passed between programs to securely grant access rights.

Architectures for Virtual Memory

IA-32 (Intel Pentium): Segmented Virtual Memory

  • Protection Levels: Implements four levels of protection rings, utilizing separate stacks for each level to prevent security breaches.
  • Descriptor Tables: Segment registers hold an index to a descriptor table rather than a base address. Half the address space is shared (Global Descriptor Table) and half is private (Local Descriptor Table).
  • Segment Descriptors (PTE equivalent): Contain a Present bit, Base field, Access bit, Attributes, and a Limit field that enforces strict upper-bound offset checks.
  • Call Gates: Special segment descriptors that define strictly controlled entry points for executing more privileged code.
    • Call gates safely copy parameters across privilege-boundary stacks based on the descriptor's word count field.
    • Hardware sets a requested privilege level (RPL) field so the OS cannot be tricked into exercising its trusted access on behalf of untrusted parameters.

AMD64 (Opteron): Paged Virtual Memory

  • Address Space: Utilizes a flat, 64-bit address space with segment bases set to zero. Implements 48-bit virtual addresses mapped to 40-bit physical addresses.
  • Canonical Form: Hardware requires the upper 16 bits of a virtual address to be the sign extension of the lower 48 bits.
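The canonical-form rule means bits 63:48 must all equal bit 47 (a sign extension). A sketch of the check (the function name `is_canonical` is an assumption; hardware performs this test on every address):

```python
# Sketch: AMD64 canonical-form check. Bits 63:48 of a 64-bit virtual address
# must be copies of bit 47, so bits 63:47 are either all zeros or all ones.
def is_canonical(va: int) -> bool:
    top = va >> 47                             # bits 63:47, i.e. 17 bits
    return top == 0 or top == (1 << 17) - 1    # all zeros or all ones
```

This reserves the noncanonical "hole" in the middle of the 64-bit space, so software cannot hide data in unused upper address bits that a future, wider implementation would interpret.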
  • Page Tables: Employs a 4-level hierarchical page table structure. Each table fits exactly within a 4 KiB page.
  • Page Table Entry (PTE): 64-bit entries containing a 52-bit physical page frame and 12 bits of state.
    • State fields include: Presence, Read/Write, User/Supervisor, Dirty, Accessed, Page Size, No Execute, Cache Disable, and Write-Through.
  • TLB Hierarchy: Uses four TLBs (two L1 and two larger L2 structures) separated for instructions and data to minimize the translation penalty incurred by 4-level page walks.
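The 4-level walk consumes the 48-bit virtual address as four 9-bit table indices plus a 12-bit page offset (9 × 4 + 12 = 48). A sketch of the index extraction (the function name `walk_indices` is an assumption; it models only the address arithmetic, not the memory accesses of a real walk):

```python
# Sketch: slicing a 48-bit AMD64 virtual address into the four 9-bit indices
# used by the hierarchical page walk, assuming 4 KiB pages.
def walk_indices(va: int) -> tuple[list[int], int]:
    offset = va & 0xFFF                            # 12-bit page offset
    indices = [(va >> (12 + 9 * level)) & 0x1FF    # 9 bits per level
               for level in (3, 2, 1, 0)]          # top-level table first
    return indices, offset

# Each 9-bit index selects one of 2**9 = 512 eight-byte entries, so every
# table occupies exactly one 4 KiB page, matching the note above.
```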