Initializes the buffer cache.

  • The buffer cache stores cached copies of disk blocks in memory
  • NBUF is MAXOPBLOCKS * 3, so this source has 30 buffers

bcache is the global buffer-cache state.

FieldMeaning
lockProtects cache-wide metadata.
buf[NBUF]Fixed array of 30 buffer objects.
headDummy list node for the LRU list.
head.nextPoints toward the most recently used buffer.
head.prevPoints toward the least recently used buffer.

Each struct buf represents one cached disk block.

FieldMeaning
validSays whether data contains real disk contents.
diskSet while the virtio disk driver owns the buffer during I/O.
devDisk device number.
blocknoBlock number on that device.
lockSleeplock for this buffer’s contents.
refcntNumber of active users of this buffer.
prev, nextLinks for the LRU list.
data[BSIZE]Cached disk block contents.

bcache has two kinds of locks.

  • bcache.lock is a spinlock for the cache-wide linked list and reference counts
  • Each buffer has its own sleeplock b->lock
  • The spinlock protects short metadata updates
  • The sleeplock protects one buffer’s block contents while code may sleep during disk I/O

binit builds the cache list.

  • bcache.lock is initialized
  • bcache.head is set up as a dummy list node
  • Each buffer is inserted after bcache.head
  • Each buffer’s sleeplock is initialized

The buffer cache is used later by block reads and writes.

  • bget looks for a cached block by (dev, blockno)
  • If the block is missing, bget recycles the least recently used unused buffer
  • bread returns a locked buffer and reads from disk if the buffer is not valid
  • bwrite writes a locked buffer to disk
  • brelse releases a buffer and moves it to the most recently used end when no one else holds it

After this, the block cache has RAM buffers ready. Actual disk I/O still waits for virtio_disk_init, because the virtio driver is what tells the disk device to transfer block contents into those buffers.

File System Layers Diagram

---
config:
  layout: dagre
---
flowchart TB

  classDef file fill:#FFFFFF,stroke:#111,stroke-width:3px,color:#111
  classDef source fill:#E9F1FF,stroke:#111,stroke-width:2px,color:#111
  classDef iface fill:#EDE7D4,stroke:#111,stroke-width:2px,color:#111
  classDef backend fill:#F8F8F8,stroke:#111,stroke-width:2px,color:#111

 subgraph MF["inode-table interface / contract"]
    direction TB
        LOCK_FUNCS["inode locking interface<br/><br/>ilock<br/>iunlock"]:::iface
        ICACHE_FUNCS["inode reference/cache interface<br/><br/>iinit<br/>iget<br/>idup<br/>iput<br/>iunlockput<br/>ireclaim"]:::iface
 end

 subgraph M["in-memory inode state"]
    direction LR
        ICACHE["itable<br/><br/>in-use cached inodes"]:::source
        INODE["struct inode<br/><br/>dev<br/>inum<br/>ref<br/>lock<br/>valid<br/>type<br/>major / minor<br/>nlink<br/>size<br/>addrs[]"]:::file
        SB_MEM["struct superblock sb<br/><br/>cached disk-layout metadata"]:::source
 end

 subgraph BF["files / directories / names interface"]
    direction TB
        PATH_FUNCS["Names layer interface<br/><br/>skipelem<br/>namex<br/>namei<br/>nameiparent"]:::iface
        DIR_FUNCS["Directories layer interface<br/><br/>namecmp<br/>dirlookup<br/>dirlink"]:::iface
        DATA_FUNCS["Files layer: content interface<br/><br/>bmap<br/>readi<br/>writei"]:::iface
        META_FUNCS["Files layer: inode interface<br/><br/>ialloc<br/>iupdate<br/>itrunc<br/>stati"]:::iface
 end

 subgraph LF["log transaction interface"]
    direction TB
        LOG_FUNCS["transaction boundary interface<br/><br/>initlog<br/>begin_op<br/>end_op"]:::iface
        LOG_WRITE_FUNCS["physical redo-log interface<br/><br/>log_write<br/>commit<br/>write_log<br/>write_head<br/>install_trans<br/>recover_from_log<br/>read_head"]:::iface
 end

 subgraph LM["in-memory log state"]
    direction LR
        LOG_STATE["struct log<br/><br/>lock<br/>start<br/>outstanding<br/>committing<br/>dev<br/>lh"]:::source
        LOG_HDR_MEM["logheader in memory<br/><br/>n<br/>block[]"]:::source
 end

 subgraph DF["raw block + superblock interface"]
    direction TB
        SUPER_FUNCS["superblock loading interface<br/><br/>readsb<br/>fsinit"]:::iface
        BLOCK_FUNCS["raw block allocator interface<br/><br/>bzero<br/>balloc<br/>bfree"]:::iface
 end

 subgraph BCI["buffer cache interface"]
    direction TB
        BC_FUNCS["buffer cache API interface<br/><br/>binit<br/>bread<br/>bwrite<br/>brelse<br/>bpin<br/>bunpin"]:::iface
 end

 subgraph BCS["buffer cache state"]
    direction LR
        BCACHE["bcache LRU list<br/><br/>cached disk blocks in memory"]:::source
        BUF["struct buf<br/><br/>valid<br/>disk<br/>dev<br/>blockno<br/>refcnt<br/>lock<br/>prev / next<br/>data[BSIZE]"]:::source
 end

 subgraph D["on-disk file-system source + device"]
    direction LR
        SUPER_DISK["super block<br/><br/>disk layout metadata"]:::source
        LOG_DISK["log blocks<br/><br/>header + logged block copies"]:::source
        DINODE["inode blocks<br/><br/>array of struct dinode"]:::source
        BITMAP["free bitmap blocks<br/><br/>block allocation bits"]:::source
        DATA_BLOCKS["data blocks<br/><br/>file bytes<br/>directory dirents<br/>indirect blocks"]:::source
        DISK_DEVICE["virtio block device"]:::backend
 end

    MF -->|keeps references in| M

    BF -->|uses locked inodes from| M
    BF -->|wraps modifying calls in| LF
    BF -->|gets disk blocks via| BC_FUNCS

    DF -->|logs bitmap/superblock writes through| LF
    DF -->|gets raw disk blocks via| BC_FUNCS

    LF -->|tracks active transaction in| LM
    LF -->|write_log writes home blocks to log<br>install_trans copies committed blocks back home| BC_FUNCS

    BC_FUNCS -->|returns locked buffers from| BCACHE
    BCACHE -->|linked list of| BUF
    BUF -->|transferred by virtio_disk_rw<br>reads from and writes to| DISK_DEVICE

    ICACHE -->|holds| INODE
    LOG_STATE -->|contains| LOG_HDR_MEM

struct dinode is not a .bss object.

  • struct dinode is only the on-disk inode layout
  • The type definition does not allocate global RAM by itself
  • Disk inode bytes live inside fs.img
  • When a disk inode block is read, those bytes sit inside buf.data
  • Filesystem code can then treat part of buf.data as a struct dinode
  • struct inode is the cached in-memory working copy stored in itable.inode[NINODE]

struct dinode is the on-disk inode format.

FieldMeaning
typeFile type. A zero type means the disk inode is free.
major, minorDevice numbers for device files.
nlinkNumber of directory links to this inode.
sizeFile size in bytes.
addrs[NDIRECT+1]Direct block addresses plus one indirect block address.

itable is the global inode-table state.

FieldMeaning
lockProtects allocation and lookup of inode-table entries.
inode[NINODE]Fixed array of in-memory inode objects.

Each struct inode is an in-memory copy of one on-disk inode.

FieldMeaning
devDevice containing the inode.
inumInode number on that device.
refNumber of active references to this in-memory inode.
lockSleeplock for this inode’s contents.
validSays whether inode contents have been loaded from disk.
typeFile type copied from disk.
major, minorDevice numbers for device inodes.
nlinkNumber of directory links to this inode.
sizeFile size in bytes.
addrs[NDIRECT+1]Direct and indirect data-block addresses.

Boot Sequence

The Filesystem

Filesystem image layout:

As xv6 boots, the build creates the filesystem image.

  • User programs are compiled first, such as _init, _sh, and other commands
  • mkfs runs on the host machine, not inside xv6
  • mkfs creates fs.img, the virtual disk image passed to QEMU
  • fs.img contains the xv6 filesystem layout
  • Compiled user programs are packed into fs.img
  • /init exists inside fs.img before the kernel ever boots
  • QEMU starts xv6 with both the kernel image and fs.img

Root directory inode and its directory block:

Example user program inode:

Later, the virtio driver reads fs.img, and the first process execs /init.