binit initializes the buffer cache — a fixed pool of in-memory copies of disk blocks. Disk I/O is extremely slow compared to memory access, so instead of reading from disk every time, the kernel keeps recently-used blocks cached in RAM.

The data structure

struct {
  struct spinlock lock;
  struct buf buf[NBUF];
  struct buf head;
} bcache;

NBUF is 30 (param.h defines it as MAXOPBLOCKS*3), so there are exactly 30 buffer slots — the cache never grows. Each struct buf holds one disk block's worth of data plus metadata — which device, which block number, whether it's valid (contains real data from disk), a reference count, and a sleeplock.

The buffers are organized into a doubly-linked list sorted by recency. head is a dummy node — it doesn’t hold data, it’s just the anchor. head.next is the most recently used buffer, head.prev is the least recently used.

What binit does

First it initializes the spinlock protecting the cache. Then it sets up the linked list. It starts by making head point to itself in both directions — an empty circular list. Then it loops through all 30 buffers and inserts each one right after head, initializing each buffer’s sleeplock along the way. After the loop, all 30 buffers are linked together in a circular doubly-linked list hanging off head. None of them contain valid data yet — they’re all empty slots waiting to be used.

How the cache works (the rest of the file)

bread is the main entry point — “give me the contents of block N from device D.” It calls bget, which first scans the list looking for a buffer that already holds that block. If found, it increments the reference count, acquires the buffer’s sleeplock, and returns it — a cache hit, no disk I/O needed.

If the block isn’t cached, bget needs to evict something. It scans from the tail of the list (least recently used) looking for a buffer with refcnt == 0 — meaning nobody is currently using it. It repurposes that buffer for the new block, marks it invalid (the data is stale, from whatever block was previously cached there), and returns it. If every buffer is in use, there’s nothing to evict and the kernel panics (“bget: no buffers”). Back in bread, seeing valid == 0, it calls virtio_disk_rw to actually read the block from disk, then marks it valid.

This is classic LRU (Least Recently Used) caching. The list ordering tracks recency, and eviction always picks the oldest unused buffer.

bwrite writes a buffer’s contents back to disk via virtio_disk_rw. The caller must hold the buffer’s sleeplock — bwrite panics if it doesn’t.

brelse releases a buffer when you’re done with it. It releases the sleeplock, then decrements the reference count under the cache lock, and if nobody else is using the buffer, it moves it to the front of the list — marking it as most recently used. This is what maintains the LRU ordering. Buffers you touched recently drift to the front, buffers nobody has touched in a while drift to the back, and eviction always grabs from the back.

Why sleeplocks instead of spinlocks

Each buffer has a sleeplock, not a spinlock. Disk I/O is slow — milliseconds. If a process holds a buffer while waiting for a disk read to complete, you don’t want another process spinning the CPU waiting for it. A sleeplock lets the waiting process sleep and yield the CPU, then get woken up when the buffer is released. Spinlocks are for short critical sections (microseconds). Sleeplocks are for long ones (milliseconds).

The bcache.lock spinlock protects the list structure itself — scanning the list, moving nodes around. That’s fast. The per-buffer sleeplocks protect the buffer contents — reading from disk, modifying data, writing back. That’s slow.

bpin and bunpin

Simple reference count manipulation. Some code needs to keep a buffer pinned in the cache (so it doesn’t get evicted) without holding the sleeplock. These bump the reference count up and down under the cache spinlock.