What qemu to use?
What toolkit to use?
- cc: riscv compiler
- gas: assembly code into object files
- linker
- objcopy - convert ELF into binary
- objdump - inspect
These are the compiler flags used when building xv6 C code. They tell GCC:
compile strict C,
for 64-bit RISC-V,
without assuming a normal operating system or libc,
and keep debug info useful for kernel debugging.
Here is the breakdown.
Warning and debugging flags
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -Wall | Enable many common compiler warnings. | Catches suspicious code early. |
| -Werror | Treat warnings as errors. | Forces the kernel to build cleanly. |
| -Wno-unknown-attributes | Do not warn about unknown attributes. | Avoids toolchain-version noise. |
| -O | Enable basic optimization. | Produces reasonable code without aggressive optimization. |
| -fno-omit-frame-pointer | Keep frame pointer registers. | Makes stack traces and debugging easier. |
| -ggdb | Generate GDB-friendly debug info. | Helps debug xv6 with GDB. |
| -gdwarf-2 | Use DWARF version 2 debug format. | Keeps debug info compatible/simple. |
The debug-related ones matter a lot in xv6 because you often debug at the assembly/register level.
Target architecture
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -march=rv64gc | Generate code for 64-bit RISC-V with common extensions. | xv6-riscv runs on a 64-bit RISC-V machine. |
rv64gc means:
rv64 = 64-bit RISC-V
g = shorthand for the general-purpose set IMAFD (base integer, multiply/divide, atomics, single- and double-precision floating point)
c = compressed instruction extension
In current versions of the specification, g also covers the Zicsr and Zifencei extensions. xv6 mostly cares that this matches what the QEMU RISC-V CPU and the toolchain expect.
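As a rough illustration of how the rv64gc string breaks down, the sketch below expands it into its parts. The function name decode_march and the lookup tables are hypothetical, and the code only handles the simple single-letter form used here, not the full ISA-string grammar.

```python
# Minimal sketch: expand a simple -march string such as "rv64gc".
# decode_march and the tables below are illustrative names, not part of any toolchain.

# Single-letter extensions relevant to this note.
EXTENSIONS = {
    "i": "base integer instructions",
    "m": "integer multiply/divide",
    "a": "atomics",
    "f": "single-precision floating point",
    "d": "double-precision floating point",
    "c": "compressed (16-bit) instructions",
}

# "g" is shorthand for a bundle of extensions
# (newer specs also fold in Zicsr and Zifencei; omitted here for brevity).
G_BUNDLE = ["i", "m", "a", "f", "d"]


def decode_march(march):
    """Return (register width, list of single-letter extensions)."""
    if not march.startswith("rv"):
        raise ValueError("expected a string starting with 'rv'")
    digits = "".join(ch for ch in march[2:4] if ch.isdigit())
    width = int(digits)
    letters = []
    for ch in march[2 + len(digits):]:
        expanded = G_BUNDLE if ch == "g" else [ch]
        for ext in expanded:
            if ext not in letters:
                letters.append(ext)
    return width, letters


width, exts = decode_march("rv64gc")
print(width, exts)
for ext in exts:
    print(ext, "=", EXTENSIONS[ext])
```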
Dependency generation
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -MD | Generate .d dependency files while compiling. | Lets make know which headers each .o depends on. |
Example:
kernel/proc.c
includes kernel/types.h
includes kernel/param.h
includes kernel/proc.h
With -MD, GCC emits a dependency file (kernel/proc.d) next to the object file. Once the Makefile includes those .d files, a change to proc.h makes proc.o rebuild automatically.
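To make the idea of a dependency file concrete, here is a small sketch that reads the prerequisites out of one. The sample text only imitates the shape of what -MD writes (a real proc.d lists many more headers), and parse_depfile is a made-up helper name.

```python
# Minimal sketch: read the prerequisites out of a Make-style dependency file.
# The sample text is illustrative; a real proc.d lists many more headers.

SAMPLE_D = """kernel/proc.o: kernel/proc.c kernel/types.h kernel/param.h \\
 kernel/proc.h
"""


def parse_depfile(text):
    """Return {target: [prerequisites]} for simple 'target: deps' rules."""
    # Join backslash-continued lines into one logical line.
    joined = text.replace("\\\n", " ")
    rules = {}
    for line in joined.splitlines():
        if ":" not in line:
            continue
        target, deps = line.split(":", 1)
        rules[target.strip()] = deps.split()
    return rules


rules = parse_depfile(SAMPLE_D)
print(rules["kernel/proc.o"])
```

Make reads the same information directly: every file on the right-hand side becomes a prerequisite of proc.o.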
RISC-V code model
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -mcmodel=medany | Address code and data relative to the program counter, so the program may sit anywhere as long as it fits in a 2 GiB window. | xv6 is linked at 0x80000000, not near address zero. |
This one is important.
The medlow code model (the usual default) assumes code and data sit within about 2 GiB of address zero, so that absolute addresses can be built with a short instruction pair. xv6’s kernel lives at a high physical address:
0x80000000
medany tells the compiler to compute addresses relative to the current program counter instead. The kernel can then be linked at any address, provided its code and data fit within a single 2 GiB range.
Without the correct code model, the generated addressing sequences may fail to link or may compute wrong addresses for the kernel’s link address.
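The difference between the two code models can be sketched as a range check. This is a simplification under stated assumptions: the exact reach depends on the instruction pairs the compiler emits, and the helper names below are invented for illustration. Only the 0x80000000 link address comes from the text.

```python
# Minimal sketch: is an address inside the window that the "medlow" code
# model can reach with absolute addressing (roughly -2 GiB .. +2 GiB around 0)?
# KERNBASE is xv6's link address from the text; the helper names are made up.

KERNBASE = 0x80000000
TWO_GIB = 1 << 31


def as_signed64(addr):
    """Interpret a 64-bit address as a signed integer."""
    addr &= (1 << 64) - 1
    return addr - (1 << 64) if addr >= (1 << 63) else addr


def medlow_can_reach(addr):
    """True if addr lies in [-2 GiB, +2 GiB), the approximate medlow window."""
    return -TWO_GIB <= as_signed64(addr) < TWO_GIB


def medany_can_reach(pc, addr):
    """True if addr is within about 2 GiB of the code that refers to it."""
    return -TWO_GIB <= as_signed64(addr) - as_signed64(pc) < TWO_GIB


print(medlow_can_reach(0x1000))                        # True
print(medlow_can_reach(KERNBASE))                      # False
print(medany_can_reach(KERNBASE, KERNBASE + 0x20000))  # True
```

On a 64-bit machine 0x80000000 sits just past the top of the medlow window, which is why the kernel needs PC-relative addressing.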
Freestanding kernel environment
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -ffreestanding | Compile for a freestanding environment, not hosted C. | The kernel is the OS; there is no libc/normal runtime underneath. |
| -nostdlib | Do not link against standard libraries/startup files. | xv6 provides its own runtime, syscalls, printing, memory helpers, etc. |
This is one of the biggest conceptual differences from normal C programs.
A normal C program is “hosted”:
program runs inside Linux/macOS
libc exists
startup code calls main()
malloc/printf/memcpy exist
OS provides services
xv6 kernel code is freestanding:
no libc
no normal program startup
no host OS underneath
no default malloc/printf/memcpy
kernel provides its own world
Global variable behavior
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -fno-common | Tentative global definitions become real definitions, not mergeable common symbols. | Catches accidental duplicate global variables. |
This helps prevent bugs like putting this in a header:
int counter;
and including it in many .c files.
With -fno-common, each of those tentative definitions becomes a real definition, so the link fails with a multiple-definition error instead of the copies being silently merged.
Better style:
// header
extern int counter;
// one .c file
int counter;
For kernel code, this is good because accidental duplicate globals are nasty.
Disable compiler built-ins
These flags are all variations of the same idea:
-fno-builtin-strncpy
-fno-builtin-strncmp
-fno-builtin-strlen
-fno-builtin-memset
-fno-builtin-memmove
-fno-builtin-memcmp
-fno-builtin-log
-fno-builtin-bzero
-fno-builtin-strchr
-fno-builtin-exit
-fno-builtin-malloc
-fno-builtin-putc
-fno-builtin-free
-fno-builtin-memcpy
-fno-builtin-printf
-fno-builtin-fprintf
-fno-builtin-vprintf
They tell GCC:
Do not treat these names as special compiler-known library functions.
Use xv6’s definitions or normal calls instead.
Why?
Because GCC knows many standard C library functions by name. Even without including libc headers, the compiler may optimize calls to functions like memcpy, strlen, printf, or malloc based on assumptions from normal C environments.
But xv6 is not a normal C environment.
xv6 has its own implementations of things like:
memset
memmove
memcmp
strlen
printf
malloc/free in user space
The compiler must not silently replace or reinterpret these calls using hosted-libc assumptions.
Example problem:
memcpy(dst, src, n);
A normal compiler might think:
I know what memcpy means.
I can optimize this specially.
I may emit inline instructions or assume libc semantics.
xv6 says:
No. Treat memcpy as xv6’s function, not as a magical builtin.
So these flags prevent unwanted compiler cleverness.
-Wno-main
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -Wno-main | Do not warn that main has an unusual signature or usage. | Kernel/user startup code may not match normal hosted C expectations. |
In normal C programs, main has expected signatures like:
int main(void)
int main(int argc, char **argv)
But in kernel or tiny user-runtime contexts, startup conventions can be different. xv6 disables this warning.
Include path
| Flag | Meaning | Why xv6 uses it |
|---|---|---|
| -I. | Add current directory to header search path. | Allows includes relative to xv6 source root. |
This lets code include headers from the project tree cleanly.
For example:
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
depending on which file is being compiled.
Stack protector check
CFLAGS += $(shell $(CC) -fno-stack-protector -E -x c /dev/null >/dev/null 2>&1 && echo -fno-stack-protector)
This means:
Ask the compiler: do you support -fno-stack-protector?
If yes, add -fno-stack-protector to CFLAGS.
If no, add nothing.
So the actual added flag is usually:
-fno-stack-protector
What is stack protector?
Modern compilers often add stack-smashing protection to functions. They insert hidden checks using a “stack canary.”
Normal compiled code might become:
function starts
put secret canary on stack
function returns
check canary was not overwritten
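if overwritten, call failure handler
That is useful in normal applications.
But in xv6, this creates a problem: the compiler-inserted checks refer to runtime support symbols (with GCC, typically __stack_chk_guard and __stack_chk_fail) that a normal libc provides. xv6 does not provide a normal libc/runtime environment, so the link would fail unless the kernel defined them itself.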
So xv6 disables it.
Conceptually:
Do not insert hidden stack-protection runtime calls.
This is a tiny kernel with its own runtime.
Compact table
| Flag | Short meaning |
|---|---|
| -Wall | Enable many warnings. |
| -Werror | Warnings become errors. |
| -Wno-unknown-attributes | Ignore unknown attribute warnings. |
| -O | Basic optimization. |
| -fno-omit-frame-pointer | Keep frame pointers for debugging. |
| -ggdb | GDB debug info. |
| -gdwarf-2 | DWARF v2 debug format. |
| -march=rv64gc | Target 64-bit RISC-V. |
| -MD | Generate dependency files. |
| -mcmodel=medany | Addressing model suitable for high kernel address. |
| -ffreestanding | No hosted C environment assumptions. |
| -fno-common | Catch duplicate global definitions. |
| -nostdlib | Do not link standard library/runtime. |
| -fno-builtin-* | Do not treat libc names as compiler built-ins. |
| -Wno-main | Do not warn about nonstandard main. |
| -I. | Search current source tree for headers. |
| -fno-stack-protector | Avoid hidden stack canary runtime dependency. |
The big picture:
These flags make GCC behave like a kernel compiler:
strict warnings,
RISC-V target,
debuggable output,
no libc assumptions,
no hidden runtime dependencies,
and address generation suitable for xv6’s memory layout.
Some GCC toolchains build position-independent executables by default. xv6 does not want that. If this compiler supports disabling PIE, add the right flags.
Compact summary
| Piece | Meaning |
|---|---|
| PIE | Position Independent Executable |
| -fno-pie | Compiler: do not generate PIE-style code |
| -no-pie | Linker/driver: do not link as PIE |
| -nopie | Older/alternate spelling of -no-pie |
| -dumpspecs | Ask GCC what options/default specs it knows |
| ifneq (...,) | If shell command output is non-empty |
| Purpose | Keep xv6 fixed-address and simple |
Final mental model:
Modern Linux GCC may default to PIE for security. xv6 needs fixed-address kernel/user binaries. So the Makefile detects whether the compiler supports disabling PIE and adds the right no-PIE flags.
Linker flags
LDFLAGS = -z max-page-size=4096
This is a linker flag. It tells the linker:
When laying out the final binary, use 4096 bytes as the maximum page size.
In xv6 terms:
4096 bytes = 0x1000 = one xv6/RISC-V page
Why does the linker care about page size?
When the linker builds an ELF file, it creates loadable segments such as:
text/code segment
read-only data segment
data segment
bss segment
ELF segments have alignment requirements. On some toolchains, the linker may choose a large default maximum page size, like:
2 MiB
That can make the linker insert huge padding/alignment gaps between parts of the kernel image.
For xv6, that is wasteful and confusing because xv6 expects a simple 4 KiB page model.
So this flag says:
Do not align ELF segments using some huge default page size.
Use 4096-byte pages.
Why 4096?
Because xv6 uses 4 KiB pages:
#define PGSIZE 4096
So this linker flag matches the kernel’s memory/page-table model:
xv6 page size = 4096 bytes
linker page size = 4096 bytes
hardware page size = 4096 bytes
That keeps the kernel image layout compact and predictable.
What could happen without it?
Without this flag, the linker might create a kernel ELF where segments are aligned to a larger page boundary.
Conceptually:
.text
↓
huge padding gap
↓
.rodata
↓
huge padding gap
↓
.data
That can make the kernel image larger than expected or shift sections in ways that make the layout less clean.
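The size of such a gap is plain alignment arithmetic. The sketch below compares rounding up to a 4 KiB boundary with rounding up to a 2 MiB boundary. TEXT_END is an invented address, not taken from a real build, and whether a given linker actually emits this padding depends on the toolchain and linker script.

```python
# Minimal sketch: how much padding alignment can introduce between segments.
# TEXT_END is a made-up address, not taken from a real xv6 build.

PAGE_4K = 4096
PAGE_2M = 2 * 1024 * 1024
TEXT_END = 0x80008123  # hypothetical end of the text segment


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


for name, page in (("4 KiB", PAGE_4K), ("2 MiB", PAGE_2M)):
    start = align_up(TEXT_END, page)
    print(name, hex(start), "padding:", start - TEXT_END, "bytes")
```

With these numbers the 4 KiB case wastes under one page, while the 2 MiB case leaves a gap of almost two megabytes.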
For a tiny teaching kernel, xv6 wants:
code
then trampoline page
then rodata
then data
then bss
not:
code
then megabytes of linker padding
then rodata
then more padding
then data
What does -z mean?
-z introduces a keyword option that the linker interprets itself.
So:
-z max-page-size=4096
means:
Set the linker’s maximum page size to 4096.
This is not a C compiler behavior flag. It affects the link stage, when object files are combined into the final kernel binary.
Compact summary
| Part | Meaning |
|---|---|
| LDFLAGS | Flags passed to the linker. |
| -z | Linker-specific option prefix. |
| max-page-size=4096 | Use 4 KiB max page alignment for ELF segments. |
| Why xv6 wants it | xv6 uses 4 KiB pages and wants compact/predictable layout. |
Mental model:
C/assembly files
↓ compile
object files
↓ link with LDFLAGS
kernel ELF laid out with 4 KiB page alignment
So this flag keeps the linker’s idea of page alignment consistent with xv6’s 4096-byte page world.
Next step: linking the kernel
This is the Makefile rule that creates the final xv6 kernel binary.
$K/kernel: $(OBJS) $K/kernel.ld
	$(LD) $(LDFLAGS) -T $K/kernel.ld -o $K/kernel $(OBJS)
	$(OBJDUMP) -S $K/kernel > $K/kernel.asm
	$(OBJDUMP) -t $K/kernel | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d' > $K/kernel.sym
Read it as:
To build kernel/kernel,
you need all kernel object files
and the kernel linker script.
First line: the rule header
$K/kernel: $(OBJS) $K/kernel.ld
Since:
K=kernel
this means:
kernel/kernel: $(OBJS) kernel/kernel.ld
So the target is:
kernel/kernel
That is the final linked kernel executable.
The dependencies are:
all kernel object files
kernel/kernel.ld
So if any .o file changes, or if kernel.ld changes, Make rebuilds kernel/kernel.
Conceptually:
kernel/*.o + kernel/kernel.ld
↓
kernel/kernel
Second line: link the kernel
$(LD) $(LDFLAGS) -T $K/kernel.ld -o $K/kernel $(OBJS)
Expands roughly to:
riscv64-unknown-elf-ld \
-z max-page-size=4096 \
-T kernel/kernel.ld \
-o kernel/kernel \
kernel/entry.o kernel/start.o kernel/console.o ...
This is the actual linking step.
| Part | Meaning |
|---|---|
| $(LD) | The RISC-V linker. |
| $(LDFLAGS) | Linker flags, like -z max-page-size=4096. |
| -T kernel/kernel.ld | Use xv6’s linker script. |
| -o kernel/kernel | Output file name. |
| $(OBJS) | All compiled kernel object files. |
The linker combines all kernel .o files into one kernel image.
It also uses kernel.ld to decide:
start at 0x80000000
put _entry first
lay out .text, trampoline, .rodata, .data, .bss
define etext
define end
So this step creates:
kernel/kernel
That is the kernel binary QEMU will load.
Third line: create disassembly
$(OBJDUMP) -S $K/kernel > $K/kernel.asm
Expands roughly to:
riscv64-unknown-elf-objdump -S kernel/kernel > kernel/kernel.asm
This does not build the kernel. It creates a human-readable file:
kernel/kernel.asm
objdump -S means:
show disassembly, mixed with source code when debug info is available
So kernel.asm lets you inspect:
C source
RISC-V assembly generated from it
function addresses
machine-level control flow
This is useful for debugging and learning.
Example use cases:
What assembly did scheduler() compile into?
Where is _entry?
What address is usertrap?
What instruction caused a crash?
Fourth line: create symbol table file
$(OBJDUMP) -t $K/kernel | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d' > $K/kernel.sym
This creates:
kernel/kernel.sym
objdump -t kernel/kernel prints the symbol table.
The symbol table contains names and addresses, such as:
80000000 _entry
80001234 main
80004567 scheduler
80007890 usertrap
...
Then the sed command cleans the output.
Breaking down the symbol command
$(OBJDUMP) -t $K/kernel
means:
Print the symbol table from kernel/kernel.
Then:
sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$/d'
does some text filtering.
1,/SYMBOL TABLE/d
Delete everything from line 1 through the line containing SYMBOL TABLE.
So it removes objdump’s header text.
s/ .* / /
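Replace everything from the first space on the line through the last space with a single space. Because .* is greedy, the middle columns disappear and only the first field (the address) and the last field (the symbol name) remain.
Objdump symbol lines have several columns. xv6 wants a compact address/name style file.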
/^$/d
Delete empty lines.
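The three sed commands above can be re-expressed step by step in Python, which makes it easier to see what each one contributes. SAMPLE only imitates the general shape of objdump -t output. Its addresses and symbols are invented, and symbol_lines is a hypothetical helper name.

```python
# Minimal sketch: the three sed commands, re-expressed in Python.
# SAMPLE imitates `objdump -t` output; addresses and symbols are made up.
import re

SAMPLE = (
    "\n"
    "kernel/kernel:     file format elf64-littleriscv\n"
    "\n"
    "SYMBOL TABLE:\n"
    "0000000080000000 l    d  .text\t0000000000000000 .text\n"
    "0000000080000000 g       .text\t0000000000000000 _entry\n"
    "0000000080001000 g     F .text\t0000000000000040 main\n"
    "\n"
)


def symbol_lines(objdump_text):
    """Return 'address name' lines, like the sed pipeline produces."""
    lines = objdump_text.split("\n")
    # 1,/SYMBOL TABLE/d : drop everything up to and including the header line.
    for i, line in enumerate(lines):
        if "SYMBOL TABLE" in line:
            lines = lines[i + 1:]
            break
    else:
        lines = []  # sed would delete to end of input if the marker never appears
    out = []
    for line in lines:
        # s/ .* / / : first space through last space becomes one space.
        line = re.sub(r" .* ", " ", line, count=1)
        # /^$/d : drop empty lines.
        if line:
            out.append(line)
    return out


for line in symbol_lines(SAMPLE):
    print(line)
```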
In the Makefile, this appears as:
/^$$/d
because in Makefiles, $ has special meaning. To pass a literal $ to the shell/sed, Make needs $$.
So:
/^$$/d
becomes this for sed:
/^$/d
Meaning:
delete blank lines
Why generate kernel.sym?
The symbol file maps kernel names to addresses.
This is useful when debugging.
For example, if xv6 prints or GDB shows an address:
0x80003f12
you can use the symbol table to figure out:
that address is inside usertrap()
So:
kernel/kernel.asm = detailed assembly listing
kernel/kernel.sym = compact symbol/address map
Full output of this rule
This Makefile rule produces three important files:
| File | Purpose |
|---|---|
| kernel/kernel | Final linked xv6 kernel loaded by QEMU. |
| kernel/kernel.asm | Disassembly/source listing for inspection. |
| kernel/kernel.sym | Symbol/address table for debugging. |
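The address-to-function lookup described for kernel.sym can be sketched as a search for the nearest symbol at or below the address. SYM_TEXT stands in for the real file and its contents are invented. Because the .sym file carries no symbol sizes, "nearest preceding symbol" is only an approximation of "inside that function".

```python
# Minimal sketch: map an address to the nearest preceding symbol.
# SYM_TEXT stands in for kernel/kernel.sym; its contents are invented.
import bisect

SYM_TEXT = """\
0000000080000000 _entry
0000000080001000 main
0000000080002400 scheduler
0000000080003e00 usertrap
0000000080004200 usertrapret
"""


def load_symbols(text):
    """Parse 'hexaddr name' lines into a list sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            symbols.append((int(parts[0], 16), parts[1]))
    return sorted(symbols)


def lookup(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr."""
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


symbols = load_symbols(SYM_TEXT)
print(lookup(symbols, 0x80003F12))
```

The offset tells you how far into the function the address lies, which you can then match against kernel.asm.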
Big picture
kernel/entry.o
kernel/start.o
kernel/main.o
kernel/proc.o
kernel/vm.o
...
kernel/kernel.ld
↓ linker
kernel/kernel
↓ objdump -S
kernel/kernel.asm
↓ objdump -t + sed
kernel/kernel.sym
So this rule is the point where all compiled kernel pieces become one actual bootable kernel image.
Compilation
This is a pattern rule for building kernel assembly object files.
$K/%.o: $K/%.S
	$(CC) -march=rv64gc -g -c -o $@ $<
Since:
K=kernel
it means:
kernel/%.o: kernel/%.S
	$(CC) -march=rv64gc -g -c -o $@ $<
In plain English:
To build any kernel/foo.o,
if there is a matching kernel/foo.S,
compile/assemble kernel/foo.S into kernel/foo.o.
Examples:
kernel/entry.S → kernel/entry.o
kernel/swtch.S → kernel/swtch.o
kernel/trampoline.S → kernel/trampoline.o
kernel/kernelvec.S → kernel/kernelvec.o
What % means
% is a wildcard that matches any non-empty string; the matched part is called the stem.
So:
kernel/%.o
matches:
kernel/entry.o
kernel/swtch.o
kernel/trampoline.o
And:
kernel/%.S
means the matching source file:
kernel/entry.S
kernel/swtch.S
kernel/trampoline.S
So if Make needs kernel/swtch.o, it sees:
kernel/swtch.o: kernel/swtch.S
and runs the command.
What $@ and $< mean
These are automatic Make variables.
| Variable | Meaning | Example for kernel/swtch.o |
|---|---|---|
| $@ | Target being built | kernel/swtch.o |
| $< | First dependency/input | kernel/swtch.S |
So this command:
$(CC) -march=rv64gc -g -c -o $@ $<
becomes:
riscv64-unknown-elf-gcc -march=rv64gc -g -c -o kernel/swtch.o kernel/swtch.S
Why use $(CC) instead of $(AS)?
Even though this is assembly, xv6 uses gcc to build .S files.
That is normal.
There are two common assembly extensions:
| Extension | Meaning |
|---|---|
| .s | Raw assembly, sent directly to assembler. |
| .S | Assembly that is first run through the C preprocessor. |
Uppercase .S means the file can use preprocessor features like:
#include
#define
#ifdef
So GCC handles the preprocessing step, then invokes the assembler.
That is why this rule uses:
$(CC)
rather than directly using:
$(AS)
What each flag means
-march=rv64gc
Generate code for 64-bit RISC-V with common extensions.
-g
Include debug information.
-c
Compile/assemble only. Do not link.
So the output is an object file:
kernel/foo.o
not a final executable.
-o $@
Name the output file.
$<
Use the source assembly file as input.
Big picture
This rule turns low-level assembly files into object files so the linker can later combine them with C object files:
kernel/entry.S
kernel/swtch.S
kernel/trampoline.S
kernel/kernelvec.S
↓
assembly pattern rule
↓
kernel/entry.o
kernel/swtch.o
kernel/trampoline.o
kernel/kernelvec.o
↓
linker
↓
kernel/kernel
So this rule is specifically for xv6’s low-level assembly parts: boot entry, context switching, trap transition, and kernel trap vector.
The tags target
This Makefile rule builds an Emacs tags file for navigating xv6 source code.
tags: $(OBJS)
	etags kernel/*.S kernel/*.c
In plain English:
To build the target named tags,
first make sure the kernel object files exist,
then run etags over kernel assembly and C files.
What is tags?
tags is not part of the kernel.
It is a developer convenience target.
When you run:
make tags
it generates a file usually named:
TAGS
That file indexes functions, symbols, and definitions in the source code so an editor can jump around quickly.
For example, in Emacs you can put your cursor on:
scheduler
and jump to the definition of scheduler().
What is etags?
etags is a source-code indexing tool used mainly by Emacs.
It scans source files and records where definitions live.
This command:
etags kernel/*.S kernel/*.c
means:
Scan all kernel assembly files and all kernel C files.
Create a TAGS file for editor navigation.
So it includes files like:
kernel/entry.S
kernel/swtch.S
kernel/trampoline.S
kernel/main.c
kernel/proc.c
kernel/vm.c
kernel/trap.c
...
Why does tags depend on $(OBJS)?
tags: $(OBJS)
This says:
Before generating tags, build the kernel object files.
Strictly speaking, etags only needs the source files, not the .o files.
So this dependency is not conceptually necessary for source indexing. It is probably there so that:
make tags
also ensures the kernel source currently builds, or so generated/intermediate files are up to date before navigation.
But the actual tag generation command only reads:
kernel/*.S
kernel/*.c
Does this affect xv6 runtime?
No.
This has nothing to do with:
booting
linking
QEMU
filesystem image
syscalls
kernel execution
It is only for developer navigation.
Compact summary
| Part | Meaning |
|---|---|
| tags | Make target for code navigation. |
| $(OBJS) | Kernel object-file dependencies. |
| etags | Tool that generates Emacs TAGS file. |
| kernel/*.S kernel/*.c | Source files to index. |
| Runtime effect | None. Developer convenience only. |
Mental model:
kernel source files
↓ etags
TAGS file
↓
editor can jump to definitions
So this rule is just “make it easier to browse xv6 source code.”
User Library
This whole block is about building xv6 user programs, not the kernel.
The kernel becomes:
kernel/kernel
User programs become files like:
user/_sh
user/_ls
user/_cat
user/_init
Those _-prefixed binaries are later packed into fs.img by mkfs.
1. User-space mini library
ULIB = $U/ulib.o $U/usys.o $U/printf.o $U/umalloc.o
Since:
U=user
this means:
ULIB = user/ulib.o user/usys.o user/printf.o user/umalloc.o
This is xv6’s tiny user-space runtime library.
| Object file | Source | Purpose |
|---|---|---|
| user/ulib.o | user/ulib.c | Basic user helpers like string functions and wrappers. |
| user/usys.o | generated from user/usys.S | Syscall stubs that execute ecall. |
| user/printf.o | user/printf.c | User-space printf. |
| user/umalloc.o | user/umalloc.c | User-space malloc/free. |
Why does every user program need this?
Because xv6 user programs do not link against normal libc.
So a program like ls needs xv6’s own tiny support code:
user/ls.o
+ user/ulib.o
+ user/usys.o
+ user/printf.o
+ user/umalloc.o
↓
user/_ls
2. Generic rule for building user programs
_%: %.o $(ULIB) $U/user.ld
	$(LD) $(LDFLAGS) -T $U/user.ld -o $@ $< $(ULIB)
	$(OBJDUMP) -S $@ > $*.asm
	$(OBJDUMP) -t $@ | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d' > $*.sym
This is the main rule for linking user programs.
What does _%: %.o ... mean?
This is a pattern rule.
It says:
To build _something,
use something.o plus the user library.
Examples:
user/_ls from user/ls.o
user/_cat from user/cat.o
user/_sh from user/sh.o
user/_init from user/init.o
The leading underscore is important.
On your host machine, there is already a real ls, cat, sh, etc. So xv6’s compiled user binaries are named:
user/_ls
user/_cat
user/_sh
Then mkfs puts them into the xv6 filesystem without the leading underscore, so inside xv6 they appear as:
ls
cat
sh
Dependencies
_%: %.o $(ULIB) $U/user.ld
To build a user program, Make needs:
program object file
user mini-library objects
user linker script
For example:
user/_ls depends on:
user/ls.o
user/ulib.o
user/usys.o
user/printf.o
user/umalloc.o
user/user.ld
Linking command
$(LD) $(LDFLAGS) -T $U/user.ld -o $@ $< $(ULIB)
For user/_ls, this becomes roughly:
riscv64-unknown-elf-ld \
-z max-page-size=4096 \
-T user/user.ld \
-o user/_ls \
user/ls.o \
user/ulib.o user/usys.o user/printf.o user/umalloc.o
Meaning:
Link ls.o with the xv6 user library
using user/user.ld
and produce user/_ls.
What are $@, $<, and $*?
| Make variable | Meaning | Example for user/_ls |
|---|---|---|
| $@ | Target being built | user/_ls |
| $< | First dependency | user/ls.o |
| $* | Stem matched by % | user/ls |
So:
-o $@
means:
output to user/_ls
and:
$<
means:
use user/ls.o as the main object
Generate user program disassembly
$(OBJDUMP) -S $@ > $*.asm
For user/_ls, this becomes:
riscv64-unknown-elf-objdump -S user/_ls > user/ls.asm
It creates a human-readable disassembly/source file.
So:
user/_ls
↓
user/ls.asm
This is useful if you want to inspect how ls.c compiled into RISC-V assembly.
Generate user program symbol file
$(OBJDUMP) -t $@ | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d' > $*.sym
For user/_ls, this becomes:
riscv64-unknown-elf-objdump -t user/_ls \
| sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$/d' \
> user/ls.sym
It creates:
user/ls.sym
That file maps symbols to addresses.
Example conceptually:
00000000 main
000000a4 printf
00000120 write
Again, this is for debugging/inspection.
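To see how the three automatic variables line up for one target of the `_%: %.o` rule, here is a small sketch. It imitates GNU Make's behavior for this one rule shape only (a pattern without a slash, matched against the file name and then given its directory back). expand_user_rule is a made-up helper, not anything Make provides.

```python
# Minimal sketch: what $@, $< and $* become for the `_%: %.o` pattern rule.
# This imitates GNU Make's behavior for this one rule shape only.
import posixpath


def expand_user_rule(target):
    """Return the automatic variables for a target such as 'user/_ls'."""
    directory, base = posixpath.split(target)
    if not base.startswith("_") or len(base) < 2:
        raise ValueError("target does not match the pattern _%")
    # The pattern has no slash, so Make matches it against the file name
    # only and then puts the directory back in front of the stem.
    stem = posixpath.join(directory, base[1:])
    return {
        "$@": target,        # the target being built
        "$<": stem + ".o",   # first prerequisite, from %.o
        "$*": stem,          # the stem, including the directory
    }


v = expand_user_rule("user/_ls")
print("link output:", v["$@"])
print("main object:", v["$<"])
print("listing:    ", v["$*"] + ".asm")
print("symbols:    ", v["$*"] + ".sym")
```

This is why the listing and symbol files end up as user/ls.asm and user/ls.sym, without the underscore.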
3. Generate user/usys.S
$U/usys.S : $U/usys.pl
	perl $U/usys.pl > $U/usys.S
Expands to:
user/usys.S : user/usys.pl
	perl user/usys.pl > user/usys.S
This says:
Generate user/usys.S from user/usys.pl.
usys.pl is a Perl script that prints assembly code.
That generated assembly contains syscall wrappers.
For example, conceptually it creates functions like:
fork:
li a7, SYS_fork
ecall
ret
write:
li a7, SYS_write
ecall
ret
exit:
li a7, SYS_exit
ecall
ret
The actual syscall calling convention is:
arguments go in registers like a0, a1, a2...
syscall number goes in a7
ecall enters the kernel
return value comes back in a0
So if user code does:
write(1, "hi\n", 3);
it calls the generated write stub in usys.S.
Then:
write() wrapper
↓
load syscall number into a7
↓
ecall
↓
kernel trap path
4. Compile user/usys.S into user/usys.o
$U/usys.o : $U/usys.S
	$(CC) $(CFLAGS) -c -o $U/usys.o $U/usys.S
Expands to:
user/usys.o : user/usys.S
	$(CC) $(CFLAGS) -c -o user/usys.o user/usys.S
This compiles/assembles the generated syscall stubs.
Flow:
user/usys.pl
↓ Perl generates
user/usys.S
↓ compiler/assembler
user/usys.o
↓ linked into every user program
This object is part of ULIB, so every xv6 user program gets syscall wrappers.
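The generation step, a script that prints one stub per syscall, can be mimicked in a few lines. This is a sketch of the idea rather than a port of usys.pl: the syscall list is a small sample, and the .global directive and the #include line are assumptions about what such a file needs (the stubs must be visible to the linker, and the SYS_ constants must come from somewhere).

```python
# Minimal sketch: generate syscall stubs in the style described for usys.pl.
# The syscall list is a small sample; the real script covers every syscall.

SAMPLE_SYSCALLS = ["fork", "exit", "write"]


def stub(name):
    """Return the assembly text for one syscall wrapper."""
    return "\n".join([
        ".global " + name,       # assumption: exported so user code can link to it
        name + ":",
        " li a7, SYS_" + name,   # syscall number goes in a7
        " ecall",                # trap into the kernel
        " ret",                  # result is already in a0
        "",
    ])


def generate(names):
    """Return a whole usys.S-style file as one string."""
    header = '#include "kernel/syscall.h"\n'  # assumed source of the SYS_ constants
    return header + "".join(stub(n) for n in names)


print(generate(SAMPLE_SYSCALLS))
```

Adding a new syscall wrapper then amounts to adding one name to the list and regenerating the file.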
5. Special rule for forktest
$U/_forktest: $U/forktest.o $(ULIB)
	# forktest has less library code linked in - needs to be small
	# in order to be able to max out the proc table.
	$(LD) $(LDFLAGS) -N -e main -Ttext 0 -o $U/_forktest $U/forktest.o $U/ulib.o $U/usys.o
	$(OBJDUMP) -S $U/_forktest > $U/forktest.asm
This is a special case.
Normally, user programs link with all of ULIB:
ulib.o
usys.o
printf.o
umalloc.o
But forktest links only:
forktest.o
ulib.o
usys.o
It intentionally omits:
printf.o
umalloc.o
Why?
Because forktest is designed to stress the process table by creating as many processes as possible.
If the program is too large, each process consumes more memory. Then the test may run out of memory before it actually maxes out the process table.
So xv6 keeps forktest tiny.
Special linker flags for forktest
-N -e main -Ttext 0
| Flag | Meaning |
|---|---|
| -N | Mark text and data as readable and writable, and do not page-align the data segment, so no alignment padding is added. |
| -e main | Set entry point to main. |
| -Ttext 0 | Start text/code at virtual address 0. |
This creates a very small/simple executable.
The output is:
user/_forktest
Then:
$(OBJDUMP) -S $U/_forktest > $U/forktest.asm
creates:
user/forktest.asm
Complete user build flow
For a normal user program like ls:
user/ls.c
↓ compile
user/ls.o
user/usys.pl
↓ Perl
user/usys.S
↓ compile
user/usys.o
user/ulib.c → user/ulib.o
user/printf.c → user/printf.o
user/umalloc.c → user/umalloc.o
user/ls.o + ULIB + user/user.ld
↓ link
user/_ls
↓ objdump
user/ls.asm
user/ls.sym
For forktest:
user/forktest.o + user/ulib.o + user/usys.o
↓ special tiny link
user/_forktest
Big idea
The kernel build and user build are parallel but separate:
kernel/*.c, kernel/*.S
↓
kernel/*.o
↓
kernel/kernel
user/*.c, generated user/usys.S
↓
user/*.o
↓
user/_init, user/_sh, user/_ls, ...
↓
mkfs packs them into fs.img
This Makefile block is the part that turns xv6 user-space source code into RISC-V executables that the xv6 kernel can later load with exec.
mkfs
mkfs/mkfs: mkfs/mkfs.c $K/fs.h $K/param.h
	gcc -Wno-unknown-attributes -I. -o mkfs/mkfs mkfs/mkfs.c
This rule builds the mkfs host tool.
mkfs is not part of the xv6 kernel and it is not an xv6 user program. It runs on your real machine during the build.
Its job is:
compiled xv6 user programs
↓
mkfs
↓
fs.img
Then QEMU gives fs.img to xv6 as its virtual disk.
Rule header
mkfs/mkfs: mkfs/mkfs.c $K/fs.h $K/param.h
This means:
To build mkfs/mkfs, Make needs:
mkfs/mkfs.c
kernel/fs.h
kernel/param.h
Since:
K=kernel
this expands to:
mkfs/mkfs: mkfs/mkfs.c kernel/fs.h kernel/param.h
So if any of these change, mkfs/mkfs must be rebuilt.
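The "if any of these change, rebuild" decision boils down to a timestamp comparison. The sketch below shows that basic test with invented numbers standing in for file modification times. needs_rebuild is a hypothetical helper, and real Make applies further rules on top of this (phony targets, order-only prerequisites, and so on).

```python
# Minimal sketch: Make's basic "is the target out of date?" test.
# The timestamps are invented numbers standing in for file modification times.


def needs_rebuild(target_mtime, prerequisite_mtimes):
    """True if the target is missing or older than any prerequisite."""
    if target_mtime is None:  # target does not exist yet
        return True
    return any(m > target_mtime for m in prerequisite_mtimes.values())


prereqs = {
    "mkfs/mkfs.c": 100,
    "kernel/fs.h": 100,
    "kernel/param.h": 100,
}
print(needs_rebuild(150, prereqs))   # False: mkfs/mkfs is newer than all inputs

prereqs["kernel/fs.h"] = 200         # someone edits fs.h
print(needs_rebuild(150, prereqs))   # True: fs.h is now newer than mkfs/mkfs
```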
Why does mkfs depend on kernel/fs.h?
Because mkfs must create a disk image in exactly the format the xv6 kernel understands.
kernel/fs.h defines the on-disk filesystem format:
block size
superblock layout
inode layout
directory entry format
bitmap math
filesystem constants
So both sides must agree:
mkfs/mkfs.c
writes fs.img using fs.h layout
kernel/fs.c
reads fs.img using fs.h layout
If fs.h changes, the format may change, so mkfs needs rebuilding.
Why does mkfs depend on kernel/param.h?
Because filesystem sizes and constants can depend on global xv6 parameters.
For example, things like filesystem size, log size, inode counts, or related constants may come from param.h.
So:
param.h changes
↓
filesystem constants may change
↓
mkfs must rebuild
Build command
gcc -Wno-unknown-attributes -I. -o mkfs/mkfs mkfs/mkfs.c
Notice this uses plain:
gcc
not:
$(CC)
That is intentional.
mkfs runs on the host machine, so it must be compiled with the host compiler.
mkfs/mkfs.c
↓ host gcc
mkfs/mkfs
↓ runs on your laptop
creates fs.img
If your laptop is x86-64, mkfs/mkfs is an x86-64 program.
If your laptop is ARM, it is an ARM program.
But the user binaries it packs into fs.img are RISC-V binaries.
Why not use the RISC-V compiler?
Because then mkfs/mkfs would become a RISC-V executable, and your host machine could not directly run it during the build.
Wrong mental model:
mkfs should run inside xv6
Correct mental model:
mkfs runs before xv6 boots
mkfs creates the disk image xv6 will later read
So:
kernel/user programs → RISC-V compiler
mkfs tool → host compiler
-Wno-unknown-attributes
Suppress warnings about compiler attributes the host compiler may not recognize.
This keeps mkfs building cleanly across different host compilers/toolchains.
-I.
Add the current project root as an include path.
This lets mkfs/mkfs.c include headers like:
#include "kernel/fs.h"
#include "kernel/param.h"
or similar project-relative headers.
-o mkfs/mkfs
Name the output executable:
mkfs/mkfs
So the build creates a host executable at that path.
.PRECIOUS
# Prevent deletion of intermediate files, e.g. cat.o, after first build, so
# that disk image changes after first build are persistent until clean.
.PRECIOUS: %.o
This is a Make behavior rule.
It tells Make:
Do not automatically delete .o intermediate files.
What are intermediate files?
Sometimes Make builds a target through chained implicit rules.
For example:
user/cat.c
↓
user/cat.o
↓
user/_cat
If Make considers user/cat.o only an intermediate file, it may delete it after building user/_cat.
.PRECIOUS: %.o says:
Keep .o files around.
Why does xv6 care?
Because the disk image depends on compiled user programs.
The build path is:
user/cat.c
↓
user/cat.o
↓
user/_cat
↓
fs.img
If Make deleted the intermediate .o files after each build, its dependency tracking could behave in surprising ways on later builds.
The comment says this helps make disk image changes after the first build persistent until make clean.
In plain English:
Keep object files around so incremental builds behave predictably.
Only remove them when the user explicitly runs make clean.
Why .PRECIOUS specifically?
In GNU Make, .PRECIOUS has two effects:
1. Do not delete the target if the build is interrupted.
2. Do not delete it automatically if Make thinks it is intermediate.
Here xv6 mainly cares about the second effect.
So:
.PRECIOUS: %.o
means:
All .o files are precious.
Do not auto-delete them.
Compact summary
mkfs/mkfs rule:
builds the host-side filesystem image creator
uses plain gcc:
because mkfs runs on your real machine
depends on fs.h:
because mkfs must write the same filesystem format the kernel reads
depends on param.h:
because filesystem/kernel constants may affect the image layout
.PRECIOUS: %.o:
tells Make to keep object files around for stable incremental builds
Big picture:
RISC-V compiler:
kernel/kernel
user/_init
user/_sh
user/_cat
...
host gcc:
mkfs/mkfs
mkfs/mkfs:
reads user/_init, user/_sh, user/_cat, ...
writes fs.img
QEMU:
boots kernel/kernel
attaches fs.img as disk
User programs
This block defines which xv6 user programs get built and inserted into the filesystem image.
UPROGS=\
$U/_cat\
$U/_echo\
$U/_forktest\
$U/_grep\
$U/_init\
$U/_kill\
$U/_ln\
$U/_ls\
$U/_mkdir\
$U/_rm\
$U/_sh\
$U/_stressfs\
$U/_usertests\
$U/_grind\
$U/_wc\
$U/_zombie\
$U/_logstress\
$U/_forphan\
$U/_dorphan\
Since:
U=user
this expands conceptually to:
user/_cat
user/_echo
user/_forktest
user/_grep
user/_init
...
These are compiled xv6 user binaries.
Why the leading underscore?
On your host machine, names like cat, echo, grep, ls, mkdir, rm, and sh already exist as normal Unix/Linux commands.
So xv6 names its compiled user binaries with a leading underscore on the host:
user/_ls
user/_cat
user/_sh
But inside xv6, they appear without the underscore:
ls
cat
shSo:
host filename: user/_ls
inside xv6 file: /lsmkfs handles this convention when it writes the files into fs.img.
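The renaming step can be sketched in a few lines. This is a Python illustration, not the C code that mkfs is actually written in; the function name xv6_name is made up here, and the sketch assumes the only transformations are dropping the directory part and one leading underscore.

```python
import posixpath

def xv6_name(host_path: str) -> str:
    """Map a host-side path to the file name used inside fs.img (sketch)."""
    name = posixpath.basename(host_path)  # "user/_ls" -> "_ls"
    if name.startswith("_"):
        name = name[1:]                   # "_ls" -> "ls"
    return name

for path in ["user/_ls", "user/_sh", "README"]:
    print(path, "->", "/" + xv6_name(path))
```

Files without an underscore, such as README, keep their name unchanged.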
What is UPROGS?
UPROGS means:
user programs to include in the xv6 filesystem image
This list is not just “programs to compile.” It is specifically the list of programs that should exist inside the xv6 disk image.
So if you write a new user program:
user/hello.c
you usually add:
$U/_hello\
to UPROGS.
Then the build can produce:
user/_hello
and mkfs will put it into fs.img.
What each program is
| Program | Source file | Purpose |
|---|---|---|
| $U/_cat | user/cat.c | Print file contents. |
| $U/_echo | user/echo.c | Print command-line arguments. |
| $U/_forktest | user/forktest.c | Stress-test fork and process table size. |
| $U/_grep | user/grep.c | Search text for matching patterns. |
| $U/_init | user/init.c | First user process started by the kernel. |
| $U/_kill | user/kill.c | Request killing a process by PID. |
| $U/_ln | user/ln.c | Create a hard link. |
| $U/_ls | user/ls.c | List directory contents. |
| $U/_mkdir | user/mkdir.c | Create a directory. |
| $U/_rm | user/rm.c | Remove a file. |
| $U/_sh | user/sh.c | xv6 shell. |
| $U/_stressfs | user/stressfs.c | Stress-test filesystem writes. |
| $U/_usertests | user/usertests.c | Large user/kernel behavior test suite. |
| $U/_grind | user/grind.c | Stress-test process/filesystem/syscall interactions. |
| $U/_wc | user/wc.c | Count lines, words, and bytes. |
| $U/_zombie | user/zombie.c | Demonstrate zombie process behavior. |
| $U/_logstress | user/logstress.c | Stress-test filesystem logging. |
| $U/_forphan | user/forphan.c | Test reclaiming an orphaned file (unlinked while still open). |
| $U/_dorphan | user/dorphan.c | Same kind of test for an orphaned directory. |
The fs.img rule
fs.img: mkfs/mkfs README $(UPROGS)
	mkfs/mkfs fs.img README $(UPROGS)
This rule creates the xv6 filesystem image.
Rule header
fs.img: mkfs/mkfs README $(UPROGS)
This means:
To build fs.img, Make needs:
mkfs/mkfs
README
all user programs in UPROGS
So fs.img gets rebuilt if any of these change:
mkfs/mkfs changes
README changes
user/_cat changes
user/_sh changes
user/_init changes
...
Build command
mkfs/mkfs fs.img README $(UPROGS)
This runs the host-side mkfs tool.
Expanded conceptually:
mkfs/mkfs fs.img README user/_cat user/_echo user/_forktest user/_grep user/_init ...
Meaning:
Create fs.img
and put README plus all UPROGS into it.
So the output is:
fs.img
That file is a raw xv6 filesystem image.
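The "rebuilt if any of these change" condition from the rule header comes down to a timestamp comparison. The sketch below is a simplified model of that decision, not Make's actual implementation; needs_rebuild and its arguments are hypothetical names, and timestamps are plain numbers.

```python
def needs_rebuild(target_mtime, prereq_mtimes):
    """Return True if the target is missing or older than any prerequisite.

    target_mtime is None when the target file does not exist yet.
    """
    if target_mtime is None:
        return True
    return any(mtime > target_mtime for mtime in prereq_mtimes)

# fs.img built at t=100; mkfs/mkfs, README, and user/_sh modified at these times.
print(needs_rebuild(100, [50, 60, 70]))   # nothing newer than fs.img
print(needs_rebuild(100, [50, 60, 120]))  # user/_sh was rebuilt after fs.img
print(needs_rebuild(None, [50, 60, 70]))  # fs.img does not exist yet
```

In this model, relinking a single user program is enough to make fs.img stale, which matches the dependency list in the rule header.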
Build flow
For a user program like ls:
user/ls.c
↓ compile
user/ls.o
↓ link with ULIB
user/_ls
↓ mkfs
/ls inside fs.img
For the whole image:
README
user/_cat
user/_echo
user/_forktest
user/_grep
user/_init
user/_kill
user/_ln
user/_ls
user/_mkdir
user/_rm
user/_sh
user/_stressfs
user/_usertests
user/_grind
user/_wc
user/_zombie
user/_logstress
user/_forphan
user/_dorphan
↓
mkfs/mkfs
↓
fs.img
Why fs.img must include _init
This one is critical.
The kernel eventually tries to start the first user process:
/init
That file comes from:
user/_init
inside fs.img.
If _init is missing from UPROGS, xv6 may boot the kernel but fail to start normal user space.
Runtime path:
kernel boots
↓
kernel initializes filesystem
↓
kernel tries to execute /init
↓
/init starts /sh
↓
you get a shell
So _init and _sh are especially important.
Important distinction
There are three kinds of files here:
user/*.c
source code for xv6 user programs
user/_*
compiled RISC-V user binaries on the host filesystem
fs.img
xv6 filesystem image containing those binaries as files
So:
user/sh.c
↓
user/_sh
↓
fs.img contains /sh
↓
xv6 runs /sh
Compact summary
UPROGS:
list of compiled xv6 user binaries to include in fs.img
fs.img rule:
runs mkfs to create the xv6 disk image
mkfs/mkfs:
host program that writes fs.img
fs.img:
virtual disk used by xv6 inside QEMU
Big picture:
user programs are built first
↓
mkfs packs them into fs.img
↓
QEMU gives fs.img to xv6 as a disk
↓
xv6 loads /init from fs.img
↓
init starts sh
Next step
This block has two parts:
-include kernel/*.d user/*.d
and:
clean:
	rm -f ...
The first part helps incremental rebuilds. The second part deletes generated build artifacts.
-include kernel/*.d user/*.d
-include kernel/*.d user/*.d
This tells Make:
Include dependency files generated by the compiler,
but do not complain if they do not exist yet.
Earlier, xv6 uses this compiler flag:
-MD
That makes GCC generate .d files alongside .o files.
For example:
kernel/proc.c
↓ compile
kernel/proc.o
kernel/proc.d
The .o file is the compiled object file.
The .d file records header dependencies, something like:
kernel/proc.o: kernel/proc.c kernel/types.h kernel/param.h kernel/proc.h kernel/riscv.h
So Make learns:
If proc.h changes, rebuild proc.o.
If riscv.h changes, rebuild proc.o.
If types.h changes, rebuild proc.o.
Without .d files, Make might only know:
proc.o depends on proc.c
and miss the fact that changing a header should trigger a rebuild.
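To make the .d format concrete, here is a small Python sketch that splits such a dependency rule into its target and prerequisites. parse_dep_rule is a made-up helper; it assumes one rule per file and the backslash line continuations that GCC typically emits for long lines.

```python
def parse_dep_rule(text: str):
    """Split 'target: prereq prereq ...' into (target, [prereqs])."""
    joined = text.replace("\\\n", " ")       # undo backslash line continuations
    target, _, prereqs = joined.partition(":")
    return target.strip(), prereqs.split()

rule = (
    "kernel/proc.o: kernel/proc.c kernel/types.h \\\n"
    " kernel/param.h kernel/proc.h kernel/riscv.h\n"
)
target, prereqs = parse_dep_rule(rule)
print(target)
print(prereqs)
```

Every name in the prerequisite list is a file whose modification should trigger a rebuild of the target.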
Why the leading -?
This:
-include
is different from:
include
The leading - means:
Try to include these files.
If they do not exist, ignore the error.
That matters on the first build.
Before compilation, there may be no files like:
kernel/proc.d
kernel/vm.d
user/sh.d
So plain include could fail.
But -include says:
No dependency files yet? Fine. Continue.
After the first build, the .d files exist and Make uses them for smarter incremental rebuilds.
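The difference between the two directives can be modeled in Python. This is a deliberately simplified sketch: include_files is a hypothetical function, and real Make also tries to remake a missing included file before giving up, which is not modeled here.

```python
import glob
import os
import tempfile

def include_files(patterns, root, optional):
    """Return the files an include line would read (simplified model).

    optional=False models `include`: a pattern matching nothing is an error.
    optional=True models `-include`: such a pattern is silently skipped.
    """
    files = []
    for pattern in patterns:
        matches = sorted(glob.glob(os.path.join(root, pattern)))
        if not matches and not optional:
            raise FileNotFoundError(pattern)
        files.extend(matches)
    return files

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "kernel"))
    # First build: no .d files exist yet, and -include carries on.
    print(include_files(["kernel/*.d"], root, optional=True))
    # After compiling with -MD, kernel/proc.d exists and is picked up.
    open(os.path.join(root, "kernel", "proc.d"), "w").close()
    found = include_files(["kernel/*.d"], root, optional=True)
    print([os.path.basename(path) for path in found])
```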
Dependency file flow
kernel/proc.c
↓ compile with -MD
kernel/proc.o
kernel/proc.d
↓
Make includes proc.d next time
↓
Make knows which headers proc.o depends on
So this line is for correctness and convenience during repeated builds.
clean
clean:
	rm -f *.tex *.dvi *.idx *.aux *.log *.ind *.ilg \
	*/*.o */*.d */*.asm */*.sym \
	$K/kernel fs.img \
	mkfs/mkfs .gdbinit \
	$U/usys.S \
	$(UPROGS)
This defines the make clean target.
When you run:
make clean
Make runs the rm -f ... command and removes generated files.
It resets the tree close to a fresh source state.
rm -f
rm -f
means:
remove files if they exist;
do not error if they do not exist
So make clean can be run repeatedly without failing just because some files are already gone.
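The "do not error if they do not exist" behavior is what makes the target safe to run twice. A rough Python equivalent, with rm_f as an illustrative name, only models the missing-file case and not the other things rm -f does (such as never prompting):

```python
import os
import tempfile

def rm_f(paths):
    """Remove each path if present; ignore paths that are already gone."""
    removed = []
    for path in paths:
        try:
            os.remove(path)
            removed.append(path)
        except FileNotFoundError:
            pass  # already deleted: not an error, as with rm -f
    return removed

with tempfile.TemporaryDirectory() as root:
    artifact = os.path.join(root, "fs.img")
    open(artifact, "w").close()
    print(len(rm_f([artifact])))  # first "make clean" removes the file
    print(len(rm_f([artifact])))  # second run finds nothing and still succeeds
```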
Documentation artifacts
*.tex *.dvi *.idx *.aux *.log *.ind *.ilg
These are LaTeX/documentation build artifacts.
They are not central to the kernel itself.
They come from building docs/book-related material.
Object, dependency, assembly, and symbol files
*/*.o */*.d */*.asm */*.sym
This removes generated files in subdirectories.
| Pattern | Removes | Meaning |
|---|---|---|
| */*.o | object files | Compiled C/assembly outputs. |
| */*.d | dependency files | Header dependency files from -MD. |
| */*.asm | disassembly files | Generated by objdump -S. |
| */*.sym | symbol files | Generated by objdump -t. |
Examples removed:
kernel/proc.o
kernel/proc.d
kernel/kernel.asm
kernel/kernel.sym
user/sh.o
user/sh.d
user/sh.asm
user/sh.sym
These can all be regenerated.
Kernel and filesystem image
$K/kernel fs.img
Since:
K=kernel
this removes:
kernel/kernel
fs.img
Meaning:
| File | Meaning |
|---|---|
| kernel/kernel | Final linked xv6 kernel. |
| fs.img | xv6 filesystem disk image. |
After deleting these, the next make qemu must relink the kernel and recreate the disk image.
Host-side mkfs and GDB config
mkfs/mkfs .gdbinit
This removes:
| File | Meaning |
|---|---|
| mkfs/mkfs | Host executable that creates fs.img. |
| .gdbinit | Generated GDB configuration file. |
mkfs/mkfs is rebuilt from mkfs/mkfs.c.
.gdbinit is regenerated when using the GDB-related target.
Generated syscall assembly
$U/usys.S
Since:
U=user
this removes:
user/usys.S
This file is generated from:
user/usys.pl
So it is not the source of truth. It can be regenerated.
Flow:
user/usys.pl
↓
user/usys.S
↓
user/usys.o
make clean deletes the generated assembly so it can be recreated fresh.
User programs
$(UPROGS)
This removes all compiled xv6 user binaries listed in UPROGS.
Examples:
user/_cat
user/_echo
user/_forktest
user/_grep
user/_init
user/_kill
user/_ln
user/_ls
user/_mkdir
user/_rm
user/_sh
...
These are RISC-V executables built from user/*.c.
They are later packed into fs.img.
What remains after make clean?
The source files remain:
kernel/*.c
kernel/*.S
kernel/*.h
user/*.c
user/*.h
mkfs/mkfs.c
Makefile
The generated files disappear:
*.o
*.d
*.asm
*.sym
kernel/kernel
fs.img
mkfs/mkfs
user/usys.S
user/_*
So after:
make clean
the next build starts fresh.
Big picture
-include kernel/*.d user/*.d
means:
Use compiler-generated dependency files
so header changes trigger correct rebuilds.
clean
means:
Delete generated files:
object files,
dependency files,
debug listings,
kernel binary,
filesystem image,
mkfs executable,
generated syscall assembly,
compiled user programs.
Together:
.d files make incremental builds smarter.
clean removes all generated state when you want a fresh rebuild.
QEMU flags
This block is the QEMU run/debug section of the Makefile.
It handles:
normal boot: make qemu
debug boot: make qemu-gdb
GDB port setup: choose a unique port
QEMU options: define fake RISC-V hardware
version check: require a new enough QEMU
GDB port generation
# try to generate a unique GDB port
GDBPORT = $(shell expr `id -u` % 5000 + 25000)
This creates a semi-unique TCP port for GDB.
Breakdown:
id -u
gets your numeric user ID.
Then:
user_id % 5000 + 25000
creates a port somewhere between:
25000 and 29999
Why?
Because if many users on the same machine run xv6 debugging, they should not all try to use the same GDB port.
Example:
user id = 1001
1001 % 5000 + 25000 = 26001
So GDB would connect to port 26001.
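The same arithmetic as a short Python sketch (gdb_port is just an illustrative name). It also shows why the port is only semi-unique: user IDs that differ by a multiple of 5000 map to the same port.

```python
def gdb_port(uid: int) -> int:
    """Mirror the Makefile expression: uid % 5000 + 25000."""
    return uid % 5000 + 25000

print(gdb_port(1001))  # the worked example above
print(gdb_port(6001))  # a different user who lands on the same port
```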
QEMU GDB stub option
# QEMU's gdb stub command line changed in 0.11
QEMUGDB = $(shell if $(QEMU) -help | grep -q '^-gdb'; \
then echo "-gdb tcp::$(GDBPORT)"; \
else echo "-s -p $(GDBPORT)"; fi)
QEMU has a built-in GDB stub.
That means QEMU can pause the virtual CPU and let GDB connect to it.
This block checks which QEMU command-line syntax is supported.
If QEMU supports:
-gdb
then use:
-gdb tcp::<port>
Otherwise use the older style:
-s -p <port>
So this is compatibility logic.
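The shell test can be mirrored in Python as follows. This is a sketch of the selection logic only: qemu_gdb_flag is a hypothetical name, and the help text in the example is a made-up stand-in for real qemu -help output.

```python
def qemu_gdb_flag(help_text: str, port: int) -> str:
    """Pick the GDB-stub option the way grep -q '^-gdb' does."""
    supports_gdb = any(line.startswith("-gdb") for line in help_text.splitlines())
    if supports_gdb:
        return f"-gdb tcp::{port}"
    return f"-s -p {port}"

# Illustrative help output, not copied from a real QEMU build.
new_style_help = "-S    freeze CPU at startup\n-gdb dev    accept a gdb connection on dev\n"
old_style_help = "-S    freeze CPU at startup\n-s    wait for gdb connection\n"
print(qemu_gdb_flag(new_style_help, 26001))
print(qemu_gdb_flag(old_style_help, 26001))
```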
Conceptually:
Ask QEMU: do you support the modern -gdb option?
yes → use -gdb tcp::<port>
no → use old -s -p <port>
CPU count
ifndef CPUS
CPUS := 3
endif
This means:
If CPUS was not already set, use 3.
So by default xv6 runs with 3 simulated RISC-V CPUs/harts.
You can override it:
make qemu CPUS=1
or:
make qemu CPUS=4
Default:
CPUS = 3
This matters because xv6 is a multiprocessor kernel. Locks, scheduling, interrupts, and per-CPU state are real concerns.
QEMU machine options
QEMUOPTS = -machine virt -bios none -kernel $K/kernel -m 128M -smp $(CPUS) -nographic
This defines the core QEMU command-line options.
Expanded conceptually:
qemu-system-riscv64 \
-machine virt \
-bios none \
-kernel kernel/kernel \
-m 128M \
-smp 3 \
-nographic
| Option | Meaning |
|---|---|
| -machine virt | Use QEMU’s generic RISC-V virtual machine. |
| -bios none | Do not run firmware; jump directly to kernel. |
| -kernel kernel/kernel | Load xv6 kernel binary. |
| -m 128M | Give the virtual machine 128 MiB RAM. |
| -smp $(CPUS) | Simulate multiple CPUs/harts. Default is 3. |
| -nographic | No GUI; use terminal for serial console. |
So QEMU creates a fake machine like:
64-bit RISC-V virt machine
128 MiB RAM
3 harts by default
serial console in terminal
xv6 kernel loaded directly
Virtio compatibility option
QEMUOPTS += -global virtio-mmio.force-legacy=false
This tells QEMU:
Use non-legacy virtio MMIO behavior.
xv6 talks to the disk through a virtio block device. This option makes QEMU expose the device in the mode xv6 expects.
You do not need to deeply understand this at first. It is basically:
Make QEMU's virtio disk interface match xv6's driver.
Attach fs.img as a virtual disk
QEMUOPTS += -drive file=fs.img,if=none,format=raw,id=x0
This defines a raw disk backend.
Meaning:
Use fs.img as a disk image.
Do not automatically attach it to a bus yet.
Give it ID x0.
Breakdown:
| Part | Meaning |
|---|---|
| file=fs.img | Host file used as disk contents. |
| if=none | Create backend only; do not auto-create device. |
| format=raw | Treat file as raw disk bytes. |
| id=x0 | Name this drive backend x0. |
Then:
QEMUOPTS += -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0
attaches that backend as a virtio block device.
Meaning:
Create a virtio block device using drive x0.
Attach it to QEMU's virtio MMIO bus.
Together:
fs.img host file
↓
QEMU drive backend x0
↓
virtio block device
↓
xv6 virtio_disk driver
↓
xv6 filesystem
Inside xv6, this looks like a disk.
On your host, it is just the file:
fs.img
Normal QEMU target
qemu: check-qemu-version $K/kernel fs.img
	$(QEMU) $(QEMUOPTS)
This defines:
make qemu
Dependencies:
check-qemu-version
kernel/kernel
fs.img
So before QEMU starts, Make ensures:
QEMU version is new enough
kernel is built
filesystem image is built
Then it runs:
$(QEMU) $(QEMUOPTS)
Conceptually:
build kernel
build fs.img
start fake RISC-V machine
load kernel
attach fs.img
boot xv6
Generate .gdbinit
.gdbinit: .gdbinit.tmpl-riscv
	sed "s/:1234/:$(GDBPORT)/" < $^ > $@
This generates a local .gdbinit file from the template.
$^ means:
all dependencies
Here that is:
.gdbinit.tmpl-riscv
$@ means:
target being built
Here that is:
.gdbinit
So the command is roughly:
sed "s/:1234/:26001/" < .gdbinit.tmpl-riscv > .gdbinit
It replaces the default GDB port 1234 with your generated GDBPORT.
Why?
Because QEMU’s GDB stub listens on that port, and GDB needs to connect to the same port.
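The substitution itself is small enough to sketch in Python. render_gdbinit is a made-up name, and the template text below is only illustrative; the real .gdbinit.tmpl-riscv may contain different or additional lines. Like the sed expression (which has no g flag), the sketch replaces only the first match on each line.

```python
def render_gdbinit(template: str, port: int) -> str:
    """Replace the first ':1234' on each line, like sed 's/:1234/:PORT/'."""
    lines = template.splitlines(keepends=True)
    return "".join(line.replace(":1234", f":{port}", 1) for line in lines)

# Illustrative template text, assumed for this example.
template = "set confirm off\ntarget remote 127.0.0.1:1234\n"
print(render_gdbinit(template, 26001), end="")
```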
Debug QEMU target
qemu-gdb: $K/kernel .gdbinit fs.img
	@echo "*** Now run 'gdb' in another window." 1>&2
	$(QEMU) $(QEMUOPTS) -S $(QEMUGDB)
This defines:
make qemu-gdb
Dependencies:
kernel/kernel
.gdbinit
fs.img
Then it prints:
*** Now run 'gdb' in another window.
The @ suppresses echoing the command itself.
1>&2 sends the message to stderr.
Then it starts QEMU with:
-S
and the GDB stub option.
What does -S mean?
Start QEMU with the CPU stopped.
So QEMU loads the machine but does not begin executing instructions until GDB tells it to continue.
Debug flow:
Terminal 1:
make qemu-gdb
QEMU starts paused and waits for GDB.
Terminal 2:
gdb
GDB reads .gdbinit,
connects to QEMU,
sets breakpoints,
then you continue execution.
This lets you debug from the very first instruction.
Print GDB port
print-gdbport:
	@echo $(GDBPORT)
This target just prints the port.
Example:
make print-gdbport
Output:
26001
Useful if you need to manually connect GDB.
QEMU version detection
QEMU_VERSION := $(shell $(QEMU) --version | head -n 1 | sed -E 's/^QEMU emulator version ([0-9]+\.[0-9]+)\..*/\1/')
This extracts QEMU’s major/minor version.
Example QEMU output:
QEMU emulator version 8.2.1
The command extracts:
8.2
Breakdown:
$(QEMU) --version
prints QEMU version.
head -n 1
keeps first line.
sed -E 's/^QEMU emulator version ([0-9]+\.[0-9]+)\..*/\1/'
extracts the major.minor part.
So:
QEMU emulator version 8.2.1
becomes:
8.2
QEMU version check
check-qemu-version:
	@if [ "$(shell echo "$(QEMU_VERSION) >= $(MIN_QEMU_VERSION)" | bc)" -eq 0 ]; then \
		echo "ERROR: Need qemu version >= $(MIN_QEMU_VERSION)"; \
		exit 1; \
	fi
This target checks:
Is QEMU_VERSION >= MIN_QEMU_VERSION?
It uses bc, a command-line calculator, to compare versions.
If the result is 0, meaning false, it prints an error and exits.
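The extraction and the comparison can be sketched together in Python. The function names are hypothetical. version_ok mirrors what bc does, which is a decimal comparison: a version such as 7.10 would compare as smaller than 7.2. The tuple-based variant is not what the Makefile does; it is shown only to make that caveat visible.

```python
import re
from decimal import Decimal

def extract_major_minor(version_line: str) -> str:
    """Mirror the sed expression; non-matching input passes through unchanged."""
    match = re.match(r"^QEMU emulator version ([0-9]+\.[0-9]+)\..*", version_line)
    return match.group(1) if match else version_line

def version_ok(found: str, minimum: str) -> bool:
    """Decimal comparison, the same way bc evaluates 'found >= minimum'."""
    return Decimal(found) >= Decimal(minimum)

def version_ok_by_parts(found: str, minimum: str) -> bool:
    """Alternative (not used by the Makefile): compare (major, minor) as integers."""
    as_tuple = lambda text: tuple(int(part) for part in text.split("."))
    return as_tuple(found) >= as_tuple(minimum)

found = extract_major_minor("QEMU emulator version 8.2.1")
print(found, version_ok(found, "7.2"))
print(version_ok("7.10", "7.2"), version_ok_by_parts("7.10", "7.2"))
```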
Conceptually:
if QEMU is too old:
print error
stop build/run
else:
continue
MIN_QEMU_VERSION is defined elsewhere in the Makefile, commonly as something like:
MIN_QEMU_VERSION = 7.2
The point is:
xv6 expects certain QEMU behavior.
Old QEMU versions may not emulate the needed RISC-V/virtio features correctly.
Full normal run flow
make qemu
↓
check QEMU version
↓
build kernel/kernel
↓
build fs.img
↓
start qemu-system-riscv64
↓
QEMU creates RISC-V virt machine
↓
loads kernel/kernel
↓
attaches fs.img as virtio disk
↓
xv6 boots
Full debug run flow
make qemu-gdb
↓
build kernel/kernel
↓
build fs.img
↓
generate .gdbinit with unique port
↓
start QEMU paused
↓
QEMU opens GDB stub port
↓
run gdb in another terminal
↓
GDB connects to QEMU
↓
debug xv6 from early boot
Compact summary
| Makefile piece | Purpose |
|---|---|
| GDBPORT | Pick a semi-unique TCP port for GDB. |
| QEMUGDB | Choose correct QEMU GDB-stub syntax. |
| CPUS := 3 | Default to 3 simulated RISC-V harts. |
| QEMUOPTS | Define fake RISC-V machine hardware. |
| -machine virt | QEMU generic RISC-V board. |
| -bios none | Jump directly to xv6 kernel. |
| -kernel kernel/kernel | Load xv6 kernel. |
| -m 128M | Give xv6 128 MiB RAM. |
| -smp $(CPUS) | Use multiple CPUs/harts. |
| -nographic | Terminal-only console. |
| -drive file=fs.img... | Use fs.img as disk backend. |
| -device virtio-blk-device... | Attach disk as virtio block device. |
| qemu | Build and boot xv6 normally. |
| qemu-gdb | Build and boot xv6 paused for GDB. |
| .gdbinit | Generated GDB connection config. |
| check-qemu-version | Refuse to run with too-old QEMU. |
Hardware
Here’s the table of the RISC-V hardware platform QEMU creates from these options:
QEMUOPTS = -machine virt -bios none -kernel $K/kernel -m 128M -smp $(CPUS) -nographic
QEMUOPTS += -global virtio-mmio.force-legacy=false
QEMUOPTS += -drive file=fs.img,if=none,format=raw,id=x0
QEMUOPTS += -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0
| QEMU option | Hardware feature created | What xv6 sees | Why it matters |
|---|---|---|---|
| qemu-system-riscv64 | 64-bit RISC-V machine | A 64-bit RISC-V CPU platform | xv6-riscv is compiled for rv64gc, so it needs a 64-bit RISC-V CPU. |
| -machine virt | Generic QEMU RISC-V virtual board | A fake RISC-V computer with RAM, CPUs, UART, interrupt controller, timer, virtio devices | This is the “motherboard/platform” xv6 runs on. |
| -bios none | No firmware/BIOS layer | QEMU jumps directly to the kernel | xv6 skips firmware/bootloader complexity. |
| -kernel kernel/kernel | Kernel loaded into RAM | xv6 kernel placed at the expected boot address | This is why kernel.ld puts _entry at 0x80000000. |
| -m 128M | 128 MiB physical RAM | RAM from roughly 0x80000000 to 0x88000000 | xv6’s allocator manages this physical memory. |
| -smp $(CPUS) | Multiple RISC-V harts/cores | Default: 3 CPUs/harts | xv6 exercises locks, per-CPU state, and multiprocessor scheduling. |
| -nographic | Serial console only, no GUI | Console I/O goes through terminal | xv6 shell appears directly in your terminal. |
| -global virtio-mmio.force-legacy=false | Modern virtio MMIO mode | Virtio disk uses non-legacy MMIO behavior | Makes QEMU’s virtio device match xv6’s driver expectations. |
| -drive file=fs.img,if=none,format=raw,id=x0 | Raw disk backend | fs.img becomes the backing storage for a disk | This is the host file containing xv6’s filesystem. |
| -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 | Virtio block device | xv6 sees a virtual disk device | virtio_disk.c uses this to read/write filesystem blocks. |
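Putting the options from this table together, the final command line can be assembled as a list of arguments. The sketch below is a Python illustration of what $(QEMU) $(QEMUOPTS) expands to; qemu_argv and its parameter names are hypothetical, and the defaults are the values quoted in this document.

```python
def qemu_argv(kernel="kernel/kernel", cpus=3, disk="fs.img",
              qemu="qemu-system-riscv64"):
    """Build the argument list corresponding to $(QEMU) $(QEMUOPTS)."""
    return [
        qemu,
        "-machine", "virt",            # generic RISC-V virtual board
        "-bios", "none",               # no firmware, jump straight to the kernel
        "-kernel", kernel,
        "-m", "128M",
        "-smp", str(cpus),
        "-nographic",
        "-global", "virtio-mmio.force-legacy=false",
        "-drive", f"file={disk},if=none,format=raw,id=x0",
        "-device", "virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0",
    ]

print(" ".join(qemu_argv()))
print(" ".join(qemu_argv(cpus=1)))  # same effect as: make qemu CPUS=1
```

The drive and device entries are linked only by the shared id x0, which is why both must use the same name.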
Hardware xv6 effectively gets
| Hardware component | Present? | xv6 file mostly responsible |
|---|---|---|
| 64-bit RISC-V CPU | Yes | riscv.h, entry.S, start.c |
| Multiple harts/cores | Yes, default 3 | proc.c, spinlock.c, start.c |
| Physical RAM | Yes, 128 MiB | kalloc.c, vm.c, memlayout.h |
| UART serial device | Yes | uart.c, console.c |
| External interrupt controller | Yes, PLIC | plic.c, trap.c |
| Timer interrupts | Yes | start.c, trap.c |
| Virtio MMIO bus | Yes | virtio.h, virtio_disk.c |
| Virtio block disk | Yes | virtio_disk.c, bio.c, fs.c |
| Graphical display | No | Not used |
| Keyboard device | Not directly | Terminal input comes through UART |
| Firmware/BIOS | No | QEMU jumps directly to xv6 |
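The RAM range quoted earlier (0x80000000 to 0x88000000) follows directly from the -m 128M size. A quick Python check, where RAM_BASE and PAGE_SIZE are names chosen for this sketch and the 4096-byte page size is an assumption about xv6 not stated in this document:

```python
RAM_BASE = 0x80000000   # where RAM starts on the virt machine, per the text above
PAGE_SIZE = 4096        # assumed page size

def ram_range(size_mib: int, base: int = RAM_BASE):
    """Return (start, end) physical addresses for size_mib MiB of RAM."""
    end = base + size_mib * 1024 * 1024
    return base, end

start, end = ram_range(128)
print(hex(start), hex(end))
print((end - start) // PAGE_SIZE, "pages")
```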
Simplified memory/device map
Conceptually, QEMU gives xv6 a physical address space like this:
lower physical addresses
↓
device MMIO regions
UART
virtio disk
PLIC interrupt controller
timer-related registers
reserved/platform regions
0x80000000
↓
RAM starts here
xv6 kernel loaded here
kernel code/data/bss
free physical pages
user process memory
page tables
kernel stacks
0x88000000
end of RAM with -m 128M
Full mental model
QEMU creates:
RISC-V virt machine
├── 3 RISC-V harts
├── 128 MiB RAM starting at 0x80000000
├── UART serial console
├── PLIC interrupt controller
├── timer interrupt support
├── virtio MMIO bus
└── virtio block device backed by fs.img
Then:
QEMU loads kernel/kernel
jumps to xv6 _entry
xv6 initializes hardware
xv6 reads fs.img as disk
xv6 runs /init