Operating System Scheduling and Multiplexing

Multiplexing and Context Switching Mechanics

  • Multiplexing creates the illusion of private virtual CPUs by time-sharing physical hardware among multiple executing processes.
  • CPU switching executes in two primary scenarios: voluntarily when a process sleeps or waits for I/O, and involuntarily via periodic hardware timer interrupts.
  • The context switch sequence moves from a user process to a kernel thread, transfers to the current CPU’s scheduler thread, swaps to a new process’s kernel thread, and returns to user space.
  • Each CPU utilizes a dedicated scheduler thread with isolated saved registers and an independent stack.
    • Dedicated scheduler stacks prevent multiple cores from simultaneously executing on the same yielded process stack.
  • The swtch function performs the low-level context switch by saving and restoring the RISC-V callee-saved registers (ra, sp, and s0–s11), which are organized within a struct context.
  • swtch saves the current CPU state, loads the new state, and captures the ra (return address) register rather than the program counter directly.
  • Execution unconditionally resumes at the instruction pointed to by the restored ra register, utilizing the newly restored stack pointer (sp).
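
The register set swtch must preserve is small: only the callee-saved registers matter, because the RISC-V calling convention already allows a called function to clobber everything else. A hedged C sketch of such a context record (field names follow xv6's struct context, but this is an illustration, not the kernel source):

```c
#include <stdint.h>

// Sketch of a saved kernel-thread context in the style of xv6's
// struct context: ra, sp, and the callee-saved registers s0..s11.
// Restoring ra makes the eventual `ret` land wherever the saved
// thread last called swtch; restoring sp switches kernel stacks.
struct context {
    uint64_t ra;     // return address: where this thread resumes
    uint64_t sp;     // the thread's kernel stack pointer
    uint64_t s[12];  // callee-saved registers s0..s11
};
```

Saving ra rather than the program counter is what makes resumption look like an ordinary function return from swtch.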

Saving and restoring context establishes the fundamental hardware mechanism for swapping threads, which the scheduler directly leverages to cycle through available processes.

The Scheduler and Co-routines

  • The scheduler exists as a specialized per-CPU thread running a continuous loop that evaluates the process table for RUNNABLE processes.
  • Upon identifying a RUNNABLE process, the scheduler updates the per-CPU c->proc variable, modifies the process state to RUNNING, and invokes swtch to execute it.
  • A process that voluntarily yields the CPU acquires its own process lock (p->lock), sets its state (e.g., back to RUNNABLE in yield), and calls sched.
  • The sched function invokes swtch to save the active process context and shift execution directly to the CPU’s scheduler context (cpu->scheduler).
  • sched and scheduler function as co-routines, mutually transferring execution control back and forth to sustain continuous system operation.
  • The process lock (p->lock) must remain held across the swtch boundary, actively passing lock ownership from the yielding thread to the newly executing thread.
    • Holding p->lock during the switch protects vital invariants, guaranteeing another CPU cannot attempt to run the yielding process on its original stack before the context switch finalizes.
  • During initial process creation, allocproc explicitly sets the ra register to forkret, routing the first context switch to a function that releases p->lock prior to user space entry.
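
The scheduler's scan over the process table can be modeled in ordinary C. This toy (the table layout and the name pick_next are illustrative, not xv6 code) shows the round-robin choice of the next RUNNABLE entry:

```c
enum procstate { UNUSED, RUNNABLE, RUNNING, SLEEPING };

struct proc {
    enum procstate state;
    int pid;
};

// Round-robin scan: starting just after the slot that ran last,
// return the index of the next RUNNABLE process, or -1 if none.
// A real scheduler would hold each p->lock while inspecting state.
int pick_next(struct proc *ptab, int n, int last) {
    for (int i = 1; i <= n; i++) {
        int idx = (last + i) % n;
        if (ptab[idx].state == RUNNABLE)
            return idx;
    }
    return -1;
}
```

In xv6 the chosen entry is then marked RUNNING and entered via swtch; when that process later calls sched, control returns into this loop, which is what makes the pair behave as co-routines.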

The continuous execution of these scheduler co-routines requires precise, real-time identification of the currently running process and the hardware CPU it occupies.

Per-CPU State and Execution Context

  • Hardware CPUs are dynamically multiplexed across a shared process pool, requiring dedicated structures to safely track localized per-CPU states.
  • A struct cpu is allocated for each processor, recording the currently active process, the scheduler thread’s saved registers, and the nesting depth of acquired spinlocks.
  • The mycpu function extracts a pointer to the current struct cpu by reading the hardware tp register, which stores the RISC-V hartid (hardware thread ID).
  • The tp register is written in machine mode early in boot and carefully preserved across user-kernel transitions, since the kernel has no other way to learn which CPU it is running on.
  • Interrupts must be disabled prior to invoking mycpu or myproc.
    • Disabling interrupts prevents a timer interrupt from migrating the active thread to a different CPU mid-execution, which would critically invalidate the returned CPU reference.
  • The myproc function safely accesses the active struct proc pointer by disabling interrupts, calling mycpu, extracting c->proc, and immediately re-enabling interrupts.
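
The interrupt-disable discipline around mycpu can be modeled with a nesting counter, in the spirit of xv6's push_off/pop_off. The globals below stand in for real interrupt-control hardware; this is a host-side toy, not kernel code:

```c
static int intr_enabled = 1;  // stand-in for the hardware interrupt flag
static int noff = 0;          // nesting depth of push_off calls
static int intena = 0;        // was the flag set before the first push_off?

// Disable "interrupts", remembering the pre-existing state exactly once.
void push_off(void) {
    int old = intr_enabled;
    intr_enabled = 0;
    if (noff == 0)
        intena = old;
    noff++;
}

// Re-enable only when the outermost push_off is undone, and only
// if interrupts were enabled to begin with.
void pop_off(void) {
    noff--;
    if (noff == 0 && intena)
        intr_enabled = 1;
}
```

myproc follows exactly this shape: push_off, read c->proc from the current struct cpu, pop_off, return the pointer. The returned proc pointer stays valid even if the thread later migrates, unlike the struct cpu pointer itself.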

Accurately tracking active CPU contexts enables complex thread coordination, such as conditionally pausing execution until a specific system state is achieved.

Conditional Synchronization: Sleep and Wakeup

  • Sequence coordination relies heavily on sleep and wakeup primitives, allowing threads to voluntarily yield the CPU while waiting for asynchronous events.
  • The sleep(chan, lock) architecture mandates a condition lock to eliminate the lost wake-up problem, wherein a thread sleeps waiting for a signal that was already transmitted.
  • When executed, sleep immediately acquires the process lock (p->lock) and subsequently drops the condition lock passed by the caller.
  • The process state transitions to SLEEPING, the wait channel is logged, and sched is triggered to yield the processor.
  • The wakeup(chan) function sequentially acquires p->lock for each evaluated process and transitions any process blocked on the matching chan to RUNNABLE.
  • Because sleep acquires p->lock before releasing the condition lock, either the condition lock or p->lock is held at every instant from the caller’s condition check until the process is marked SLEEPING, so a concurrent wakeup cannot slip into that window.
  • All sleep invocations must be wrapped in a while loop that re-evaluates the condition after waking, because wakeup makes every process sleeping on the channel RUNNABLE and another waiter may consume the event first.
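
The channel-matching core of wakeup is simple enough to model directly. In this single-threaded toy, locking is elided and the names are illustrative only; wakeup scans the table and makes every process sleeping on the matching channel RUNNABLE:

```c
#include <stddef.h>

enum pstate { P_UNUSED, P_RUNNABLE, P_SLEEPING };

struct tproc {
    enum pstate state;
    void *chan;        // wait channel: any convenient kernel address
};

#define NPROC 8
static struct tproc ptable[NPROC];

// Mark every process sleeping on `chan` RUNNABLE; return how many woke.
// The real wakeup holds each p->lock while doing this.
int wakeup(void *chan) {
    int n = 0;
    for (int i = 0; i < NPROC; i++) {
        if (ptable[i].state == P_SLEEPING && ptable[i].chan == chan) {
            ptable[i].state = P_RUNNABLE;
            ptable[i].chan = NULL;
            n++;
        }
    }
    return n;
}
```

Because every sleeper on the channel becomes RUNNABLE at once, each one must re-check its condition after waking: the channel carries no payload, only the hint that something may have changed.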

These foundational sleep and wakeup primitives directly support higher-level synchronization structures, enabling coordinated data transfers between concurrently executing processes.

Producer-Consumer Synchronization: Pipes

  • Pipes implement producer-consumer synchronization by combining sleep, wakeup, and a dedicated struct pipe.
  • The pipe structure employs an internal lock (pi->lock) to serialize access to read/write byte counters and the underlying data buffer.
  • Writing processes incrementally append data to the buffer and execute wakeup on the pi->nread channel to notify waiting consumer threads.
  • If the buffer reaches capacity, the writer sleeps on the pi->nwrite channel, automatically releasing pi->lock to allow readers to consume the data.
  • Reading processes extract data, increment read tracking counters, and issue a wakeup on pi->nwrite to unblock suspended writers.
  • Using separate wait channels for readers (pi->nread) and writers (pi->nwrite) avoids needless wakeups when multiple processes share a single pipe.
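
The nread/nwrite bookkeeping behind a pipe can be sketched as a ring buffer. This host-side toy keeps xv6's convention of monotonically increasing counters indexed modulo the buffer size, but drops the locking and sleeping; the blocking cases simply return short counts:

```c
#define PIPESIZE 8

struct pipe {
    char data[PIPESIZE];
    unsigned nread;   // total bytes read so far
    unsigned nwrite;  // total bytes written so far
};

// Write up to n bytes; stop early when the buffer fills.
// (A real writer would sleep on the pi->nwrite channel and retry.)
int pipewrite(struct pipe *pi, const char *src, int n) {
    int i = 0;
    while (i < n && pi->nwrite - pi->nread < PIPESIZE)
        pi->data[pi->nwrite++ % PIPESIZE] = src[i++];
    return i;  // real code would then wakeup readers on pi->nread
}

// Read up to n bytes; stop early when the buffer empties.
int piperead(struct pipe *pi, char *dst, int n) {
    int i = 0;
    while (i < n && pi->nread < pi->nwrite)
        dst[i++] = pi->data[pi->nread++ % PIPESIZE];
    return i;  // real code would then wakeup writers on pi->nwrite
}
```

nwrite - nread is the count of unread bytes: zero means empty, PIPESIZE means full, and the unsigned subtraction stays correct even after the counters wrap.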

Similar conditional synchronization principles govern the lifecycle operations and inter-process coordination required during process termination.

Process Termination and Coordination: Wait, Exit, and Kill

  • Process termination relies on conditional synchronization to securely transition memory resources and state data between exiting children and blocking parents.
  • The exit function reparents the caller’s children to init, wakes its parent, marks the process ZOMBIE, and yields the CPU for the final time; the parent later frees the remaining resources in wait.
  • The wait function acquires the global wait_lock—acting as the explicit condition lock—and scans the process table to clean up ZOMBIE children and extract exit codes.
    • If no children have terminated, wait invokes sleep and yields, waiting for a child to issue a wakeup.
  • exit acquires both wait_lock and p->lock sequentially, guaranteeing a parent in wait cannot miss the wakeup signal or process the ZOMBIE state before the child yields the CPU.
  • The kill function does not unilaterally destroy a process; it sets the target’s p->killed flag and issues a wakeup if the target is currently sleeping.
  • System call loops verify the p->killed flag upon returning from sleep to trigger graceful termination, except during multi-step atomic I/O operations where aborting would corrupt system state.
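
wait's scan for dead children can be modeled the same way as the earlier table scans. In this toy (field and function names are illustrative), harvest_zombie finds a ZOMBIE child of the given parent, records its exit status, and frees the slot:

```c
enum wstate { W_UNUSED, W_RUNNABLE, W_ZOMBIE };

struct wproc {
    enum wstate state;
    int pid;
    int parent_pid;
    int xstate;        // exit status left behind by exit()
};

// Return the pid of one reaped ZOMBIE child of `parent`, storing its
// exit status in *status; return -1 if no dead child exists.  The
// real wait holds wait_lock across this scan and sleeps when it
// finds live children but no zombies.
int harvest_zombie(struct wproc *ptab, int n, int parent, int *status) {
    for (int i = 0; i < n; i++) {
        if (ptab[i].parent_pid == parent && ptab[i].state == W_ZOMBIE) {
            *status = ptab[i].xstate;
            int pid = ptab[i].pid;
            ptab[i].state = W_UNUSED;   // free the table slot
            return pid;
        }
    }
    return -1;
}
```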

Coordinating these process lifecycle events requires rigorous isolation of process state variables, governed by tightly controlled locking invariants.

Process Locking Mechanisms

  • The process lock (p->lock) actively shields high-level state structures from concurrent modifications across multiple hardware cores.
  • It protects essential fields including p->state, p->chan, p->killed, p->xstate, and p->pid.
  • p->lock prevents race conditions during process allocation, ensuring table slots are securely claimed during instantiation.
  • It guarantees that at most one CPU’s scheduler can move a given RUNNABLE process into execution.
  • The lock interacts continuously with wait_lock and condition locks to synchronize sleep, wakeup, and safe process destruction.
  • The p->parent field remains outside p->lock jurisdiction; it is synchronized entirely by the global wait_lock to serialize concurrent parent-child exits and prevent unhandled wakeups.
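
These invariants are commonly recorded as comments on the structure itself, in the style of xv6's proc.h. A trimmed, illustrative sketch (not the full declaration):

```c
struct spinlock { int locked; };

// Trimmed sketch of a process table entry, annotating which lock
// guards which fields, after the comments in xv6's proc.h.
struct proc {
    struct spinlock lock;

    // p->lock must be held when using these:
    int state;            // UNUSED, SLEEPING, RUNNABLE, RUNNING, ZOMBIE
    void *chan;           // if non-zero, sleeping on this channel
    int killed;           // if non-zero, the process has been killed
    int xstate;           // exit status, reported to wait
    int pid;

    // wait_lock must be held when using this:
    struct proc *parent;
};
```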

While the described scheduling and locking architectures are highly functional, they represent simplified baseline models compared to optimized production environments.

Advanced Scheduling and Real-World Equivalents

  • The baseline round-robin scheduling policy cycles processes uniformly, whereas advanced operating systems deploy priority-based scheduling to favor targeted execution classes.
  • Complex scheduling frameworks must actively manage priority inversion (a low-priority process holding a shared lock blocks a high-priority one) and lock convoys, where many processes queue up behind a contended lock.
  • Explicit sleep and wakeup models risk the “thundering herd” problem, wherein a single wakeup schedules multiple competing processes that subsequently block.
  • Modern synchronization architectures replace raw wait channels with explicit wait queues, rendezvous points, or condition variables that implement targeted signal and broadcast primitives.
  • Production systems heavily utilize explicit semaphores, maintaining an internal integer count to completely bypass lost wakeups and mitigate thundering herd behaviors.
  • Asynchronous process termination in advanced environments requires robust signal handling (e.g., returning EINTR) to safely unwind stacks and interrupt sleeping processes.
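
The reason a count defeats lost wakeups can be shown in a few lines. This toy semaphore is single-threaded, so the blocking path is replaced by a failure return; real implementations pair the count with a lock and a sleep queue. The point is that a V arriving before any P is remembered:

```c
struct sema {
    int count;   // number of outstanding, unconsumed V operations
};

// V: record the event.  With a bare wakeup(chan), this signal would
// vanish if no one were sleeping yet; the count preserves it.
void sema_v(struct sema *s) {
    s->count++;
}

// P: consume one recorded event if available.  A real semaphore's
// failure branch would sleep and retry instead of returning 0.
int sema_try_p(struct sema *s) {
    if (s->count > 0) {
        s->count--;
        return 1;
    }
    return 0;
}
```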