Kernel Synchronization
Fundamentals of Concurrent Access
Shared resources within the kernel require protection from concurrent access; otherwise, threads of execution can overwrite one another's changes or observe data in an inconsistent, intermediate state. In modern kernels, code can execute simultaneously on multiple processors due to symmetrical multiprocessing (SMP), or it can be involuntarily suspended and rescheduled due to kernel preemption. Without protection, code running on different processors or preempted threads can simultaneously access and manipulate the exact same shared data.
To understand how concurrent accesses cause damage to shared data, the concepts of critical regions and race conditions must be defined.
Critical Regions and Race Conditions
- Critical Region (Critical Section): A code path that accesses and manipulates shared data.
- Atomicity: The requirement that operations within a critical region complete without interruption, executing as if the entire region were one indivisible instruction.
- Race Condition: A system flaw where two or more threads of execution simultaneously execute within the same critical region, creating a “race” to modify the data.
- Synchronization: The programmatic prevention of unsafe concurrency and race conditions.
The Single Variable Increment
The simplest critical region is the increment of a single variable, such as $i++$. At the processor level, this translates to three distinct instructions:
- Get the current value of $i$ and copy it into a register.
- Add one to the value stored in the register.
- Write the new value of $i$ back to memory.
If two threads execute this operation simultaneously when $i = 7$, both may read 7, increment to 8, and write back 8. The desired outcome of 9 is lost because the operations interleaved. Hardware architectures solve this specific race condition by providing atomic instructions that read, increment, and write back a single variable in one indivisible step.
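The lost-update race and its atomic fix can be sketched in userspace C11; this is an illustration using `<stdatomic.h>` and pthreads, not kernel code, and the function names are invented for the example. With a plain `int` and `i++`, two threads would frequently lose increments; `atomic_fetch_add()` performs the read-modify-write as one indivisible step, so the count is exact:

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int counter;   /* shared data: the critical region is its increment */

struct arg { int iters; };

static void *worker(void *p)
{
    int iters = ((struct arg *)p)->iters;
    for (int n = 0; n < iters; n++)
        atomic_fetch_add(&counter, 1);  /* read, add, write back: one atomic step */
    return NULL;
}

/* Run two threads that each increment the shared counter `iters` times
 * and return the final value. With atomic_fetch_add the result is exact. */
int run_counter(int iters)
{
    pthread_t t1, t2;
    struct arg a = { iters };

    atomic_store(&counter, 0);
    pthread_create(&t1, NULL, worker, &a);
    pthread_create(&t2, NULL, worker, &a);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return atomic_load(&counter);
}
```

Replacing `atomic_fetch_add(&counter, 1)` with `counter++` on a plain `int` reintroduces the three-instruction race described above.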
While atomic instructions successfully protect single isolated variables, complex data structures spanning multiple operations require a broader synchronization mechanism.
Locking
A lock provides a mechanism to prevent access to a resource while another thread of execution is in the marked critical region.
- Advisory Nature: Locks are entirely voluntary programming constructs. Code must explicitly acquire the appropriate lock before accessing shared data and release it when finished.
- Mutual Exclusion: A lock can be held by at most one thread at a time.
- Wait States: If a thread attempts to acquire an unavailable lock, it must wait—either by busy-looping (spinning) or by sleeping—until the lock is released.
- Implementation: Locks are built upon architecture-specific atomic instructions, such as compare and exchange or test and set, ensuring that the act of acquiring the lock is itself free from race conditions.
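How a lock can be built on an atomic test-and-set instruction can be sketched as follows; this is a minimal userspace illustration using C11's `atomic_flag`, with invented names, and it omits the preemption-disabling that real kernel spin locks perform:

```c
#include <stdatomic.h>

/* Minimal spin lock built on an atomic test-and-set. */
struct spinlock {
    atomic_flag locked;
};

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static inline void spin_lock(struct spinlock *l)
{
    /* test-and-set atomically sets the flag and returns its previous value.
     * Spin until we observe it clear, meaning this thread set it and now
     * owns the lock; the acquire ordering fences the critical region. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;  /* busy-wait (spin) */
}

static inline void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Because the test and the set happen in one indivisible instruction, two threads can never both observe the lock as free and both acquire it.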
Understanding when and where to deploy locks requires identifying the specific sources of concurrency that threaten kernel stability.
Causes of Concurrency
Concurrency manifests in two forms: pseudo-concurrency (tasks interleaving on a single processor) and true concurrency (tasks executing simultaneously on multiple processors). The kernel must synchronize against several distinct sources of both:
- Interrupts: Can occur asynchronously at almost any time, interrupting the currently executing code.
- Softirqs and tasklets: Can be raised or scheduled dynamically, interrupting currently executing code.
- Kernel preemption: The scheduler can preempt one task in the kernel to execute another.
- Sleeping and user-space synchronization: A task in the kernel can block (sleep), invoking the scheduler and transferring execution to a new process.
- Symmetrical Multiprocessing (SMP): Two or more processors can execute kernel code at the exact same time.
Kernel code is classified by its resilience against these concurrency sources:
- Interrupt-safe: Safe from concurrent access by an interrupt handler.
- SMP-safe: Safe from true concurrency on SMP machines.
- Preempt-safe: Safe from pseudo-concurrency caused by kernel preemption.
Knowing the exact causes of concurrency dictates which data actually requires locking and which data is naturally isolated.
Knowing What to Protect
Synchronization efforts must protect data, not code. Any data that can be accessed concurrently almost certainly needs protection.
- Local automatic variables: Do not require locking. They exist solely on the stack of the executing thread and cannot be accessed concurrently.
- Task-specific data: Data accessed by only a specific task does not require locking because a single process can only execute on one processor at a time.
- Global kernel data: Most global data structures require locks. If another thread of execution can see the data, it must be locked.
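The distinction between stack-local data and shared global data can be illustrated with a short userspace sketch (pthreads used in place of kernel locks; all names are invented for the example):

```c
#include <pthread.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_count;   /* global: other threads can see it, so it needs the lock */

long bump_shared(void)
{
    long local = 0;         /* automatic variable: lives on this thread's stack,
                             * cannot be accessed concurrently, needs no lock */
    local++;

    pthread_mutex_lock(&count_lock);
    shared_count += local;  /* critical region: global data under count_lock */
    long now = shared_count;
    pthread_mutex_unlock(&count_lock);

    return now;
}
```

Only the lines touching `shared_count` form the critical region; locking around `local` would add overhead while protecting nothing.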
Configuration Options
The Linux kernel tailors its locking mechanisms at compile time based on system capabilities:
- CONFIG_SMP: Enables symmetrical multiprocessing support. If unset (uniprocessor machines), the overhead of physical spin locks is compiled away, leaving only markers that disable kernel preemption.
- CONFIG_PREEMPT: Enables kernel preemption.
- Pessimistic Design: Kernel code should always be written for the most pessimistic scenario: an SMP system with kernel preemption enabled.
Designing a robust locking scheme must account for scenarios where acquiring the locks themselves halts system execution, a fatal condition known as a deadlock.
Deadlocks
A deadlock is a condition involving one or more threads and one or more resources, where each thread waits for a resource held by another, preventing any thread from making progress.
- Self-Deadlock: A thread attempts to acquire a lock it already holds. Because Linux does not provide recursive locks, the thread waits indefinitely for itself to release the lock.
- The Deadly Embrace (ABBA Deadlock): Thread 1 holds Lock A and waits for Lock B. Simultaneously, Thread 2 holds Lock B and waits for Lock A. Neither will ever release their initial lock.
Deadlock Prevention Rules
- Implement lock ordering: Nested locks must always be obtained in the exact same sequence across all functions. This unconditionally prevents the ABBA deadlock.
- Prevent starvation: Ensure critical regions always finish.
- Do not double-acquire: Never attempt to acquire the exact same lock twice.
- Design for simplicity: Complex locking schemes obscure data dependencies and invite deadlocks.
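Lock ordering, the rule that defeats the ABBA deadlock, can be sketched with two pthread mutexes; the transfer scenario and all names are invented for illustration. Every path that needs both locks takes `lock_a` before `lock_b`, so two threads can never hold one lock each while waiting for the other:

```c
#include <pthread.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static long balance_a = 100, balance_b = 100;

/* Both directions honor the same order: A, then B. */
void transfer_a_to_b(long amount)
{
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);
    balance_a -= amount;
    balance_b += amount;
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
}

void transfer_b_to_a(long amount)
{
    pthread_mutex_lock(&lock_a);   /* still A first, even though B is the source;
                                    * taking B first here would create the ABBA race */
    pthread_mutex_lock(&lock_b);
    balance_b -= amount;
    balance_a += amount;
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
}

long total(void)
{
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);
    long t = balance_a + balance_b;
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return t;
}
```

Note that the release order does not matter for correctness; only the acquisition order must be consistent across all code paths.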
While avoiding deadlocks guarantees functional correctness, the efficiency of the locking scheme is dictated by lock contention and scalability.
Contention and Scalability
Lock contention describes a state where a lock is currently held and another thread is waiting to acquire it. High contention acts as a system bottleneck and occurs when a lock is frequently requested, held for long durations, or both.
Scalability measures how well a system expands to support an increase in processors, processes, or memory.
- Lock Granularity: Describes the amount of data a single lock protects.
- Coarse-Grained Locking: A single lock protects a massive amount of data (e.g., an entire subsystem). This scales poorly on large SMP machines due to high contention.
- Fine-Grained Locking: A lock protects a tiny amount of data (e.g., a single node in a linked list). This scales excellently on large SMP machines but introduces wasteful overhead and complexity on uniprocessor or dual-processor systems where contention is naturally low.
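Fine-grained locking is often realized as per-bucket locks on a hash table: threads that hash to different buckets never contend. The sketch below is a userspace illustration with invented names; the bucket count, the modulo hash, and the per-bucket counter standing in for a linked list are all arbitrary choices:

```c
#include <pthread.h>

#define NBUCKETS 16

struct bucket {
    pthread_mutex_t lock;   /* fine-grained: one lock per bucket, not per table */
    long count;             /* stand-in for a per-bucket linked list of entries */
};

static struct bucket table[NBUCKETS];

void table_init(void)
{
    for (int i = 0; i < NBUCKETS; i++) {
        pthread_mutex_init(&table[i].lock, NULL);
        table[i].count = 0;
    }
}

void table_insert(unsigned long key)
{
    struct bucket *b = &table[key % NBUCKETS];

    pthread_mutex_lock(&b->lock);    /* serializes only this bucket */
    b->count++;
    pthread_mutex_unlock(&b->lock);
}

long bucket_count(unsigned long key)
{
    struct bucket *b = &table[key % NBUCKETS];
    long n;

    pthread_mutex_lock(&b->lock);
    n = b->count;
    pthread_mutex_unlock(&b->lock);
    return n;
}
```

The coarse-grained alternative is a single `table_lock` around every operation: simpler and cheaper on a uniprocessor, but a bottleneck when many processors hammer the table at once.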
Proper synchronization demands identifying critical regions, avoiding deadlocks, and striking the right balance between coarse and fine lock granularity so that locking scales with the machine without imposing needless overhead.