Kernel Synchronization Methods

Atomic Operations

Atomic operations execute instructions without interruption, ensuring that the operation completes in its entirety or not at all. They form the foundation for all other synchronization methods.

  • Atomic Integer Operations:
    • Operate on the atomic_t (32-bit) and atomic64_t (64-bit) data types.
    • Using specialized types prevents the compiler from erroneously optimizing access and ensures atomic functions are not used on standard C integers.
    • Commonly used to implement lightweight counters and perform test-and-set operations, such as atomic_dec_and_test() (see the sketch after this list).
    • Provide atomicity (uninterrupted execution) but do not inherently provide ordering (the relative sequence of multiple instructions).
  • Atomic Bitwise Operations:
    • Operate on generic memory addresses via pointers, specifying a target bit number.
    • Guarantee that competing operations serialize: if a set and a clear of the same bit race, they complete in sequence and the intermediate state is correctly realized, so neither update is lost.
    • Non-atomic variants (prefixed with __) are available for environments already protected by locks.
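
A minimal sketch of both families in use; the names refcount and flags_word are hypothetical, while the calls are the kernel's standard atomic interfaces:

    #include <linux/atomic.h>
    #include <linux/bitops.h>

    static atomic_t refcount = ATOMIC_INIT(0);   /* 32-bit atomic counter */
    static unsigned long flags_word;             /* word for atomic bit operations */

    static void atomic_example(void)
    {
            atomic_inc(&refcount);               /* atomically add one */

            /* Atomically subtract one and test: true if the result is zero. */
            if (atomic_dec_and_test(&refcount))
                    set_bit(0, &flags_word);     /* atomically set bit zero */

            clear_bit(0, &flags_word);           /* atomically clear bit zero */
    }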

While atomic operations handle simple variables seamlessly, complex data structures spanning multiple operations require broader synchronization mechanisms like spin locks.

Spin Locks

A spin lock is a lightweight locking mechanism that can be held by at most one thread of execution.

  • Busy Waiting: If a thread attempts to acquire a contended lock, it loops (spins) while repeatedly checking for lock availability.
  • Duration Constraints: Because spinning wastes processor cycles, spin locks should be held only briefly, as a rule of thumb for less than the duration of two context switches (the approximate overhead a sleeping lock would incur).
  • Non-Recursive: Linux spin locks are not recursive; attempting to acquire a lock already held by the same thread results in a self-deadlock.
  • Interrupt Context Usage:
    • Spin locks are safe for interrupt handlers because they do not sleep.
    • When a lock is shared with an interrupt handler, the local processor’s interrupts must be disabled before acquiring it (via spin_lock_irqsave(), which also saves their prior state); otherwise an interrupt could interrupt the lock holder and spin forever on the already-held lock, deadlocking the processor (see the sketch after this list).
  • Bottom Half Safety: spin_lock_bh() disables all bottom halves while acquiring the lock to protect data shared between process context and bottom halves.
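
A minimal sketch of both acquisition styles, assuming a hypothetical lock my_lock protecting a shared counter:

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(my_lock);
    static unsigned long shared_count;           /* data the lock protects */

    /* Process-context path: a plain lock/unlock pair suffices. */
    static void process_path(void)
    {
            spin_lock(&my_lock);
            shared_count++;
            spin_unlock(&my_lock);
    }

    /* Path shared with an interrupt handler: save and disable local
     * interrupts so the handler cannot interrupt the lock holder. */
    static void irq_safe_path(void)
    {
            unsigned long flags;

            spin_lock_irqsave(&my_lock, flags);
            shared_count++;
            spin_unlock_irqrestore(&my_lock, flags);
    }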

Spin locks strictly limit access to a single thread, but certain workloads naturally separate into read-only and write-only patterns that can be optimized with specialized variants.

Reader-Writer Spin Locks

Reader-writer spin locks split lock usage into shared and exclusive access patterns.

  • Concurrent Readers: One or more readers can concurrently hold the reader variant of the lock (read_lock()).
  • Exclusive Writers: The writer variant (write_lock()) can be held by at most one writer, requiring zero concurrent readers.
  • No Upgrades: A read lock cannot be upgraded to a write lock; attempting to acquire a write lock while holding a read lock results in a deadlock.
  • Writer Starvation: The implementation favors readers over writers. A continuous stream of readers will starve pending writers until all readers release the lock.
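
A minimal sketch, assuming a hypothetical my_rwlock guarding a single shared value:

    #include <linux/spinlock.h>

    static DEFINE_RWLOCK(my_rwlock);
    static int shared_value;                     /* data the lock protects */

    static int reader(void)
    {
            int v;

            read_lock(&my_rwlock);               /* many readers may hold this at once */
            v = shared_value;
            read_unlock(&my_rwlock);
            return v;
    }

    static void writer(int v)
    {
            write_lock(&my_rwlock);              /* exclusive: no readers, no other writers */
            shared_value = v;
            write_unlock(&my_rwlock);
    }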

Spin locks, including their reader-writer variants, waste processor cycles while spinning, making them unsuitable for code that must sleep and for locks held over long durations.

Semaphores

Semaphores are sleeping locks that place contending tasks onto a wait queue and yield the processor.

  • Performance Characteristics: They provide better processor utilization than spin locks during contention but incur significant context switching overhead.
  • Usage Constraints:
    • Ideal for locks held for long periods.
    • Cannot be used in interrupt context because they invoke the scheduler.
    • Do not disable kernel preemption, meaning code holding a semaphore can be preempted.
  • Usage Counts:
    • Binary Semaphore (Mutex): Initialized with a count of one, enforcing strict mutual exclusion.
    • Counting Semaphore: Initialized with a count greater than one, permitting multiple concurrent threads up to the specified limit.
  • Operations:
    • down_interruptible(): Decrements the count to acquire the lock; enters interruptible sleep if contended, allowing the task to wake upon receiving a signal.
    • up(): Increments the count to release the lock and wakes a waiting task.
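
A minimal sketch of binary-semaphore usage; my_sem, my_setup(), and do_work() are hypothetical names:

    #include <linux/semaphore.h>
    #include <linux/errno.h>

    static struct semaphore my_sem;

    static void my_setup(void)
    {
            sema_init(&my_sem, 1);               /* count of one: a binary semaphore */
    }

    static int do_work(void)
    {
            /* Sleep until the semaphore is available; a delivered signal
             * wakes the task early, in which case we back out. */
            if (down_interruptible(&my_sem))
                    return -EINTR;

            /* ... critical region: code here is allowed to sleep ... */

            up(&my_sem);                         /* increment the count, wake a waiter */
            return 0;
    }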

Semaphores support complex counting and sleeping requirements, but environments demanding strict read/write boundaries benefit from targeted reader-writer semaphores.

Reader-Writer Semaphores

Reader-writer semaphores (struct rw_semaphore) provide sleeping locks divided by access patterns.

  • Access Rules: Multiple readers can hold the lock concurrently, but writers demand absolute exclusivity.
  • Uninterruptible Sleep: All reader-writer semaphores use uninterruptible sleep mechanisms.
  • Downgrading: The downgrade_write() function allows an acquired write lock to be atomically converted to a read lock.
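
A minimal sketch of write acquisition followed by a downgrade; my_rwsem is a hypothetical name:

    #include <linux/rwsem.h>

    static DECLARE_RWSEM(my_rwsem);

    static void update_then_read(void)
    {
            down_write(&my_rwsem);               /* exclusive writer access */
            /* ... modify the protected data ... */

            /* Atomically convert the write lock into a read lock so
             * other readers may proceed while this task keeps reading. */
            downgrade_write(&my_rwsem);
            /* ... read-only access continues ... */
            up_read(&my_rwsem);
    }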

While semaphores are highly flexible, their generic nature lacks strict constraint enforcement, leading the kernel to introduce a simpler, regimented sleeping lock.

Mutexes

The kernel struct mutex provides a simpler, more efficient sleeping lock dedicated exclusively to mutual exclusion.

  • Strict Constraints:
    • The usage count is rigidly fixed to one.
    • The context that locked the mutex is the only context permitted to unlock it.
    • Recursive locks and unlocks are forbidden.
    • Tasks cannot exit while holding a mutex.
    • Unusable in interrupt handlers or bottom halves, even via mutex_trylock().
  • Debugging: Enabling CONFIG_DEBUG_MUTEXES allows the kernel to programmatically verify and enforce these constraints.
  • Selection: Mutexes are strongly preferred over semaphores for all mutual exclusion needs unless a specific constraint prevents their use.
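
A minimal sketch; my_mutex is a hypothetical name:

    #include <linux/mutex.h>

    static DEFINE_MUTEX(my_mutex);

    static void update(void)
    {
            mutex_lock(&my_mutex);               /* sleeps if contended */
            /* ... critical region; the same context must unlock ... */
            mutex_unlock(&my_mutex);
    }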

Mutexes provide robust mutual exclusion for shared data, but occasionally tasks merely need to signal to one another that a specific event has completed.

Completion Variables

Completion variables (struct completion) synchronize two tasks when one task must wait for another to complete an action.

  • Mechanism: One task calls wait_for_completion() to block until the event occurs.
  • Signaling: The executing task calls complete() to wake a task waiting on the completion variable (complete_all() wakes all waiters).
  • Application: Frequently used dynamically within data structures to pause execution until structure initialization finishes.
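
A minimal sketch of the two sides; setup_done is a hypothetical name:

    #include <linux/completion.h>

    static DECLARE_COMPLETION(setup_done);

    static void waiter(void)
    {
            /* Block until another task signals the completion. */
            wait_for_completion(&setup_done);
            /* ... the shared structure is now fully initialized ... */
    }

    static void initializer(void)
    {
            /* ... finish initializing the shared structure ... */
            complete(&setup_done);               /* wake a waiting task */
    }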

Modern synchronization relies on precise, localized locks like mutexes and completions, but early multiprocessor support relied on a single, massive lock for the entire kernel.

BKL: The Big Kernel Lock

The Big Kernel Lock (BKL) is a deprecated, global spin lock introduced to ease Linux’s transition to fine-grained locking for symmetric multiprocessing (SMP).

  • Properties:
    • Allows tasks to safely sleep while held; the lock is transparently dropped upon unscheduling and reacquired upon rescheduling.
    • Operates as a recursive lock.
    • Restricted exclusively to process context.
  • Deprecation: Due to severe scalability bottlenecks, new code must never use the BKL.

As coarse locks like the BKL were phased out, highly specialized locks emerged for specific, high-contention access patterns, such as the sequential lock.

Sequential Locks

The sequential lock (seqlock_t) provides lightweight data protection utilizing a sequence counter.

  • Write Operations: Obtaining the write lock increments the sequence counter, making it odd. Releasing the lock increments it again, making it even.
  • Read Operations: Readers read the sequence number before and after accessing data. If the numbers match and are even, the read was uninterrupted by a write.
  • Writer Priority: Seq locks strictly favor writers; readers never block pending writers, but writers force readers to repeat their read loops until no writers hold the lock.
  • Ideal Workloads: Used when data has many readers, few writers, cannot be made atomic, and writers must never be starved (e.g., the jiffies_64 uptime variable).
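
A minimal sketch of the write path and the read loop, assuming a hypothetical seqlock guarding a 64-bit datum:

    #include <linux/seqlock.h>
    #include <linux/types.h>

    static DEFINE_SEQLOCK(my_seqlock);
    static u64 my_data;                          /* hypothetical 64-bit datum */

    static void writer(u64 v)
    {
            write_seqlock(&my_seqlock);          /* sequence counter becomes odd */
            my_data = v;
            write_sequnlock(&my_seqlock);        /* counter becomes even again */
    }

    static u64 reader(void)
    {
            unsigned int seq;
            u64 v;

            do {
                    seq = read_seqbegin(&my_seqlock);
                    v = my_data;
            } while (read_seqretry(&my_seqlock, seq));  /* retry if a write raced */

            return v;
    }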

Explicit locking protects data across multiple processors, but when data is strictly local to a single processor, it only requires protection from local task preemption.

Preemption Disabling

Kernel preemption allows a task to be involuntarily paused so a higher-priority task can run, creating pseudo-concurrency issues on uniprocessor systems.

  • Preemption Counters: The kernel tracks held locks and disabled regions via a preemption counter; preemption is disabled if the value is greater than zero.
  • Disabling/Enabling: preempt_disable() and preempt_enable() establish non-preemptive regions without requiring spin locks.
  • Per-Processor Data: get_cpu() safely retrieves the current processor number while simultaneously disabling kernel preemption to ensure the task is not moved to another processor mid-operation.
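
A minimal sketch of both idioms, assuming a hypothetical per-CPU counter:

    #include <linux/percpu.h>
    #include <linux/preempt.h>
    #include <linux/smp.h>

    static DEFINE_PER_CPU(int, my_counter);      /* hypothetical per-CPU datum */

    static void nonpreemptible_region(void)
    {
            preempt_disable();                   /* increments the preemption counter */
            /* ... manipulate data that must not be touched by a preempting task ... */
            preempt_enable();                    /* decrements it; preemption possible again */
    }

    static void touch_per_cpu_data(void)
    {
            int cpu;

            cpu = get_cpu();                     /* disables preemption, returns CPU number */
            per_cpu(my_counter, cpu)++;          /* safe: the task cannot migrate here */
            put_cpu();                           /* re-enables preemption */
    }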

Preventing preemption and securing concurrent access ensures logical data safety, but hardware and compilers can still physically reorder instructions, violating expected memory operation sequences.

Ordering and Barriers

Compilers and modern processors optimize pipelines by dispatching and committing memory loads (reads) and stores (writes) out of order.

  • Memory Barriers: Enforce strict hardware instruction ordering.
    • rmb(): Read memory barrier; prevents loads from being reordered across the barrier.
    • wmb(): Write memory barrier; prevents stores from being reordered across the barrier.
    • mb(): General memory barrier; prevents both loads and stores from reordering across the barrier.
    • read_barrier_depends(): Enforces a read barrier strictly for loads that possess data dependencies.
  • Compiler Barriers: barrier() prevents the compiler from optimizing or rearranging loads and stores across the call, though it does not dictate hardware processor behavior.
  • SMP Variants: Macros like smp_mb() resolve to standard memory barriers on SMP kernels but degrade to simple compiler barriers on uniprocessor kernels.
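
A minimal sketch of the classic flag-and-data pattern these primitives exist to fix; shared_data and data_ready are hypothetical:

    #include <linux/compiler.h>                  /* barrier() */
    #include <asm/barrier.h>                     /* mb(), rmb(), wmb(), smp_* variants */

    static int shared_data;
    static int data_ready;

    /* Producer: the store to shared_data must be visible before the flag. */
    static void producer(void)
    {
            shared_data = 42;
            smp_wmb();                           /* order the data store before the flag store */
            data_ready = 1;
    }

    /* Consumer: the flag load must complete before the data load. */
    static int consumer(void)
    {
            while (!data_ready)
                    barrier();                   /* keep the compiler rereading the flag */
            smp_rmb();                           /* order the flag load before the data load */
            return shared_data;
    }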