Kernel Data Structures

Built-in generic data structures provide standardized, optimized primitives for organizing data within the operating system. Standardizing these structures prevents duplicated effort and code fragmentation.

Linked Lists

A linked list enables the storage and manipulation of a variable, dynamically created number of elements, known as nodes. Elements do not occupy contiguous memory regions; instead, each node contains a pointer to the next element.

  • List Classifications

    • Singly linked list: Nodes contain a single pointer to the next element.
    • Doubly linked list: Nodes contain pointers to both the next and previous elements.
    • Circular linked list: The final node’s next pointer points back to the first node, forming a ring.
  • Linux Implementation (struct list_head)

    • Instead of embedding data within a linked list node, Linux embeds the linked list node within the data structure.
    • The list_head structure contains only two pointers: next and prev.
    • The container_of() macro exploits the fact that a member’s offset within its structure is fixed at compile time, allowing it to calculate and return a pointer to the parent structure containing a given list_head node.
    • The list_entry() macro wraps container_of() specifically to retrieve the data structure embedding the list_head.
  • List Initialization

    • Dynamic: INIT_LIST_HEAD initializes a dynamically allocated list at runtime.
    • Static: LIST_HEAD_INIT initializes a statically created list at compile time.
    • Head Pointers: The LIST_HEAD macro defines and initializes a canonical head pointer for the list.
  • List Manipulation

    • All core list manipulation functions execute in O(1) time, regardless of list size.
    • list_add(): Inserts a new node immediately after the specified head node (useful for stacks).
    • list_add_tail(): Inserts a new node immediately before the head node (useful for queues).
    • list_del(): Removes a node from the list by updating the next and prev pointers of adjacent nodes, without freeing the node’s memory.
    • list_del_init(): Removes a node and reinitializes its list_head structure for reuse.
    • list_move() / list_move_tail(): Removes a node from one list and adds it to the front or tail of another.
    • list_splice() / list_splice_init(): Merges two unconnected lists together.
  • List Traversal

    • Iterating over an entire list is an O(n) operation.
    • list_for_each_entry(): The standard macro for iterating over a list, providing a pointer to the parent data structure on each iteration.
    • list_for_each_entry_reverse(): Iterates over the list backward by following prev pointers, useful for last-in/first-out (LIFO) semantics.
    • list_for_each_entry_safe(): Utilizes temporary storage for the next pointer, making it safe to remove the current entry during iteration.

While linked lists excel at arbitrary traversal and dynamic resizing, managing data flow strictly between producers and consumers requires a constrained, sequential data structure.

Queues

Queues implement first-in, first-out (FIFO) semantics, serving as the standard mechanism for the producer/consumer programming pattern. Producers push data onto the queue, and consumers pull data off in the exact order it was enqueued.

  • Linux Implementation (kfifo)

    • The kfifo object manages the queue using two offsets: in and out.
    • The in offset tracks the location for the next enqueue operation; the out offset tracks the location for the next dequeue operation.
    • An empty queue occurs when the in offset equals the out offset.
    • A full queue occurs when the in offset reaches the total length of the queue buffer.
  • Queue Initialization

    • Dynamic: kfifo_alloc() allocates and initializes a queue of a specified size using a provided memory allocation mask (e.g., GFP_KERNEL). The size must be a power of two.
    • Manual Buffer: kfifo_init() initializes a queue using a pre-allocated buffer.
    • Static: DECLARE_KFIFO() and INIT_KFIFO() create and initialize a static queue.
  • Queue Operations

    • kfifo_in(): Copies data into the queue starting at the in offset and increments the offset by the amount of data added.
    • kfifo_out(): Copies data out of the queue starting at the out offset and increments the offset; the dequeued data is no longer accessible in the queue.
    • kfifo_out_peek(): Reads data from the queue without incrementing the out offset, leaving the data available for subsequent dequeue operations.
    • kfifo_size(): Returns the total buffer size in bytes.
    • kfifo_len(): Returns the number of bytes currently enqueued.
    • kfifo_avail(): Returns the number of bytes available to write.
    • kfifo_is_empty() / kfifo_is_full(): Boolean checks for queue state.
    • kfifo_reset() / kfifo_free(): Empties the queue contents or completely deallocates the dynamic kfifo.

Queues efficiently handle sequential data streams, but mapping unique identifiers to specific data structures necessitates a different, associative approach.

Maps

A map, or associative array, is a collection of unique keys where each key maps to a specific value. Maps must support adding, removing, and looking up values based on keys.

  • Linux Implementation (idr)

    • The idr data structure is a specialized map used exclusively for mapping unique identification numbers (UIDs) to pointers.
    • It is initialized using idr_init().
  • UID Allocation

    • UID allocation is split into two steps to allow memory resizing without holding a lock.
    • Step 1: idr_pre_get() resizes the backing tree if necessary to fulfill a new UID allocation. It returns 1 on success and 0 on error.
    • Step 2: idr_get_new() actually allocates the new UID and maps it to the provided pointer.
    • idr_get_new_above() functions exactly like idr_get_new() but guarantees the returned UID is equal to or greater than a specified minimum value, which can be used to ensure that a UID is never reused over a system’s uptime.
  • Map Operations

    • idr_find(): Returns the pointer mapped to the provided UID, or NULL if not found. Mapping NULL to a UID is discouraged as it prevents distinguishing between a successful NULL lookup and a failure.
    • idr_remove(): Removes a specific UID from the map.
    • idr_destroy(): Deallocates unused memory associated with the idr, but does not free memory currently in use by allocated UIDs.
    • idr_remove_all(): Forces the removal of all UIDs, which should be called before idr_destroy() to prevent memory leaks.

While the idr structure manages exact UID mappings, storing large datasets that require rapid, ordered lookups calls for hierarchical, tree-based data structures.

Binary Trees

A tree is an acyclic, connected, directed graph where nodes have zero or more outgoing edges and at most one incoming edge. A binary tree restricts nodes to a maximum of two outgoing edges (left and right children).

  • Binary Search Trees (BST)

    • A BST imposes a strict ordering: the left subtree contains only nodes smaller than the root, and the right subtree contains only nodes larger than the root.
    • All subtrees must also be valid BSTs.
    • This ordering ensures logarithmic time searching and linear time in-order traversal.
  • Self-Balancing Binary Search Trees (Red-Black Trees)

    • A balanced BST ensures the depth of all leaves differs by at most one.
    • A self-balancing BST actively maintains a semi-balanced state during operations.
    • A Red-Black Tree enforces six properties to remain semi-balanced:
      1. Nodes are either red or black.
      2. Leaf nodes are black.
      3. Leaf nodes contain no data.
      4. Non-leaf nodes have two children.
      5. Red nodes must have black children.
      6. Every path from a node to its leaves contains the identical number of black nodes.
    • These properties guarantee the deepest leaf is never more than twice the depth of the shallowest leaf.
  • Linux Implementation (rbtrees)

    • The Linux red-black tree is represented by struct rb_root (the tree root) and struct rb_node (the individual nodes).
    • Linux does not provide generic search or insert functions for rbtrees because C lacks generic programming constructs. Developers must write custom search and insert logic tailored to their data comparisons.
    • Custom insertion routines must locate the correct leaf node, insert the new node using rb_link_node(), and immediately call rb_insert_color() to rebalance the tree and maintain Red-Black properties.

Selecting the appropriate structure from linked lists, queues, maps, and rbtrees requires analyzing the operational needs and scaling characteristics of the specific workload.

Data Structure Selection & Algorithmic Complexity

Choosing the correct generic data structure relies on the primary access patterns of the subsystem. The scalability of these structures is expressed quantitatively through algorithmic complexity.

  • Selection Guidelines

    • Linked Lists: Use when the primary access method is iterating over all data. Iteration is inherently O(n) regardless of the data structure, so the simplest implementation is favored.
    • Queues: Use when data fits the producer/consumer pattern, requiring strict FIFO semantics and potentially fixed-size buffers.
    • Maps: Use when mapping a UID to an object is required, taking advantage of the idr structure’s automated UID allocation.
    • Red-Black Trees: Use when storing large amounts of data requiring rapid O(log n) lookups and efficient ordered traversal. They are too memory- and code-heavy for simple workloads without time-critical lookups.
  • Algorithmic Complexity

    • Big-O Notation: Represents the upper bound of an algorithm’s growth. Formally, f(n) is O(g(n)) if there exists a constant c such that f(n) ≤ c · g(n) for all sufficiently large n; for example, f(n) = 3n + 5 is O(n) because 3n + 5 ≤ 4n whenever n ≥ 5.
    • Big-Theta Notation: Represents the least upper bound, meaning g(n) serves as both an upper and a lower bound for f(n).
    • Common Time Complexities:
      • O(1): Constant time (perfect scalability; execution time is independent of input size, such as list_add()).
      • O(log n): Logarithmic time (e.g., Red-Black Tree lookups).
      • O(n): Linear time (e.g., Linked List traversal).
      • O(n²): Quadratic time.
      • O(2ⁿ): Exponential time.

Understanding and applying these complexities ensures that the selected data structure scales safely under expanding inputs across users, processes, and architectures.