Linux Kernel Portability
An operating system’s portability is governed by the tradeoff between abstract, machine-independent interfaces and highly customized, architecture-specific optimizations. The Linux kernel strikes a balance by maintaining architecture-independent C code for core interfaces and delegating performance-critical, low-level routines to architecture-specific assembly (located in the arch/ hierarchy).
Hardware architecture diversity requires the kernel to strictly abstract physical constraints, beginning with the fundamental unit of processor data: the word.
Word Size and Data Types
A word is the amount of data a processor can process in a single operation. It dictates the size of general-purpose registers, the width of the memory bus, and the virtual memory address space.
- Standard C Type Sizes:
- char is strictly 1 byte.
- short is 16 bits on the architectures Linux supports.
- int is typically 32 bits, but this is not guaranteed by the C standard.
- long matches the system word size (defined by the BITS_PER_LONG macro).
- Pointers match the system word size.
- Operating System Data Models:
- LP64: long and pointer types are 64-bit; int remains 32-bit. This is the standard data model for 64-bit Linux architectures.
- ILP32: int, long, and pointer types are all 32-bit. This is the standard data model for 32-bit Linux architectures.
- LLP64: int and long are 32-bit; pointers are 64-bit (used by Windows, but not Linux).
- Development Rules:
- Never assume sizeof(int) == sizeof(long).
- Never assume pointer size equals int size.
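The rules above can be illustrated with a minimal userspace sketch. The names EXAMPLE_BITS_PER_LONG, pointer_round_trips, and is_lp64 are hypothetical and are not kernel interfaces. The sketch uses uintptr_t from <stdint.h> as the pointer-sized integer; kernel code conventionally uses unsigned long for the same purpose, since long matches the word size on Linux.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width of long in bits; a userspace stand-in for the kernel's BITS_PER_LONG. */
#define EXAMPLE_BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))

/* uintptr_t must be wide enough to hold a pointer; int gives no such promise. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t must be able to hold a pointer");

/*
 * Pattern to avoid: casting a pointer to int or unsigned int truncates the
 * address wherever pointers are wider than int (for example, under LP64).
 *
 * Portable alternative: round-trip the pointer through uintptr_t.
 * Returns 1 if the pointer survives the conversion unchanged.
 */
static int pointer_round_trips(const void *ptr)
{
    uintptr_t as_integer = (uintptr_t)ptr;

    return (const void *)as_integer == ptr;
}

/* Returns 1 when the compiler uses the LP64 data model, 0 otherwise. */
static int is_lp64(void)
{
    return sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8;
}
```

On a 64-bit Linux build is_lp64() is expected to return 1, and on a 32-bit (ILP32) build it returns 0; code that behaves correctly in both cases has not baked in either assumption.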
Standard C types vary by architecture, necessitating specialized kernel types to enforce explicit sizes and restrict direct access to complex data structures.
Opaque and Explicit Types
The kernel uses specialized types to mask internal architecture variations and ensure cross-platform compatibility.
- Opaque Types:
- Hide internal structure and size formats to prevent improper casting or direct manipulation.
- Examples include pid_t (process IDs), atomic_t (atomic integers), dev_t, uid_t, and gid_t.
- Usage requires strict adherence to designated interfaces rather than standard C operators.
- Explicitly Sized Types:
- Ensure exact bit widths for hardware, networking, and binary file interactions.
- Kernel-space definitions: s8, u8, s16, u16, s32, u32, s64, u64 (e.g., u32 is an unsigned 32-bit integer).
- User-space exported definitions: Prefixed with __ to protect namespaces (e.g., __u32).
- Signedness of Characters:
- The char type is signed by default on most architectures (range -128 to 127), but unsigned by default on others like ARM (range 0 to 255).
- Variables storing explicit numeric values must be explicitly declared as signed char or unsigned char.
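A short userspace sketch of the last two groups of rules follows. The typedefs stand in for the kernel's u8 and u32 (an assumption: kernel code gets these from its own headers rather than <stdint.h>), and struct example_header and the helper functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's explicitly sized types. */
typedef uint8_t  u8;
typedef uint32_t u32;

static_assert(sizeof(u32) == 4, "u32 must be exactly 32 bits");

/* Hypothetical on-wire header: every field has an explicit width. */
struct example_header {
    u32 length;
    u8  version;
    u8  flags;
};

/* Reports whether plain char is signed with this compiler and architecture. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/*
 * Pitfall: the result depends on the signedness of plain char.
 * Where char is signed, the stored value becomes negative and the
 * comparison fails; where char is unsigned, it succeeds.
 */
static int plain_char_holds_255(void)
{
    char c = (char)255;

    return c == 255;
}

/* Portable: an explicitly unsigned char always reads back 255. */
static int unsigned_char_holds_255(void)
{
    unsigned char c = 255;

    return c == 255;
}
```

Only unsigned_char_holds_255() gives the same answer on every architecture, which is why numeric byte values should carry an explicit signed or unsigned qualifier.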
Defining exact data sizes ensures structural integrity, but mapping these structures into physical memory introduces strict boundary constraints.
Data Alignment and Structure Padding
Data alignment refers to placing data at memory addresses that are multiples of the data’s size. A data type of size 2^n bytes must reside at an address whose least significant n bits are zero.
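The least-significant-bits rule can be checked with a simple mask. The helper below is a hypothetical sketch, not a kernel interface, and it assumes the size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 when addr is a multiple of size, 0 otherwise.
 * size must be a power of two, so that (size - 1) is a mask
 * covering exactly the low bits that must be zero.
 */
static int is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);

    return (addr & (size - 1)) == 0;
}
```

For example, a 4-byte value at address 0x1000 is naturally aligned, while the same value at 0x1002 is not, because one of its two low bits is set.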
- Alignment Rules:
- Base Types: Naturally aligned by the compiler. Depending on the architecture, accessing misaligned data either triggers a processor trap or incurs a significant performance penalty.
- Arrays: Inherit the alignment of their base type.
- Unions: Inherit the alignment of their largest included type.
- Structures: Aligned such that arrays of the structure maintain the natural alignment of every internal element.
- Structure Padding:
- The compiler injects padding bytes between structure members to satisfy alignment constraints.
- Padding increases the memory footprint calculated by sizeof().
- ANSI C prohibits the compiler from automatically reordering structure members.
- Developers must manually reorder members (usually descending by size) to minimize padding waste, unless a specific hardware or binary layout is strictly required.
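The effect of member ordering can be seen in a small sketch. The two structures below are hypothetical and hold the same four members; the sizes in the comments are what a typical ABI with 4-byte alignment for 32-bit integers produces, not a guarantee of the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members in an arbitrary order: the compiler must insert padding. */
struct padded {
    uint8_t  a;   /* offset 0, followed by 3 bytes of padding */
    uint32_t b;   /* offset 4 */
    uint8_t  c;   /* offset 8, followed by 1 byte of padding */
    uint16_t d;   /* offset 10 */
};                /* typically 12 bytes */

/* Same members sorted by descending size: no padding is needed. */
struct reordered {
    uint32_t b;   /* offset 0 */
    uint16_t d;   /* offset 4 */
    uint8_t  a;   /* offset 6 */
    uint8_t  c;   /* offset 7 */
};                /* typically 8 bytes */

static_assert(sizeof(struct reordered) <= sizeof(struct padded),
              "sorting members by size should never grow the structure");

/* Bytes saved per instance by reordering the members. */
static size_t padding_bytes_saved(void)
{
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

The saving is per instance, so it matters most for structures that are allocated in large numbers. When the layout is dictated by hardware or a binary format, the original order has to stay.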
While alignment dictates the address boundaries of data, the internal arrangement of bytes within those boundaries relies entirely on processor byte ordering.
Byte Order (Endianness)
Byte ordering determines how multi-byte words are stored in physical memory.
- Big-Endian:
- The most significant byte is stored at the lowest memory address.
- Historically the default on many RISC architectures, such as SPARC and PowerPC.
- Little-Endian:
- The least significant byte is stored at the lowest memory address.
- Standard for x86 architectures.
- Kernel Byte Order Macros:
- The kernel defines __BIG_ENDIAN or __LITTLE_ENDIAN in <asm/byteorder.h>.
- Conversion macros safely transition data between processor-native ordering and specific target orderings: __cpu_to_be32(), __cpu_to_le32(), __be32_to_cpu(), and __le32_to_cpu().
- If the native byte order matches the target byte order, these macros compile to no-ops.
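The behavior of the conversion macros can be sketched in userspace. The functions below are hypothetical stand-ins, not the kernel's implementation: they detect the byte order at run time, whereas the kernel selects the right variant at compile time so that the matching case costs nothing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unconditionally reverses the four bytes of a 32-bit value. */
static uint32_t example_swab32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* Returns 1 if the lowest address holds the least significant byte. */
static int cpu_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Native order to big-endian: swap on little-endian CPUs, no-op otherwise. */
static uint32_t example_cpu_to_be32(uint32_t x)
{
    return cpu_is_little_endian() ? example_swab32(x) : x;
}

/* Big-endian to native order: the same operation in the other direction. */
static uint32_t example_be32_to_cpu(uint32_t x)
{
    return cpu_is_little_endian() ? example_swab32(x) : x;
}

/* Usage: a value converted to big-endian and back must be unchanged. */
static void example_round_trip_check(void)
{
    assert(example_be32_to_cpu(example_cpu_to_be32(0x12345678u)) == 0x12345678u);
}
```

After example_cpu_to_be32(), the most significant byte sits at the lowest address regardless of the host, which is the property that wire formats and on-disk layouts rely on.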
Beyond physical memory layout, architectural differences also mandate strict abstractions for spatial and temporal hardware configurations.
Time and Page Size
Hardcoded assumptions about system timing and memory paging immediately break code when ported across architectures.
- Time:
- Timer interrupt frequencies vary widely across architectures and configurations (e.g., from 100 Hz to 1000 Hz).
- Code must never hardcode interrupt frequencies.
- Time intervals must be scaled using the HZ macro (e.g., half a second is represented as HZ/2).
- Page Size:
- Physical page sizes vary (e.g., 4 KB on x86-32, 8 KB on Alpha, and up to 64 KB on certain configurations).
- Memory sizes must be designated using the PAGE_SIZE macro.
- Address shifts must be calculated using the PAGE_SHIFT macro.
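The following sketch shows how code written against these macros stays correct when the underlying values change. The EXAMPLE_ constants are assumed values chosen for illustration; in the kernel, HZ, PAGE_SIZE, and PAGE_SHIFT are supplied by the architecture and configuration, and the helper names here are hypothetical.

```c
#include <assert.h>

/* Assumed values for illustration only. */
#define EXAMPLE_HZ          250
#define EXAMPLE_PAGE_SHIFT  12
#define EXAMPLE_PAGE_SIZE   (1UL << EXAMPLE_PAGE_SHIFT)

static_assert((EXAMPLE_PAGE_SIZE & (EXAMPLE_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Half a second in timer ticks, whatever the tick rate is. */
static unsigned long half_second_in_ticks(void)
{
    return EXAMPLE_HZ / 2;
}

/* Index of the page that contains addr. */
static unsigned long addr_to_page_number(unsigned long addr)
{
    return addr >> EXAMPLE_PAGE_SHIFT;
}

/* Offset of addr within its page. */
static unsigned long offset_in_page(unsigned long addr)
{
    return addr & (EXAMPLE_PAGE_SIZE - 1);
}

/* Number of pages needed to hold nbytes, rounding up. */
static unsigned long pages_needed(unsigned long nbytes)
{
    return (nbytes + EXAMPLE_PAGE_SIZE - 1) >> EXAMPLE_PAGE_SHIFT;
}
```

Changing EXAMPLE_HZ or EXAMPLE_PAGE_SHIFT changes every derived result consistently, whereas a literal such as 4096 or 100 scattered through the code would silently become wrong.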
Unifying these spatial and temporal abstractions across hardware requires mitigating unpredictable processor-level optimizations that execute concurrently.
Processor Ordering and Concurrency Assumptions
Writing portable code requires designing for the most pessimistic operational parameters across all supported architectures.
- Processor Ordering:
- Architectures utilize varying degrees of processor ordering; some execute instructions strictly sequentially, while others aggressively reorder loads and stores for performance.
- Ordering dependencies must be enforced using memory barriers (rmb(), wmb(), mb()) so that loads and stores become visible in the order the code’s logic requires on all processors.
- Universal Concurrency Assumptions:
- SMP Safety: Code must always assume it runs on a symmetric multiprocessing (SMP) system and utilize appropriate spinlocks or mutexes.
- Preempt Safety: Code must assume kernel preemption is enabled and utilize preemption disabling macros when handling localized processor data.
- High Memory Safety: Code must assume the presence of high memory (physical memory not permanently mapped into the kernel address space) and dynamically map pages using kmap() when required.
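The role of memory barriers in the processor ordering rules can be sketched in userspace with C11 atomics. This is an analogy under stated assumptions, not kernel code: the release fence plays roughly the role of wmb() on the writer side and the acquire fence roughly the role of rmb() on the reader side, and publish() and try_consume() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* plain data published by the writer */
static atomic_int ready;   /* flag telling the reader the data is valid */

/*
 * Writer: store the data, issue a release fence, then set the flag.
 * The fence keeps the payload store from being reordered after the
 * flag store.
 */
static void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/*
 * Reader: check the flag, issue an acquire fence, then read the data.
 * Returns 1 and fills *out if the data has been published, else 0.
 */
static int try_consume(int *out)
{
    assert(out != NULL);

    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;

    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

Without the two fences, a weakly ordered processor could let the reader observe the flag before the payload, even though the source code writes them in the opposite order. On a strongly ordered processor the fences cost little, which is why portable code includes them unconditionally.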