Instructions: Language of the Computer

Hardware Operations and Operands

The vocabulary of commands understood by a given computer architecture is known as its instruction set.

Design Principle 1: Simplicity favors regularity. Hardware operations demand a fixed number of operands, such as requiring exactly three operands for all arithmetic instructions (e.g., add a, b, c), to avoid the complexity of variable-operand hardware.
Design Principle 2: Smaller is faster. Processor data is held in a small number of fast, on-chip locations called registers. The 64-bit RISC-V architecture described here has 32 general-purpose registers, each 64 bits wide (a doubleword). Register x0 is hardwired to the constant zero, which lets ordinary instructions express useful variations, such as negating a value by subtracting it from x0.
Memory Operands: Complex data structures and arrays exceed register limits and must reside in main memory. Data transfer instructions (ld for load doubleword, sd for store doubleword) move data between memory and registers.
Addressing: Memory acts as a single-dimensional array accessed via an address calculated by adding a base register to a constant offset. RISC-V uses byte addressing in a little-endian format, meaning sequential doubleword addresses differ by 8, and the address points to the rightmost (least significant) byte.
Register Spilling: Programs usually have more variables than the machine has registers. Compilers keep the most frequently used variables in registers and place the rest in memory; moving less frequently used variables to memory is known as spilling registers.
Immediate Operands: Constants are frequently used in operations. By embedding these constant operands directly inside the instruction (e.g., addi), architectures bypass slow memory loads, adhering to the principle of making the common case fast.
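
The Memory Operands and Addressing items above can be illustrated with a small model. This is a minimal sketch in Python, not RISC-V code: the `Memory` class and its `ld`/`sd` methods are hypothetical names chosen to mirror the instructions, and the base address 16 is an arbitrary assumption.

```python
MASK64 = (1 << 64) - 1


class Memory:
    """Byte-addressed, little-endian memory (hypothetical toy model)."""

    def __init__(self, size):
        self.bytes = bytearray(size)

    def sd(self, value, base, offset):
        """Store doubleword: write 8 bytes at address base + offset."""
        addr = base + offset
        self.bytes[addr:addr + 8] = (value & MASK64).to_bytes(8, "little")

    def ld(self, base, offset):
        """Load doubleword: read 8 bytes at address base + offset."""
        addr = base + offset
        return int.from_bytes(self.bytes[addr:addr + 8], "little")


mem = Memory(64)
base = 16                              # assumed start address of an array A
mem.sd(0x1122334455667788, base, 8)    # A[1] sits 8 bytes past A[0]
value = mem.ld(base, 8)
lowest_byte = mem.bytes[base + 8]      # little-endian: least significant byte first
```

Reading `mem.bytes[base + 8]` directly shows the little-endian layout: the byte at the doubleword's own address is its least significant byte.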

The physical constraints of hardware that necessitate registers and memory also dictate how numerical data is encoded within them.

Data Representation

Numeric values are stored as fixed-length 64-bit binary patterns in RISC-V doublewords.

Unsigned Numbers: Used for non-negative integers and memory addresses, representing ranges from $0$ to $2^{64} - 1$.
Signed Numbers: Implemented using Two’s Complement notation to simplify hardware implementation.
- The most significant bit (MSB) serves as the sign bit: $0$ denotes positive, and $1$ denotes negative.
- Conversion between positive and negative involves inverting all bits and adding $1$.
Hardware Overflow: Finite word sizes mean operations can exceed maximum or minimum thresholds. Overflow occurs when the sign bit of the 64-bit two’s complement result differs from the sign of the exact (infinite-precision) result.
Sign Extension: When loading narrower data types into 64-bit registers, signed loads (lb, lh, lw) replicate the sign bit across all newly filled upper bits to preserve the numeric value. Unsigned loads (lbu, lhu, lwu) perform zero extension by filling upper bits with zeros.
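
The rules above (negation, overflow, and the two kinds of extension) can be checked with a short Python model. This is a sketch under the assumption of 64-bit registers; the helper names are illustrative and do not correspond to any RISC-V instruction or library.

```python
BITS = 64
MASK = (1 << BITS) - 1


def to_signed(pattern, bits=BITS):
    """Interpret a bit pattern as a two's complement signed integer."""
    pattern &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


def negate(pattern):
    """Invert all bits and add 1, staying within 64 bits."""
    return ((pattern ^ MASK) + 1) & MASK


def add_overflows(a, b):
    """True if the exact signed sum of a and b does not fit in 64 bits."""
    exact = to_signed(a) + to_signed(b)
    wrapped = to_signed((a + b) & MASK)
    return exact != wrapped


def sign_extend(value, from_bits):
    """Widen as lb/lh/lw do: copy the sign bit into the upper bits."""
    return to_signed(value, from_bits) & MASK


def zero_extend(value, from_bits):
    """Widen as lbu/lhu/lwu do: fill the upper bits with zeros."""
    return value & ((1 << from_bits) - 1)
```

For example, the byte 0x80 becomes 0xFFFFFFFFFFFFFF80 under `sign_extend` (still -128) but stays 0x80 under `zero_extend` (128).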

These binary data representations are manipulated by instructions, which are themselves encoded as binary numbers.

Machine Language and Instruction Formats

Human-readable assembly instructions are translated into exact 32-bit binary sequences called machine language.

Design Principle 3: Good design demands good compromises. Maintaining all instructions at a uniform 32-bit length simplifies hardware but conflicts with the need to specify large constants or multiple registers. The compromise is to use multiple instruction formats with distinct field layouts.
R-type (Register): Encodes arithmetic and logical operations. Fields include a 7-bit opcode, 5-bit rd (destination), 3-bit funct3, 5-bit rs1 (source 1), 5-bit rs2 (source 2), and 7-bit funct7.
I-type (Immediate): Encodes loads and immediate arithmetic. The rs2 and funct7 fields are replaced by a single contiguous 12-bit signed immediate field.
S-type (Store): Encodes stores. To ensure rs1 and rs2 remain in identical bit positions across all formats, the 12-bit immediate field is split into two separate segments (imm[11:5] and imm[4:0]).
Stored-Program Concept: Instructions and data are stored together in memory as numbers. The same memory hardware therefore serves both, and programs can be handled like any other data: a compiler's output is simply a file of numbers that the hardware can later execute.
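
To make the field layouts concrete, the following Python sketch packs the R-, I-, and S-type fields into 32-bit words. The function names are hypothetical; the opcode and funct3 values used in the examples (0x33 for add, 0x13 for addi, 0x23 with funct3 = 3 for sd) are the standard RISC-V base encodings.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """R-type: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


def encode_i(imm, rs1, funct3, rd, opcode):
    """I-type: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return (((imm & 0xFFF) << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


def encode_s(imm, rs2, rs1, funct3, opcode):
    """S-type: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode."""
    imm &= 0xFFF
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode)


def source_registers(word):
    """Read rs1 and rs2 from their fixed bit positions (15-19 and 20-24)."""
    return (word >> 15) & 0x1F, (word >> 20) & 0x1F


add_word = encode_r(0, 21, 20, 0, 9, 0x33)   # add  x9, x20, x21
addi_word = encode_i(10, 6, 0, 5, 0x13)      # addi x5, x6, 10
sd_word = encode_s(40, 5, 6, 3, 0x23)        # sd   x5, 40(x6)
```

Because rs1 and rs2 occupy the same bits in the R-type and S-type words, `source_registers` works on both without knowing the format, which is the reason the S-type immediate is split.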

Binary-encoded instructions provide the exact bit-level manipulations necessary for executing core logic.

Logical Operations

Logical operations manipulate fields of bits or individual bits within a 64-bit doubleword to pack and unpack data.

Shifts: Move bits left or right, padding empty spaces with zeros (sll, slli, srl, srli). Shift right arithmetic (sra, srai) fills vacated leftmost bits with copies of the sign bit. Shifting left by $i$ bits mathematically acts as a multiplication by $2^{i}$.
AND: Bit-by-bit operation (and, andi) yielding $1$ only if both source bits are $1$. Used to mask bits, forcing unwanted bits to $0$.
OR: Bit-by-bit operation (or, ori) yielding $1$ if either source bit is $1$. Used to insert bit fields into a word.
XOR: Bit-by-bit exclusive OR (xor, xori) yielding $1$ if the bits differ. A logical NOT is achieved by applying XOR against a sequence of $1$s.
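
The operations above can be modeled on 64-bit patterns in Python. This is an illustrative sketch: the function names echo the instruction mnemonics, while `extract_field` and `insert_field` are hypothetical helpers showing the mask (AND) and insert (OR) idioms.

```python
MASK64 = (1 << 64) - 1


def sll(x, i):
    """Shift left logical: vacated low bits become 0."""
    return (x << i) & MASK64


def srl(x, i):
    """Shift right logical: vacated high bits become 0."""
    return (x & MASK64) >> i


def sra(x, i):
    """Shift right arithmetic: vacated high bits copy the sign bit."""
    x &= MASK64
    signed = x - (1 << 64) if x >> 63 else x
    return (signed >> i) & MASK64


def extract_field(word, low, width):
    """Unpack: shift the field down, then AND with a mask of `width` ones."""
    return srl(word, low) & ((1 << width) - 1)


def insert_field(word, low, width, value):
    """Pack: clear the field with AND, then OR the new value into place."""
    mask = ((1 << width) - 1) << low
    cleared = word & (mask ^ MASK64)
    return cleared | ((value << low) & mask)


def bitwise_not(x):
    """NOT expressed as XOR against all ones."""
    return (x ^ MASK64) & MASK64
```

For instance, `sll(9, 4)` gives 144, matching $9 \times 2^{4}$, as long as no significant bits are shifted out of the doubleword.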

Beyond arithmetic and bitwise logic, the processor must evaluate these values to alter the flow of execution.

Control Flow and Decision Making

Decision-making is implemented through conditional branches and unconditional jumps based on data evaluation.

Conditional Branches: Test conditions and branch to an address if true. Includes branch if equal (beq), branch if not equal (bne), branch if less than (blt), and branch if greater than or equal (bge).
Unsigned Branches: Using the u suffix (bltu, bgeu), operands are treated as unsigned integers where the MSB contributes to magnitude rather than a negative sign.
Basic Blocks: Compilers schedule code into basic blocks: sequences of instructions lacking internal branches and internal branch targets. Branches connect these blocks to construct loops and if-else structures.
Unconditional Jumps: The jump-and-link instruction (jal) branches to a target address while saving the address of the next sequential instruction into a register (usually x1). The jump-and-link register instruction (jalr) enables indirect jumps to an address stored in a register, necessary for switch branch tables and procedure returns.
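
The difference between signed and unsigned branches is easiest to see on a single bit pattern. The Python sketch below models only the comparison each branch performs; `blt_taken` and `bltu_taken` are hypothetical names for "would the branch be taken".

```python
MASK64 = (1 << 64) - 1


def as_signed(x):
    """Interpret a 64-bit pattern as a two's complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def blt_taken(rs1, rs2):
    """blt: compare the operands as signed integers."""
    return as_signed(rs1) < as_signed(rs2)


def bltu_taken(rs1, rs2):
    """bltu: compare the same bit patterns as unsigned integers."""
    return (rs1 & MASK64) < (rs2 & MASK64)


all_ones = MASK64   # -1 when signed, 2**64 - 1 when unsigned
```

With `all_ones` in rs1 and 1 in rs2, `blt` is taken (since -1 < 1) while `bltu` is not (since $2^{64} - 1 > 1$).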

Branching mechanisms are the foundational building blocks for implementing complex procedure calls and functions.

Procedure Execution and Memory Allocation

Procedures (functions) provide abstraction by executing a targeted task and returning without perturbing the calling program’s state.

Execution Steps:
1. Place parameters in accessible registers.
2. Transfer control via jal.
3. Acquire local storage resources.
4. Perform the computational task.
5. Place return values in target registers.
6. Return control to the caller via jalr using the saved return address.
Register Conventions: x10-x17 act as argument and return value registers. Temporary registers (x5-x7, x28-x31) are not preserved by the callee. Saved registers (x8-x9, x18-x27) must be preserved by the callee before use and restored before return.
The Stack: A Last-In-First-Out memory queue used to spill registers and accommodate local arrays. The stack pointer (sp or x2) delineates the top of the stack and is adjusted downward to allocate space (push) and upward to free space (pop).
Procedure Frame: The bounded section of the stack containing a specific procedure’s saved registers and local variables, sometimes anchored by a frame pointer (fp or x8) for stable memory referencing.
Memory Map: Memory is structured into the Text segment (machine code), the Static data segment (globals and constants, tracked by global pointer gp or x3), and the Dynamic data segment (the heap), which grows upward toward the downward-growing stack.
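
The callee-save convention and the downward-growing stack can be sketched as follows. This is a toy Python model, not a real calling sequence: the `Stack` class, the `callee` function, and the starting address 0x1000 are assumptions made for illustration.

```python
class Stack:
    """Toy model of a downward-growing stack of 8-byte doublewords."""

    def __init__(self, top_address):
        self.sp = top_address
        self.memory = {}            # address -> stored doubleword

    def push(self, value):
        self.sp -= 8                # allocate: sp moves toward lower addresses
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory.pop(self.sp)
        self.sp += 8                # free: sp moves back up
        return value


def callee(regs, stack):
    """Uses saved register x20 as scratch and restores it before returning."""
    stack.push(regs["x20"])                  # spill the caller's x20
    regs["x20"] = regs["x10"] + regs["x11"]  # arguments arrive in x10, x11
    regs["x10"] = regs["x20"] * 2            # return value goes in x10
    regs["x20"] = stack.pop()                # restore x20 for the caller


regs = {"x10": 3, "x11": 4, "x20": 99}
stack = Stack(top_address=0x1000)
callee(regs, stack)
```

After the call, x10 holds the result, x20 still holds the caller's value, and sp is back where it started.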

Data stored in memory often includes non-numeric structures like text characters, requiring distinct addressing approaches.

Characters, Strings, and Wide Addresses

Instruction sets must handle variable-width structures like 8-bit text characters and absolute memory addresses that surpass instruction format limits.

Character Processing: Text is represented in 8-bit ASCII or up to 32-bit Unicode. Characters and strings are moved with byte loads (lbu), byte stores (sb), halfword loads (lh, lhu), and halfword stores (sh).
Load Upper Immediate (lui): Builds 32-bit constants using the U-type format. It writes a 20-bit constant into bits 12–31 of a register and sets bits 0–11 to zero; in a 64-bit register, the upper 32 bits are filled with copies of bit 31. A following addi supplies the low 12 bits. Because addi sign-extends its immediate, the lui constant must be increased by 1 whenever bit 11 of the target value is set.
PC-Relative Addressing: Used by conditional branches and unconditional jumps. The branch target is calculated as the sum of the current Program Counter (PC) and the constant offset encoded in the instruction, so the code works wherever it is loaded in memory (position-independent code).
Addressing Mode Classifications:
1. Immediate addressing: The operand is the embedded constant.
2. Register addressing: The operand is located in a general-purpose register.
3. Base addressing: The memory location is the sum of a register and a constant offset.
4. PC-relative addressing: The memory location is the sum of the PC and a constant offset.
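
Building a 32-bit constant from a lui/addi pair involves one subtlety: addi sign-extends its 12-bit immediate, so the upper part sometimes needs adjusting. The Python sketch below works through that arithmetic and also shows a PC-relative target calculation; all function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_constant(constant):
    """Split a 32-bit constant into (20-bit lui part, signed 12-bit addi part)."""
    low = sign_extend(constant, 12)            # what addi will actually add
    high = ((constant - low) >> 12) & 0xFFFFF  # compensates when low is negative
    return high, low


def lui_addi(high, low):
    """Model lui followed by addi on a 64-bit register."""
    reg = sign_extend(high << 12, 32)          # lui: bits 12-31, low 12 bits zero
    return (reg + low) & MASK64                # addi: add the signed low part


def pc_relative_target(pc, offset):
    """Branch or jump target: PC plus the signed offset from the instruction."""
    return (pc + offset) & MASK64
```

For 0x12345FFF, `split_constant` returns an upper part of 0x12346 and a lower part of -1, because the low 12 bits (0xFFF) are treated as -1 by addi.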

While individual instructions manage data effectively, modern architectures must safely coordinate execution across parallel instruction streams.

Synchronization and Parallelism

Parallel processing introduces challenges when independent tasks must coordinate access to shared memory.

Data Races: Occur when two threads access the same memory location without synchronization and at least one of the accesses is a write. The result then depends on the order in which the accesses happen, so program behavior becomes nondeterministic.
Atomic Exchange: Synchronization requires atomic memory operations: a read and a write bound into a single, uninterruptible sequence.
Load-Reserved / Store-Conditional: RISC-V implements atomicity with paired instructions. The load-reserved (lr.d) reads a memory location. The subsequent store-conditional (sc.d) stores a new value to that exact location. If the location was modified by another processor between the two instructions, the sc.d fails, writes a non-zero error code to its target register, and prevents the memory mutation.
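
The retry pattern built from lr.d and sc.d can be modeled in a few lines. This Python sketch is a single-threaded simulation, not real concurrency: `ReservedMemory`, the `hart` identifiers, and the `interfere` callback (which stands in for another processor writing between the two instructions) are all assumptions made for illustration.

```python
class ReservedMemory:
    """Toy one-location memory that tracks a single reservation."""

    def __init__(self, value=0):
        self.value = value
        self.reserved_by = None

    def load_reserved(self, hart):
        """Model lr.d: read the value and record a reservation."""
        self.reserved_by = hart
        return self.value

    def store(self, value):
        """Ordinary store by some other processor: breaks the reservation."""
        self.value = value
        self.reserved_by = None

    def store_conditional(self, hart, value):
        """Model sc.d: return 0 on success, nonzero on failure."""
        if self.reserved_by != hart:
            return 1                  # reservation lost, memory unchanged
        self.value = value
        self.reserved_by = None
        return 0


def atomic_exchange(mem, hart, new_value, interfere=None):
    """Swap new_value into mem, retrying until the store-conditional succeeds."""
    while True:
        old = mem.load_reserved(hart)
        if interfere is not None:
            interfere(mem)            # simulated write from another processor
            interfere = None          # interfere only on the first attempt
        if mem.store_conditional(hart, new_value) == 0:
            return old
```

When the simulated interference occurs, the first store-conditional fails and the loop retries, so the value returned is the one the other processor wrote rather than the stale original.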

Before any parallel or sequential instructions execute, high-level code must be translated into these hardware-readable primitives.

Program Translation and Execution Hierarchy

The translation hierarchy dictates how human-readable source code is transformed into functioning hardware signals.

Compiler: Transforms high-level statements (e.g., C code) into symbolic assembly language, assigning variables to registers and organizing memory allocations.
Assembler: Translates symbolic assembly instructions into binary machine language. It simplifies programming by offering pseudoinstructions that are not implemented in hardware (e.g., translating mv x10, x11 into addi x10, x11, 0, or li x9, 123 into addi x9, x0, 123). It generates a symbol table connecting labels to their calculated memory addresses.
Linker: Stitches independently assembled object modules and pre-compiled library routines into a single executable file. It places the modules in memory, determines the addresses of data and instruction labels, and patches internal and external address references accordingly.
Loader: Transfers the finalized executable file from disk to main memory, initializes the stack pointer and general registers, and jumps to a startup routine that triggers the main program.
Java Translation: Java compilers translate code into hardware-independent Java bytecodes. A software interpreter called the Java Virtual Machine (JVM) executes these bytecodes on the host machine. For enhanced performance, Just In Time (JIT) compilers translate frequently executed bytecodes into native machine language during runtime.
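
Two of the assembler's jobs described above, expanding pseudoinstructions and building a symbol table, can be sketched in Python. This is a deliberately simplified model rather than a real assembler: it handles only `mv` and `li` with small immediates, assumes every instruction is 4 bytes, and uses hypothetical function names.

```python
def expand_pseudo(instr):
    """Expand a few pseudoinstructions into real RISC-V instructions."""
    op, *args = instr.replace(",", " ").split()
    if op == "mv":        # mv rd, rs  ->  addi rd, rs, 0
        return f"addi {args[0]}, {args[1]}, 0"
    if op == "li":        # li rd, imm ->  addi rd, x0, imm (12-bit imm only)
        return f"addi {args[0]}, x0, {args[1]}"
    return instr


def first_pass(lines, base=0):
    """Return (symbol table, label-free program) for a list of source lines."""
    symbols, program, address = {}, [], base
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = address   # label marks the next instruction
        elif line:
            program.append(expand_pseudo(line))
            address += 4                   # fixed 32-bit instruction length
    return symbols, program


source = [
    "main:",
    "li x5, 10",
    "loop:",
    "addi x5, x5, -1",
    "bne x5, x0, loop",
    "mv x10, x5",
]
symbols, program = first_pass(source)
```

Here `symbols` maps `main` to address 0 and `loop` to address 4; a second pass (not shown) would use that table to turn the `loop` reference in `bne` into a PC-relative offset.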

Understanding the standard RISC-V translation and execution process provides a comparative baseline for analyzing alternative instruction set architectures.

Other Architectures and Extensions

The foundational principles of RISC-V are shared by other architectures, though divergent commercial paths have produced varying structural philosophies.

MIPS: Originating in the 1980s, MIPS shares the same core design philosophy as RISC-V, featuring 32-bit fixed-length instructions, a load-store architecture, and 32 general-purpose registers.
Intel x86 (CISC): The x86 architecture evolved over 40 years, adding new features iteratively. It operates as a Complex Instruction Set Computer, with instruction lengths varying drastically from 1 to 15 bytes. It relies on two-operand formats where one operand acts as both source and destination, and allows arithmetic operations directly on memory. To achieve modern performance, x86 hardware dynamically translates these complex instructions into simpler, RISC-like micro-operations before execution.
RISC-V Extensions: RISC-V separates its vocabulary into a minimal base integer architecture (I) and several modular extensions. Standard extensions include Integer Multiply/Divide (M), Atomic operations (A), Single-precision floating point (F), Double-precision floating point (D), and Compressed 16-bit instructions (C) designed to reduce executable sizes for embedded systems.
