RV32A: Atomic Instructions
Synchronization Primitives Architecture
The RV32A extension provides atomic instructions to support synchronization in multiprocessor environments. The architecture partitions atomic operations into two distinct mechanisms: Load Reserved / Store Conditional (LR/SC) and Atomic Memory Operations (AMO). All RV32A instructions require naturally aligned memory addresses, as hardware cannot efficiently guarantee atomicity across cache-line boundaries.
Selecting between these two mechanisms depends on whether the system requires universal synchronization logic or high-performance multi-node scalability.

Load Reserved and Store Conditional (LR/SC)
LR/SC implements an atomic operation across two separate, linked instructions.
- Load Reserved (lr.w): Reads a 32-bit word from memory, writes it to a destination register, and registers a load reservation on that memory address.
- Store Conditional (sc.w): Attempts to store a word to a target address, provided a valid load reservation currently exists for that address.
  - Writes 0 to the destination register if the store succeeds.
  - Writes a nonzero error code to the destination register if the store fails.
This split-instruction design synthesizes universal synchronization primitives, such as compare-and-swap (CAS). A native CAS instruction inherently requires three source registers: a memory address, an expected value, and a new swap value. LR/SC avoids three-operand instructions, preserving the simplicity of the standard integer datapath and instruction formats while enabling complex synchronization routines.
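The way CAS falls out of the LR/SC pair can be sketched with a small software model in C. This is a single-hart illustration only, not hardware behavior: the names reservation_t, model_lr_w, model_sc_w, model_remote_store, and model_cas are hypothetical, and the failure code 1 is simply one possible nonzero value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one hart's reservation state. */
typedef struct {
    bool valid;            /* true while a reservation is held */
    const uint32_t *addr;  /* address the reservation covers */
} reservation_t;

/* Models lr.w: load the word and register a reservation on its address. */
static uint32_t model_lr_w(reservation_t *res, const uint32_t *addr)
{
    assert(((uintptr_t)addr & 3u) == 0); /* word must be naturally aligned */
    res->valid = true;
    res->addr = addr;
    return *addr;
}

/* Models sc.w: store only if a matching reservation exists.
   Returns 0 on success and a nonzero code (1 in this sketch) on failure. */
static uint32_t model_sc_w(reservation_t *res, uint32_t *addr, uint32_t value)
{
    bool ok = res->valid && res->addr == addr;
    res->valid = false; /* the reservation is consumed either way */
    if (!ok)
        return 1;
    *addr = value;
    return 0;
}

/* Models a write by another hart: it breaks a reservation on that address. */
static void model_remote_store(reservation_t *res, uint32_t *addr, uint32_t value)
{
    if (res->valid && res->addr == addr)
        res->valid = false;
    *addr = value;
}

/* Compare-and-swap synthesized from the lr/sc pair. */
static bool model_cas(reservation_t *res, uint32_t *addr,
                      uint32_t expected, uint32_t desired)
{
    for (;;) {
        if (model_lr_w(res, addr) != expected)
            return false;   /* current value differs: CAS fails */
        if (model_sc_w(res, addr, desired) == 0)
            return true;    /* store landed with the reservation intact */
        /* reservation was lost between load and store: retry */
    }
}
```

In this model a call such as model_cas(&res, &word, 5, 7) succeeds only when word still holds 5. The expected and new values live in ordinary registers across two instructions, so neither instruction needs three source operands.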
While LR/SC maintains datapath simplicity, highly parallel systems benefit from instructions that execute entire read-modify-write operations within a single bus transaction.
Atomic Memory Operations (AMOs)
AMOs execute a read-modify-write sequence as a single, indivisible hardware operation. The architecture guarantees that no interrupts or remote processor modifications can occur between the memory read and the memory write.
- Execution Flow:
  - Reads the current value at the target memory address.
  - Performs a designated ALU operation between the retrieved memory value and a source register.
  - Writes the modified value back to the memory address.
  - Stores the original memory value into the destination register.
- Supported AMO Instructions:
  - amoadd.w: Add
  - amoand.w, amoor.w, amoxor.w: Bitwise AND, OR, XOR
  - amoswap.w: Swap
  - amomin.w, amomax.w: Minimum, Maximum (signed)
  - amominu.w, amomaxu.w: Minimum, Maximum (unsigned)
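The execution flow and the operation list above can be illustrated with a sequential C model. It is a sketch only: real hardware performs these steps indivisibly, whereas this code just runs them in order on one thread, and the names amo_op_t, amo_alu, and model_amo_w are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selector, one entry per AMO listed above. */
typedef enum {
    AMO_ADD, AMO_AND, AMO_OR, AMO_XOR, AMO_SWAP,
    AMO_MIN, AMO_MAX, AMO_MINU, AMO_MAXU
} amo_op_t;

/* The ALU step: combine the loaded memory value with the source register. */
static uint32_t amo_alu(amo_op_t op, uint32_t mem, uint32_t src)
{
    /* Signed views of the same bits, used by the signed min/max variants. */
    int32_t smem = (int32_t)mem;
    int32_t ssrc = (int32_t)src;

    switch (op) {
    case AMO_ADD:  return mem + src;
    case AMO_AND:  return mem & src;
    case AMO_OR:   return mem | src;
    case AMO_XOR:  return mem ^ src;
    case AMO_SWAP: return src;
    case AMO_MIN:  return smem < ssrc ? mem : src;
    case AMO_MAX:  return smem > ssrc ? mem : src;
    case AMO_MINU: return mem < src ? mem : src;
    case AMO_MAXU: return mem > src ? mem : src;
    }
    return mem; /* unreachable for valid selectors */
}

/* Models one AMO: read, compute, write back, return the original value. */
static uint32_t model_amo_w(amo_op_t op, uint32_t *addr, uint32_t src)
{
    assert(((uintptr_t)addr & 3u) == 0); /* word must be naturally aligned */
    uint32_t old = *addr;                /* 1. read current memory value */
    *addr = amo_alu(op, old, src);       /* 2-3. ALU op, then write back */
    return old;                          /* 4. original value goes to rd */
}
```

The signed and unsigned variants differ only in how the same 32 bits are compared: with memory holding 0xFFFFFFFF and a source of 1, AMO_MIN keeps 0xFFFFFFFF (read as -1), while AMO_MINU stores 1.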
AMOs scale more efficiently in large multiprocessor systems than LR/SC retry loops. They suit reduction operations and streamline I/O device communication by performing an atomic read and write in a single bus transaction.
Both LR/SC and AMOs dictate how individual memory locations update atomically, but controlling the visibility of these updates across multiple executing threads requires strict memory ordering constraints.
Memory Consistency and Ordering
RISC-V utilizes a relaxed memory consistency model, meaning harts (hardware threads) may observe memory accesses out of program order. Strict ordering for critical sections is enforced via two dedicated annotation bits present in the encoding of all RV32A instructions:
- Acquire (aq): When set, no memory access that follows the atomic operation in program order on the same hart can be observed to take place before it.
- Release (rl): When set, the atomic operation cannot be observed by other harts before any memory access that precedes it in program order on the same hart.
These bits map directly onto lock operations. A lock acquisition sets the aq bit so that accesses to the guarded data cannot be observed before the lock is held, while a lock release sets the rl bit so that all modifications to the guarded data are visible before the lock is relinquished.
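The acquire and release roles can be sketched as a spinlock using C11 atomics, whose memory_order_acquire and memory_order_release arguments express the same intent as aq and rl. The type spinlock_t and its functions are hypothetical names. How a particular toolchain lowers these calls is an assumption here: an acquiring exchange may become amoswap.w.aq, but the release store may be emitted as a fence plus an ordinary store rather than an AMO with rl set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 means free, 1 means held. */
typedef struct {
    _Atomic uint32_t locked;
} spinlock_t;

static void spinlock_init(spinlock_t *l)
{
    atomic_init(&l->locked, 0u);
}

/* One acquisition attempt: atomically swap in 1 with acquire ordering.
   The lock was free, and is now ours, if the old value was 0. */
static bool spinlock_try_acquire(spinlock_t *l)
{
    return atomic_exchange_explicit(&l->locked, 1u,
                                    memory_order_acquire) == 0u;
}

/* Spin until the swap observes a free lock. */
static void spinlock_acquire(spinlock_t *l)
{
    while (!spinlock_try_acquire(l)) {
        /* busy-wait */
    }
}

/* Release ordering makes earlier writes to guarded data visible
   before the lock is seen as free. */
static void spinlock_release(spinlock_t *l)
{
    /* Debug check: only a held lock should be released. */
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1u);
    atomic_store_explicit(&l->locked, 0u, memory_order_release);
}
```

A critical section then sits between spinlock_acquire(&lock) and spinlock_release(&lock). The acquire ordering keeps reads of the guarded data from moving ahead of the lock, and the release ordering keeps writes to it from moving past the unlock.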