RV32F and RV32D

Architectural Overview

  • The RV32F and RV32D extensions provide 32-bit (single-precision) and 64-bit (double-precision) floating-point capabilities, respectively.
  • These extensions strictly adhere to the IEEE 754-2008 floating-point standard.
  • Implementing floating-point capabilities requires dedicated hardware structures to manage computation state without polluting the primary integer datapath.

Floating-Point Registers and State

  • The architecture introduces 32 independent floating-point registers, designated f0 through f31.
  • Doubling register capacity and bandwidth via a separate register file improves performance without increasing the size of the register specifier in the instruction format.
  • The f0 register is writable and holds ordinary data, unlike the integer x0 register, which is hardwired to zero.
  • When both extensions are implemented, single-precision operations utilize only the lower 32 bits of the 64-bit f registers.
  • The floating-point control and status register (fcsr) maintains the global arithmetic configuration and records boundary conditions.
    • Rounding Modes (frm): Defines the mathematical rounding behavior.
      • Round to nearest, ties to even (RNE) serves as the most accurate and common default.
      • Alternative modes include round towards zero (RTZ), round down towards −∞ (RDN), round up towards +∞ (RUP), and round to nearest, ties to max magnitude (RMM).
      • Static rounding allows individual instructions to override the dynamic fcsr rounding mode via an optional argument, optimizing sequences that require a specific rounding behavior.
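    • Accrued Exception Flags (fflags): Five sticky flags record exceptional conditions: Invalid Operation (NV), Divide by Zero (DZ), Overflow (OF), Underflow (UF), and Inexact (NX). Setting a flag does not cause a trap; software reads fflags to find out what occurred.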
  • To utilize these dedicated registers, the architecture defines specific mechanisms to transfer data directly to and from memory and the integer datapath.
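
The fcsr fields described above can be pictured with a small decoder. This is only a sketch: the bit positions (frm in bits 7:5, fflags in bits 4:0) and the rounding-mode encodings are taken from my reading of the RISC-V unprivileged specification rather than from these notes, and decode_fcsr is a hypothetical helper name, not an instruction.

```python
# Assumed layout: fcsr[7:5] = frm, fcsr[4:0] = fflags,
# with NV, DZ, OF, UF, NX occupying bits 4 down to 0.
ROUNDING_MODES = {0b000: "RNE", 0b001: "RTZ", 0b010: "RDN", 0b011: "RUP", 0b100: "RMM"}
FLAG_BITS = {"NV": 4, "DZ": 3, "OF": 2, "UF": 1, "NX": 0}


def decode_fcsr(fcsr):
    """Split a raw fcsr value into (rounding mode name, list of raised flags)."""
    frm = (fcsr >> 5) & 0b111
    fflags = fcsr & 0b11111
    raised = [name for name, bit in FLAG_BITS.items() if fflags & (1 << bit)]
    return ROUNDING_MODES.get(frm, "reserved"), raised


print(decode_fcsr(0b001_00101))  # ('RTZ', ['OF', 'NX'])
```

Because the flags are accrued, a program would typically clear fflags, run a computation, and then decode the register once at the end.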

Memory Access and Register Transfers

  • Floating-point load and store instructions utilize the identical base addressing mode as integer operations, calculating the effective address by adding a 12-bit sign-extended immediate to a base register.
    • Loads: flw retrieves a 32-bit word; fld retrieves a 64-bit doubleword.
    • Stores: fsw writes a 32-bit word; fsd writes a 64-bit doubleword.
  • Instructions exist to transfer data directly between integer (x) and floating-point (f) registers without utilizing memory as an intermediary.
    • fmv.x.w copies the 32-bit pattern of a single-precision value from an f register into an x register, bit for bit and without conversion.
    • fmv.w.x copies a 32-bit pattern from an x register into an f register, again without conversion.
  • Once data is successfully loaded or transferred into the f registers, it is manipulated via specialized mathematical instructions.
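
As a rough model of the two mechanisms above, the following sketch computes a load/store effective address and mimics the bit-for-bit register moves. The function names simply mirror the mnemonics; effective_address is a hypothetical helper, and the 32-bit wraparound is an assumption appropriate to RV32.

```python
import struct


def effective_address(base, imm12):
    """Base register plus a sign-extended 12-bit immediate, wrapped to 32 bits."""
    imm12 &= 0xFFF
    offset = imm12 - 0x1000 if imm12 & 0x800 else imm12
    return (base + offset) & 0xFFFFFFFF


def fmv_x_w(value):
    """f -> x: return the raw 32-bit pattern of a single-precision value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def fmv_w_x(bits):
    """x -> f: reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


print(hex(effective_address(0x1000, 0xFFC)))  # 0xffc (immediate 0xFFC encodes -4)
print(hex(fmv_x_w(1.0)))                      # 0x3f800000
```

The moves differ from the fcvt conversions: moving the integer 1 with fmv.w.x yields a tiny subnormal pattern, not the value 1.0.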

Arithmetic Operations and Fused Computations

  • Standard arithmetic instructions support both precision levels: addition (fadd.s/d), subtraction (fsub.s/d), multiplication (fmul.s/d), division (fdiv.s/d), and square root (fsqrt.s/d).
  • Unlike integer multiplication, where the full product is twice as wide as its operands, a floating-point product is rounded to the same width as its source operands, so a single multiply instruction per precision suffices.
  • Minimum (fmin.s/d) and maximum (fmax.s/d) operations isolate the smaller or larger of two source operands and write the result directly to the destination register, eliminating the need for branching.
  • Fused Multiply-Add: Operations that require sequential multiplication and addition (or subtraction) utilize fused instructions for performance and precision gains.
    • Variants include fmadd (product plus addend), fmsub (product minus addend), fnmsub (negated product plus addend), and fnmadd (negated product minus addend).
    • These instructions utilize the specialized R4 instruction format to accommodate four register specifiers (three sources, one destination).
    • Fused instructions execute faster and maintain higher accuracy by performing a single rounding operation at the end of the full calculation, rather than rounding after both the multiplication and the addition.
  • Beyond raw mathematical transformation, calculation results must frequently be evaluated to determine subsequent program execution paths.
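
The benefit of rounding once can be seen in a small model. This sketch uses Python's exact Fraction arithmetic to stand in for the fused datapath and Python floats (binary64, so it corresponds to the .d variants). The sign conventions follow my reading of the RISC-V specification: fnmsub computes -(a*b) + c and fnmadd computes -(a*b) - c. The model ignores NaNs, infinities, and the sign of zero.

```python
from fractions import Fraction


def _round_once(exact):
    # float(Fraction) performs a single correctly rounded conversion to binary64.
    return float(exact)


def fmadd(a, b, c):   # (a * b) + c
    return _round_once(Fraction(a) * Fraction(b) + Fraction(c))


def fmsub(a, b, c):   # (a * b) - c
    return _round_once(Fraction(a) * Fraction(b) - Fraction(c))


def fnmsub(a, b, c):  # -(a * b) + c
    return _round_once(-(Fraction(a) * Fraction(b)) + Fraction(c))


def fnmadd(a, b, c):  # -(a * b) - c
    return _round_once(-(Fraction(a) * Fraction(b)) - Fraction(c))


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(a * b + c)       # separate multiply and add: the product rounds to 1.0 first
print(fmadd(a, b, c))  # fused: the exact answer -2**-60 survives
```

Here the exact product is 1 - 2**-60, which is not representable in double precision, so the unfused sequence loses the entire result while the fused one keeps it.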

Comparisons and Control Flow

  • The RV32F and RV32D extensions omit dedicated floating-point branch instructions to maintain datapath simplicity.
  • Instead, comparison instructions evaluate two floating-point registers and output a boolean 1 or 0 into a standard integer destination register.
    • Available comparisons evaluate equality (feq.s/d), strict less-than (flt.s/d), and less-than-or-equal-to (fle.s/d) conditions.
  • Standard integer branch instructions (introduced in the base ISA) then evaluate this integer register to execute the conditional jump.
  • Floating-point data frequently requires structural modification or type casting before it can be effectively evaluated or stored in other data structures.
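
The compare-then-branch pattern above can be sketched as follows. The comparison helpers return an integer 0 or 1, as the instructions do, and an ordinary integer test then decides the path. clamp_to_limit is a hypothetical example routine, and the assembly in its docstring is illustrative only.

```python
def feq(a, b):
    """Model of feq: 1 if a == b, else 0 (0 whenever either operand is NaN)."""
    return int(a == b)


def flt(a, b):
    """Model of flt: 1 if a < b, else 0."""
    return int(a < b)


def fle(a, b):
    """Model of fle: 1 if a <= b, else 0."""
    return int(a <= b)


def clamp_to_limit(value, limit):
    """Roughly: flt.s t0, limit, value ; beqz t0, skip ; fmv.s value, limit."""
    t0 = flt(limit, value)  # comparison result lands in an integer register
    if t0 != 0:             # a plain integer branch tests that register
        value = limit
    return value


print(clamp_to_limit(5.0, 3.0))  # 3.0
print(clamp_to_limit(1.0, 3.0))  # 1.0
```

One consequence of this design is that a NaN operand makes every comparison produce 0, so code that must distinguish "not less than" from "unordered" needs an additional check.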

Type Conversion, Sign Injection, and Classification

  • Data Conversion: The architecture provides exhaustive conversion capabilities between 32-bit signed/unsigned integers, 32-bit floating-point data, and 64-bit floating-point data.
    • Conversions utilize the fcvt family of instructions, explicitly defining both the source and target data types (e.g., fcvt.s.w converts a signed integer word to single-precision float).
  • Sign Injection: The fsgnj family of instructions copies a complete floating-point value from a source register while allowing independent manipulation of its sign bit.
    • fsgnj.s/d applies the exact sign bit from a second source register.
    • fsgnjn.s/d applies the inverted sign bit of a second source register.
    • fsgnjx.s/d applies the XOR result of the sign bits from both source registers.
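    • These hardware primitives implement useful pseudoinstructions without adding opcodes, in each case by naming the same register as both sources: absolute value (fabs uses fsgnjx, because a sign bit XORed with itself is always 0: 0 XOR 0 = 0 and 1 XOR 1 = 0), negation (fneg uses fsgnjn), and plain register moves (fmv uses fsgnj).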
  • Data Classification: The fclass.s/d instruction analyzes a floating-point operand and maps it to one of 10 standardized IEEE 754-2008 states.
    • The instruction outputs a 10-bit one-hot mask into an integer destination register.
    • Classifiable states include: −∞, negative normal, negative subnormal, −0, +0, positive subnormal, positive normal, +∞, signaling NaN, and quiet NaN.
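
The sign-injection and classification behaviour described above can be modelled on raw 32-bit single-precision patterns. This is a sketch under assumptions: the fclass bit numbering (bit 0 for −∞ up through bit 9 for quiet NaN, in the order listed above) follows my reading of the RISC-V specification, and bits is a hypothetical helper for building test patterns.

```python
import struct

SIGN = 0x80000000
MAGNITUDE = 0x7FFFFFFF


def bits(x):
    """Helper: 32-bit single-precision pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fsgnj(a, b):   # magnitude of a, sign of b
    return (a & MAGNITUDE) | (b & SIGN)


def fsgnjn(a, b):  # magnitude of a, inverted sign of b
    return (a & MAGNITUDE) | (~b & SIGN)


def fsgnjx(a, b):  # magnitude of a, sign of a XOR sign of b
    return a ^ (b & SIGN)


# Pseudoinstructions: the same register is used for both sources.
def fmv(a):
    return fsgnj(a, a)


def fneg(a):
    return fsgnjn(a, a)


def fabs(a):
    return fsgnjx(a, a)


def fclass(a):
    """Return the 10-bit one-hot class mask for a single-precision pattern."""
    sign = a >> 31
    exp = (a >> 23) & 0xFF
    frac = a & 0x7FFFFF
    if exp == 0xFF:
        if frac == 0:
            return 1 << (0 if sign else 7)             # -inf / +inf
        return 1 << (9 if frac & 0x400000 else 8)      # quiet / signaling NaN
    if exp == 0:
        if frac == 0:
            return 1 << (3 if sign else 4)             # -0 / +0
        return 1 << (2 if sign else 5)                 # subnormals
    return 1 << (1 if sign else 6)                     # normals


print(hex(fabs(bits(-1.5))))        # 0x3fc00000, the pattern of +1.5
print(bin(fclass(bits(-0.0))))      # 0b1000, bit 3 marks -0
```

Because the mask is one-hot, software can test for a group of classes with a single AND, for example masking bits 8 and 9 to detect any NaN.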