OpenTitan Big Number Accelerator (OTBN) Technical Specification

Overview

This document specifies the functionality of the OpenTitan Big Number Accelerator, or OTBN. OTBN is a coprocessor for asymmetric cryptographic operations like RSA or Elliptic Curve Cryptography (ECC).

This module conforms to the Comportable guideline for peripheral functionality. See that document for integration overview within the broader top level system.

Features

  • Processor optimized for wide integer arithmetic
  • 32b wide control path with 32 32b wide registers
  • 256b wide data path with 32 256b wide registers
  • Full control-flow support with conditional branch and unconditional jump instructions, hardware loops, and hardware-managed call/return stacks.
  • Reduced, security-focused instruction set architecture for easier verification and the prevention of data leaks.
  • Built-in access to random numbers.

Description

OTBN is a processor specialized for the execution of security-sensitive asymmetric (public-key) cryptography code, such as RSA or ECC. Such algorithms are dominated by wide integer arithmetic, which is supported by OTBN’s 256b wide data path, registers, and instructions that operate on these wide data words. The control flow, on the other hand, is clearly separated from the data and reduced to a minimum to avoid data leakage.

The data OTBN processes is security-sensitive, and the processor design centers around that. The design is kept as simple as possible to reduce the attack surface and aid verification and testing. For example, no interrupts or exceptions are included in the design, and all instructions are designed to be executable within a single cycle.

OTBN is designed as a self-contained co-processor with its own instruction and data memory, which is accessible as a bus device.

Compatibility

OTBN is not designed to be compatible with other cryptographic accelerators. It received some inspiration from assembly code available from the Chromium EC project, which has been formally verified within the Fiat Crypto project.

Instruction Set

OTBN is a processor with a custom instruction set. The full ISA description can be found in our ISA manual. The instruction set is split into two groups:

  • The base instruction subset operates on the 32b General Purpose Registers (GPRs). Its instructions are used for the control flow of an OTBN application. The base instructions are inspired by RISC-V’s RV32I instruction set, but not compatible with it.
  • The big number instruction subset operates on 256b Wide Data Registers (WDRs). Its instructions are used for data processing.

Processor State

General Purpose Registers (GPRs)

OTBN has 32 General Purpose Registers (GPRs), each of which is 32b wide. The GPRs are defined in line with RV32I and are mainly used for control flow. They are accessed through the base instruction subset. GPRs aren’t used by the main data path; this operates on the Wide Data Registers, a separate register file, controlled by the big number instructions.

x0 Zero register. Reads as 0; writes are ignored.
x1 Access to the call stack
x2 ... x31 General purpose registers

Note: Currently, OTBN has no “standard calling convention,” and GPRs other than x0 and x1 can be used for any purpose. If a calling convention is needed at some point, it is expected to be aligned with the RISC-V standard calling conventions, and the roles assigned to registers in that convention. Even without an agreed-on calling convention, software authors are encouraged to follow the RISC-V calling convention where it makes sense. For example, good choices for temporary registers are x6, x7, x28, x29, x30, and x31.

Call Stack

OTBN has an in-built call stack which is accessed through the x1 GPR. This is intended to be used as a return address stack, containing return addresses for the current stack of function calls. See the documentation for JAL and JALR for a description of how to use it for this purpose.

The call stack has a maximum depth of 8 elements. Each instruction that reads from x1 pops a single element from the stack. Each instruction that writes to x1 pushes a single element onto the stack. An instruction that reads from an empty stack or writes to a full stack causes a CALL_STACK software error.

A single instruction can both read from and write to the stack. In this case, the read is ordered before the write. Provided the stack has at least one element, this is allowed, even if the stack is full.
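The push/pop rules above can be sketched as a small software model. This is an illustrative Python sketch only, not the hardware implementation; the class and method names (CallStack, access, CallStackError) are hypothetical and chosen for this example.

```python
class CallStackError(Exception):
    """Stands in for the CALL_STACK software error (hypothetical name)."""


class CallStack:
    """Toy model of the call stack behind x1; not the hardware design."""

    MAX_DEPTH = 8  # maximum depth stated in the text

    def __init__(self):
        self._items = []

    def access(self, read=False, write_value=None):
        """Apply one instruction's accesses to x1. The read is ordered first."""
        result = None
        if read:
            if not self._items:
                raise CallStackError("read from empty call stack")
            result = self._items.pop()
        if write_value is not None:
            if len(self._items) >= self.MAX_DEPTH:
                raise CallStackError("write to full call stack")
            self._items.append(write_value)
        return result

    def depth(self):
        return len(self._items)
```

In this model, an instruction that both reads and writes x1 succeeds on a full stack, because the pop frees a slot before the push happens.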

Control and Status Registers (CSRs)

Control and Status Registers (CSRs) are 32b wide registers used for “special” purposes, as detailed in their description; they are not related to the GPRs. CSRs can be accessed through dedicated instructions, CSRRS and CSRRW. Writes to read-only (RO) registers are ignored; they do not signal an error. All read-write (RW) CSRs are set to 0 when OTBN starts an operation (when 1 is written to CMD.start).

Number Access Name Description
0x7C0 RW FG0 Wide arithmetic flag group 0. This CSR provides access to flag group 0 used by wide integer arithmetic. FLAGS, FG0 and FG1 provide different views on the same underlying bits.
    Bit Description
    0   Carry of Flag Group 0
    1   MSb of Flag Group 0
    2   LSb of Flag Group 0
    3   Zero of Flag Group 0
0x7C1 RW FG1 Wide arithmetic flag group 1. This CSR provides access to flag group 1 used by wide integer arithmetic. FLAGS, FG0 and FG1 provide different views on the same underlying bits.
    Bit Description
    0   Carry of Flag Group 1
    1   MSb of Flag Group 1
    2   LSb of Flag Group 1
    3   Zero of Flag Group 1
0x7C8 RW FLAGS Wide arithmetic flag groups. This CSR provides access to both flag groups used by wide integer arithmetic. FLAGS, FG0 and FG1 provide different views on the same underlying bits.
    Bit Description
    0   Carry of Flag Group 0
    1   MSb of Flag Group 0
    2   LSb of Flag Group 0
    3   Zero of Flag Group 0
    4   Carry of Flag Group 1
    5   MSb of Flag Group 1
    6   LSb of Flag Group 1
    7   Zero of Flag Group 1
0x7D0 RW MOD0 Bits [31:0] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D1 RW MOD1 Bits [63:32] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D2 RW MOD2 Bits [95:64] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D3 RW MOD3 Bits [127:96] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D4 RW MOD4 Bits [159:128] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D5 RW MOD5 Bits [191:160] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D6 RW MOD6 Bits [223:192] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D7 RW MOD7 Bits [255:224] of the modulus operand, used in the BN.ADDM / BN.SUBM instructions. This CSR is mapped to the MOD WSR.
0x7D8 RW RND_PREFETCH Write to this CSR to begin a request to fill the RND cache. Always reads as 0.
0xFC0 RO RND An AIS31-compliant class PTG.3 random number with guaranteed entropy and forward and backward secrecy. Primarily intended to be used for key generation.

The number is sourced from the EDN via a single-entry cache. Reads when the cache is empty will cause OTBN to be stalled until a new random number is fetched from the EDN.

0xFC1 RO URND A random number without guaranteed secrecy properties or specific statistical properties. Intended for use in masking and blinding schemes. Use RND for high-quality randomness.

The number is sourced from a local PRNG. Reads never stall.
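The mapping between the MOD0 to MOD7 CSRs and the 256b MOD WSR listed in the table can be illustrated with plain integer arithmetic. The following Python sketch models the WSR as an integer; the function names are hypothetical and exist only for this example.

```python
MASK32 = (1 << 32) - 1


def read_mod_csr(mod_wsr, n):
    """Return MODn, i.e. bits [32*n+31 : 32*n] of the 256b MOD value."""
    if not 0 <= n <= 7:
        raise ValueError("MOD CSR index must be in 0..7")
    return (mod_wsr >> (32 * n)) & MASK32


def write_mod_csr(mod_wsr, n, value):
    """Return the MOD value after writing a 32b word to MODn."""
    if not 0 <= n <= 7:
        raise ValueError("MOD CSR index must be in 0..7")
    shift = 32 * n
    cleared = mod_wsr & ~(MASK32 << shift)
    return cleared | ((value & MASK32) << shift)
```

For example, with the modulus 2^255 - 19, read_mod_csr returns 0xFFFFFFED for MOD0 and 0x7FFFFFFF for MOD7.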

Wide Data Registers (WDRs)

In addition to the 32b wide GPRs, OTBN has a second “wide” register file, which is used by the big number instruction subset. This register file consists of NWDR = 32 Wide Data Registers (WDRs). Each WDR is WLEN = 256b wide.

Wide Data Registers (WDRs) and the 32b General Purpose Registers (GPRs) are separate register files. They are only accessible through their respective instruction subset: GPRs are accessible from the base instruction subset, and WDRs are accessible from the big number instruction subset (BN instructions).

Register
w0
w1
...
w31

Wide Special Purpose Registers (WSRs)

OTBN has 256b Wide Special purpose Registers (WSRs). These are analogous to the 32b CSRs, but are used by big number instructions. They can be accessed with the BN.WSRR and BN.WSRW instructions. Writes to read-only (RO) registers are ignored; they do not signal an error. All read-write (RW) WSRs are set to 0 when OTBN starts an operation (when 1 is written to CMD.start).

Number Access Name Description
0x0 RW MOD The modulus used by the BN.ADDM and BN.SUBM instructions. This WSR is also visible as CSRs MOD0 through to MOD7.

0x1 RO RND An AIS31-compliant class PTG.3 random number with guaranteed entropy and forward and backward secrecy. Primarily intended to be used for key generation.

The number is sourced from the EDN via a single-entry cache. Reads when the cache is empty will cause OTBN to be stalled until a new random number is fetched from the EDN.

0x2 RO URND A random number without guaranteed secrecy properties or specific statistical properties. Intended for use in masking and blinding schemes. Use RND for high-quality randomness.

The number is sourced from a local PRNG. Reads never stall.

0x3 RW ACC The accumulator register used by the BN.MULQACC instruction.
0x4 RO KEY_S0_L Bits [255:0] of share 0 of the 384b OTBN sideload key provided by the Key Manager.

A KEY_INVALID software error is raised on read if the Key Manager has not provided a key.

0x5 RO KEY_S0_H Bits [255:128] of this register are always zero. Bits [127:0] contain bits [383:256] of share 0 of the 384b OTBN sideload key provided by the Key Manager.

A KEY_INVALID software error is raised on read if the Key Manager has not provided a valid key.

0x6 RO KEY_S1_L Bits [255:0] of share 1 of the 384b OTBN sideload key provided by the Key Manager.

A KEY_INVALID software error is raised on read if the Key Manager has not provided a valid key.

0x7 RO KEY_S1_H Bits [255:128] of this register are always zero. Bits [127:0] contain bits [383:256] of share 1 of the 384b OTBN sideload key provided by the Key Manager.

A KEY_INVALID software error is raised on read if the Key Manager has not provided a valid key.

Flags

In addition to the wide register file, OTBN maintains global state in two groups of flags for use by wide integer operations. The flag groups are named Flag Group 0 (FG0) and Flag Group 1 (FG1). Each group consists of four flags. Each flag is a single bit.

  • C (Carry flag). Set to 1 if an overflow occurred in the last arithmetic instruction.

  • M (MSb flag). The most significant bit of the result of the last arithmetic or shift instruction.

  • L (LSb flag). The least significant bit of the result of the last arithmetic or shift instruction.

  • Z (Zero flag). Set to 1 if the result of the last operation was zero; otherwise 0.

The M, L, and Z flags are determined based on the result of the operation as it is written back into the result register, without considering the overflow bit.
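As an illustration of the flag definitions, the following Python sketch computes the four flags for an unsigned 256b addition and packs them using the bit positions from the FG0, FG1 and FLAGS CSR descriptions. It assumes that the overflow reported in C is the carry out of bit 255; the function names are hypothetical.

```python
WLEN = 256
MASK = (1 << WLEN) - 1


def flags_after_add(a, b):
    """Flags for an unsigned WLEN-bit addition (assumes C is the carry out)."""
    full = a + b
    result = full & MASK  # value written back, without the overflow bit
    return {
        "C": int(full > MASK),
        "M": (result >> (WLEN - 1)) & 1,
        "L": result & 1,
        "Z": int(result == 0),
    }


def pack_flag_group(f):
    """Bit 0 = C, bit 1 = M, bit 2 = L, bit 3 = Z (FG0/FG1 CSR layout)."""
    return f["C"] | (f["M"] << 1) | (f["L"] << 2) | (f["Z"] << 3)


def pack_flags_csr(fg0, fg1):
    """FLAGS CSR view: FG0 in bits 3:0, FG1 in bits 7:4."""
    return pack_flag_group(fg0) | (pack_flag_group(fg1) << 4)
```

Adding 1 to the all-ones value wraps to zero, so C and Z are set while M and L are clear. This shows that M, L and Z are derived from the written-back result only.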

Loop Stack

OTBN has two instructions for hardware-assisted loops: LOOP and LOOPI. Both use the same state for tracking control flow. This is a stack of tuples containing a loop count, start address and end address. The stack has a maximum depth of eight and the top of the stack is the current loop.

Security Features

Work in progress

Work on OTBN is ongoing, including work on the specification and implementation of its security features. Do not treat the following description (or anything in this documentation) as final, fully implemented, or verified.

OTBN is a security co-processor. It contains various security features and is hardened against side-channel analysis and fault injection attacks. The following sections describe the high-level security features of OTBN. Refer to the Design Details section for a more in-depth description.

Data Integrity Protection

OTBN’s data integrity protection is designed to protect the data stored and processed within OTBN from modifications through physical attacks.

Data in OTBN travels along a data path which includes the data memory (DMEM), the load-store-unit (LSU), the register files (GPR and WDR), and the execution units. Whenever possible, data transmitted or stored within OTBN is protected with an integrity protection code which guarantees the detection of up to three modified bits per 32b word. Additionally, instructions and data stored in the instruction and data memory, respectively, are scrambled with a lightweight, non-cryptographically-secure cipher.

Refer to the Data Integrity Protection section for details of how the data integrity protections are implemented.

Secure Wipe

OTBN provides a mechanism to securely wipe all state it stores, including the instruction memory.

The full secure wipe mechanism is split into three parts:

  • Data memory secure wipe
  • Instruction memory secure wipe
  • Internal state secure wipe

A secure wipe is performed automatically in certain situations, or can be requested manually by the host software. The full secure wipe is automatically initiated as a local reaction to a fatal error. A secure wipe of only the internal state is performed whenever an OTBN operation is complete and after a recoverable error. Finally, host software can manually trigger the data memory and instruction memory secure wipe operations by issuing an appropriate command.

Refer to the Secure Wipe section for implementation details.

Instruction Counter

To detect and mitigate fault injection attacks on OTBN, the host CPU can read the number of executed instructions from INSN_CNT and verify that it matches the expected value.

Key Sideloading

OTBN software can make use of a single 384b wide key provided by the Key Manager, which is made available in two shares. The key is passed through a dedicated connection between the Key Manager and OTBN to avoid exposing it to other components. Software can access the first share of the key through the KEY_S0_L and KEY_S0_H WSRs, and the second share of the key through the KEY_S1_L and KEY_S1_H WSRs.

It is up to host software to configure the Key Manager so that it provides the right key to OTBN at the start of the operation, and to remove the key again once the operation on OTBN has completed. A KEY_INVALID software error is raised if OTBN software accesses any of the KEY_* WSRs when the Key Manager has not presented a key.
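The split of each 384b share across two 256b WSRs can be illustrated with integer arithmetic. The Python sketch below reassembles one share from its KEY_Sx_L and KEY_Sx_H values; the function name is hypothetical, and how the two shares are combined into the key is not covered here.

```python
def assemble_key_share(key_l, key_h):
    """Rebuild one 384b share from KEY_Sx_L (bits 255:0) and KEY_Sx_H.

    KEY_Sx_H carries bits 383:256 of the share in its low 128 bits;
    its upper 128 bits always read as zero.
    """
    if not 0 <= key_l < (1 << 256):
        raise ValueError("KEY_Sx_L must be a 256b value")
    if not 0 <= key_h < (1 << 128):
        raise ValueError("upper 128 bits of KEY_Sx_H must be zero")
    return (key_h << 256) | key_l
```

The result always fits in 384 bits, matching the width of the sideload key.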

Theory of Operations

Block Diagram

OTBN architecture block diagram

Hardware Interfaces

Referring to the Comportable guideline for peripheral device functionality, the module otbn has the following hardware interfaces defined.

Primary Clock: clk_i

Other Clocks: clk_edn_i, clk_otp_i

Bus Device Interfaces (TL-UL): tl

Bus Host Interfaces (TL-UL): none

Peripheral Pins for Chip IO: none

Interrupts:

Interrupt Name Description
done OTBN has completed the operation.

Security Alerts:

Alert Name Description
fatal A fatal error. Fatal alerts are non-recoverable and will be asserted until a hard reset.
recov A recoverable error. Just sent once (as the processor stops).

Hardware Interface Requirements

OTBN connects to other components in an OpenTitan system. This section lists requirements on those interfaces that go beyond the physical connectivity.

Entropy Distribution Network (EDN)

OTBN has two EDN connections: edn_urnd and edn_rnd. What kind of randomness is provided on the EDN connections is configurable at runtime, but unknown to OTBN. To maintain its security properties, OTBN requires the following configuration for the two EDN connections:

  • OTBN has no specific requirements on the randomness drawn from edn_urnd. For performance reasons, requests on this EDN connection should be answered quickly.
  • edn_rnd must provide AIS31-compliant class PTG.3 random numbers. The randomness from this interface is made available through the RND WSR and intended to be used for key generation.

Design Details

Memories

The OTBN processor core has access to two dedicated memories: an instruction memory (IMEM), and a data memory (DMEM). Each memory is 4 kiB in size.

The memory layout follows the Harvard architecture. Both memories are byte-addressed, with addresses starting at 0.

The instruction memory (IMEM) is 32b wide and provides the instruction stream to the OTBN processor. It cannot be read from or written to by user code through load or store instructions.

The data memory (DMEM) is 256b wide and read-write accessible from the base and big number instruction subsets of the OTBN processor core. There are four instructions that can access data memory. In the base instruction subset, there are LW (load word) and SW (store word). These access 32b-aligned 32b words. In the big number instruction subset, there are BN.LID (load indirect) and BN.SID (store indirect). These access 256b-aligned 256b words.

Both memories can be accessed through OTBN’s register interface (DMEM and IMEM). These accesses are ignored if OTBN is busy. A host processor can check whether OTBN is busy by reading the STATUS register. All memory accesses through the register interface must be word-aligned 32b word accesses.

While DMEM is 4 kiB, only the first 2 kiB (at addresses 0x0 to 0x7ff) is visible through the register interface. This is to allow OTBN applications to store sensitive information in the other half, making it harder for that information to leak back to Ibex.
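The alignment and range rules for DMEM accesses can be summarized in a short check. This Python sketch is a model of the rules stated above, not of the hardware; the function and constant names are hypothetical.

```python
DMEM_SIZE = 4096   # 4 kiB, byte-addressed from 0
BUS_WINDOW = 2048  # only 0x0 to 0x7ff is visible on the register interface


def core_access_ok(addr, width):
    """Check a DMEM access from OTBN code.

    width is 4 for LW/SW (32b words) and 32 for BN.LID/BN.SID (256b words).
    """
    if width not in (4, 32):
        raise ValueError("width must be 4 or 32 bytes")
    return addr >= 0 and addr % width == 0 and addr + width <= DMEM_SIZE


def bus_access_ok(addr):
    """Check a word-aligned 32b access to DMEM through the register interface."""
    return addr >= 0 and addr % 4 == 0 and addr + 4 <= BUS_WINDOW
```

For instance, address 0x24 is valid for LW but not for BN.LID, and 0x800 is reachable from OTBN code but not from the bus.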

Each memory write through the register interface updates a checksum. See the Memory Load Integrity section for more details.

Random Numbers

OTBN is connected to the Entropy Distribution Network (EDN) which can provide random numbers via the RND and URND CSRs and WSRs.

RND provides bits taken directly from the EDN connected via edn_rnd. The EDN interface provides 32b of entropy per transaction and is in a different clock domain from the OTBN core. A FIFO is used to synchronize each incoming packet to the OTBN clock domain. Synchronized packets are then assembled, starting from the least significant bits, into a single WLEN value of 256b. Servicing a single EDN request therefore requires a total of 8 transactions on the EDN interface.

As an EDN request can take time, RND is backed by a single-entry cache in the OTBN core, which contains the result of the most recent EDN request. A read from RND empties this cache. A prefetch into the cache, which can be used to hide the EDN latency, is triggered on any write to the RND_PREFETCH CSR. Writes to RND_PREFETCH are ignored while a prefetch is in progress or when the cache is already full. If RND is read while the cache is empty, OTBN stalls until the request provides bits. Both the RND CSR and WSR take their bits from the same cache. A read of the RND CSR returns the bottom 32b and discards the other 224b. When stalling on an RND read, OTBN unstalls on the cycle after it receives WLEN bits of RND data from the EDN.
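The assembly of a 256b RND value from 32b EDN transactions can be sketched as follows. This Python model assumes that the first transaction received fills the lowest 32 bits, which is one reading of the bottom-up description above; the function names are hypothetical.

```python
def assemble_rnd(words):
    """Build a 256b value from eight 32b EDN words.

    ASSUMPTION: the first word received lands in bits 31:0.
    """
    if len(words) != 8:
        raise ValueError("a 256b RND value needs exactly 8 transactions")
    value = 0
    for i, word in enumerate(words):
        value |= (word & 0xFFFFFFFF) << (32 * i)
    return value


def rnd_csr_read(cached_value):
    """A read of the RND CSR sees only the bottom 32b of the cached value."""
    return cached_value & 0xFFFFFFFF
```

A read of the RND WSR would return the whole assembled value, while rnd_csr_read models the narrower CSR view of the same cache entry.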

URND provides bits from a local PRNG within OTBN; reads from it never stall. The URND LFSR is seeded once from the EDN connected via edn_urnd when OTBN starts execution. Each new execution of OTBN will reseed the URND PRNG. The PRNG state is advanced every cycle when OTBN is running.

The PRNG has a long cycle length but has a fixed point: the sequence of numbers will get stuck if the state ever happens to become zero. This will never happen in normal operation. If a fault causes the state to become zero, OTBN raises a BAD_INTERNAL_STATE fatal error.

Operational States

OTBN operational states

OTBN can be in different operational states. OTBN is busy for as long as it is performing an operation. OTBN is locked if a fatal error was observed. Otherwise OTBN is idle.

The current operational state is reflected in the STATUS register.

  • If OTBN is idle, the STATUS register is set to IDLE.
  • If OTBN is busy, the STATUS register is set to one of the values starting with BUSY_.
  • If OTBN is locked, the STATUS register is set to LOCKED.

OTBN transitions into the busy state as a result of host software issuing a command; OTBN is then said to perform an operation. OTBN transitions out of the busy state whenever the operation has completed. In the STATUS register, the different BUSY_* values represent the operation that is currently being performed.

A transition out of the busy state is signaled by the done interrupt (INTR_STATE.done).

The locked state is a terminal state; transitioning out of it requires an OTBN reset.

Operations and Commands

OTBN understands a set of commands to perform certain operations. Commands are issued by writing to the CMD register.

The EXECUTE command starts the execution of the application contained in OTBN’s instruction memory.

The SEC_WIPE_DMEM command securely wipes the data memory.

The SEC_WIPE_IMEM command securely wipes the instruction memory.

Software Execution

Software execution on OTBN is triggered by host software by issuing the EXECUTE command. The software then runs to completion, without the ability for host software to interrupt or inspect the execution.

  • OTBN transitions into the busy state, and reflects this by setting STATUS to BUSY_EXECUTE.
  • The internal randomness source, which provides random numbers to the URND CSR and WSR, is re-seeded from the EDN.
  • The instruction at address zero is fetched and executed.
  • From this point on, all subsequent instructions are executed according to their semantics until either an ECALL instruction is executed, or an error is detected.
  • A secure wipe of internal state is performed.
  • The ERR_BITS register is set to indicate either a successful execution (value 0), or to indicate the error that was observed (a non-zero value).
  • OTBN transitions into the idle state (in case of a successful execution, or a recoverable error) or the locked state (in case of a fatal error). This transition is signaled by raising the done interrupt (INTR_STATE.done), and reflected in the STATUS register.
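The visible effects of the sequence above on STATUS, ERR_BITS and the done interrupt can be sketched as a toy state model. This Python sketch is illustrative only: the class name is hypothetical, the ERR_BITS values passed in are arbitrary placeholders, and the model assumes EXECUTE is only issued from the idle state.

```python
class OtbnStatusModel:
    """Toy model of STATUS and ERR_BITS around an EXECUTE operation."""

    def __init__(self):
        self.status = "IDLE"
        self.err_bits = 0
        self.done_intr = False

    def execute(self, err_bits=0, fatal=False):
        # ASSUMPTION: this model only accepts EXECUTE from the idle state.
        if self.status != "IDLE":
            raise RuntimeError("model only starts an operation from IDLE")
        if fatal and err_bits == 0:
            raise ValueError("a fatal outcome needs a non-zero error value")
        self.status = "BUSY_EXECUTE"
        # URND reseed, instruction execution and the secure wipe of
        # internal state happen here in hardware; they are not modelled.
        self.err_bits = err_bits
        self.status = "LOCKED" if fatal else "IDLE"
        self.done_intr = True
        return self.status
```

A successful run or a recoverable error returns the model to IDLE, so another operation can be started; a fatal error leaves it in LOCKED.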

Errors

OTBN is able to detect a range of errors, which are classified as software errors or fatal errors. A software error is an error in the code that OTBN executes. In the absence of an attacker, these errors are due to a programmer’s mistake. A fatal error is typically the violation of a security property. All errors and their classification are listed in the List of Errors.

Whenever an error is detected, OTBN reacts locally, and informs the OpenTitan system about it by raising an alert. OTBN generally does not try to recover from errors itself, and provides no error handling support to code that runs on it.

OTBN gives host software the option to recover from some errors by restarting the operation. All software errors are treated as recoverable, unless CTRL.software_errs_fatal is set, and are handled as described in the section Reaction to Recoverable Errors. When CTRL.software_errs_fatal is set, software errors become fatal errors.

Fatal errors are treated as described in the section Reaction to Fatal Errors.

Reaction to Recoverable Errors

Recoverable errors can be the result of a programming error in OTBN software. Recoverable errors can only occur during the execution of software on OTBN, and not in other situations in which OTBN might be busy.

The following actions are taken when OTBN detects a recoverable error:

  1. The currently running operation is terminated, similarly to how an operation ends after an ECALL instruction is executed.
  2. A recoverable alert is raised.

The host software can start another operation on OTBN after a recoverable error was detected.

Reaction to Fatal Errors

Fatal errors are generally seen as a sign of an intrusion, resulting in more drastic measures to protect the secrets stored within OTBN. Fatal errors can occur at any time, even when an OTBN operation isn’t in progress.

The following actions are taken when OTBN detects a fatal error:

  1. A secure wipe of the data memory and a secure wipe of the instruction memory are initiated.
  2. If OTBN is not idle, then the currently running operation is terminated, similarly to how an operation ends after an ECALL instruction is executed.
  3. The STATUS register is set to LOCKED.
  4. A fatal alert is raised.

Note that OTBN can detect some errors even when it isn’t running. One example of this is an error caused by an integrity error when reading or writing OTBN’s memories over the bus. In this case, the ERR_BITS register will not change. This avoids race conditions with the host processor’s error handling software. However, every error that OTBN detects when it isn’t running is fatal. This means that the cause will be reflected in FATAL_ALERT_CAUSE, as described below in Alerts. This way, no alert is generated without setting an error code somewhere.

List of Errors

Name Class Description
BAD_DATA_ADDR software A data memory access was out of bounds or unaligned.
BAD_INSN_ADDR software An instruction memory access was out of bounds or unaligned.
CALL_STACK software An instruction tried to pop from an empty call stack or push to a full call stack.
ILLEGAL_INSN software An illegal instruction was about to be executed.
LOOP software A loop stack-related error was detected.
KEY_INVALID software An attempt to read a KEY_* WSR was detected, but no key was provided by the Key Manager.
IMEM_INTG_VIOLATION fatal Data read from the instruction memory failed the integrity checks.
DMEM_INTG_VIOLATION fatal Data read from the data memory failed the integrity checks.
REG_INTG_VIOLATION fatal Data read from a GPR or WDR failed the integrity checks.
BUS_INTG_VIOLATION fatal An incoming bus transaction failed the integrity checks.
BAD_INTERNAL_STATE fatal The internal state of OTBN has become corrupt.
ILLEGAL_BUS_ACCESS fatal A bus-accessible register or memory was accessed when not allowed.
LIFECYCLE_ESCALATION fatal A life cycle escalation request was received.
FATAL_SOFTWARE fatal A software error was seen and CTRL.software_errs_fatal was set.
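The classification in the table, together with the effect of CTRL.software_errs_fatal, can be expressed as a lookup. The Python sketch below is a reading aid built from the table and the Errors section; the function name and the returned tuple format are hypothetical.

```python
ERROR_CLASS = {
    "BAD_DATA_ADDR": "software",
    "BAD_INSN_ADDR": "software",
    "CALL_STACK": "software",
    "ILLEGAL_INSN": "software",
    "LOOP": "software",
    "KEY_INVALID": "software",
    "IMEM_INTG_VIOLATION": "fatal",
    "DMEM_INTG_VIOLATION": "fatal",
    "REG_INTG_VIOLATION": "fatal",
    "BUS_INTG_VIOLATION": "fatal",
    "BAD_INTERNAL_STATE": "fatal",
    "ILLEGAL_BUS_ACCESS": "fatal",
    "LIFECYCLE_ESCALATION": "fatal",
    "FATAL_SOFTWARE": "fatal",
}


def reaction(error_name, software_errs_fatal=False):
    """Return (alert, resulting STATUS) for a detected error."""
    error_class = ERROR_CLASS[error_name]
    if error_class == "software" and not software_errs_fatal:
        return ("recov", "IDLE")
    return ("fatal", "LOCKED")
```

With software_errs_fatal set, a software error such as CALL_STACK takes the fatal path, which corresponds to the FATAL_SOFTWARE entry in the table.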

Alerts

An alert is a reaction to an error that OTBN detected. OTBN has two alerts, one recoverable and one fatal.

A recoverable alert is a one-time triggered alert caused by recoverable errors. The error that caused the alert can be determined by reading the ERR_BITS register.

A fatal alert is a continuously triggered alert caused by fatal errors. The error that caused the alert can be determined by reading the FATAL_ALERT_CAUSE register. If OTBN was running, this value will also be reflected in the ERR_BITS register. A fatal alert can only be cleared by resetting OTBN through the rst_ni line.

Reaction to Life Cycle Escalation Requests

OTBN receives and reacts to escalation signals from the life cycle controller. An incoming life cycle escalation is a fatal error of type LIFECYCLE_ESCALATION and is treated as described in the section Reaction to Fatal Errors.

Idle

OTBN exposes a single-bit idle_o signal, intended to be used by the clock manager to clock-gate the block when it is not in use. This signal is in the same clock domain as clk_i. The idle_o signal is high when OTBN is idle, and low otherwise.

OTBN also exposes another version of the idle signal as idle_otp_o. This works analogously, but is in the same clock domain as clk_otp_i.

TODO: Specify interactions between idle_o, idle_otp_o and the clock manager fully.

Data Integrity Protection

OTBN stores and operates on data (state) in its dedicated memories, register files, and internal registers. OTBN’s data integrity protection is designed to protect all data stored and transmitted within OTBN from modifications through physical attacks.

During transmission, the integrity of data is protected with an integrity protection code. Data at rest in the instruction and data memories is additionally scrambled.

In the following, the Integrity Protection Code and the scrambling algorithm are discussed, followed by their application to individual storage elements.

Integrity Protection Code

OTBN uses the same integrity protection code everywhere to provide overarching data protection without regular re-encoding. The code is applied to 32b data words, and produces 39b of encoded data.

The code used is a (39,32) Hsiao “single error correction, double error detection” (SECDED) error correction code (ECC) [CHEN08]. It has a minimum Hamming distance of four, so any corruption of up to three bits in an encoded word is detected. The code is used for error detection only; no error correction is performed.

Memory Scrambling

Contents of OTBN’s instruction and data memories are scrambled while at rest. The data is bound to the address and scrambled before being stored in memory. The addresses are randomly remapped.

Note that data stored in other temporary memories within OTBN, including the register files, is not scrambled.

Scrambling is used to obfuscate the memory contents and to diffuse the data. Obfuscation makes passive probing more difficult, while diffusion makes active fault injection attacks more difficult.

The scrambling mechanism is described in detail in the section “Scrambling Primitive” of the SRAM Controller Technical Specification.

When OTBN comes out of reset, its memories have default scrambling keys. The host processor can request new keys for each memory by issuing a secure wipe of DMEM and a secure wipe of IMEM.

Actions on Integrity Errors

A fatal error is raised whenever a data integrity violation is detected, which results in an immediate stop of all processing and the issuing of a fatal alert. The section Error Handling and Reporting describes the error handling in more detail.

Register File Integrity Protection

OTBN contains two register files: the 32b GPRs and the 256b WDRs. The data stored in both register files is protected with the Integrity Protection Code. Neither the register file contents nor register addresses are scrambled.

The GPRs x2 to x31 store a 32b data word together with the Integrity Protection Code, resulting in 39b of stored data. (x0, the zero register, and x1, the call stack, require special treatment.)

Each 256b Wide Data Register (WDR) stores a 256b data word together with the Integrity Protection Code, resulting in 312b of stored data. The integrity protection is done separately for each of the eight 32b sub-words within a 256b word.
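The stored widths follow directly from applying the 39b code to each 32b sub-word. The Python sketch below splits a 256b value into sub-words and computes the storage size; it does not implement the Hsiao code itself, the names are hypothetical, and the sub-word ordering (index 0 as the least significant word) is an assumption.

```python
DATA_BITS = 32  # width of one protected data word
CODE_BITS = 39  # width of one encoded word
WLEN = 256


def split_subwords(wdr_value):
    """Split a 256b value into eight 32b sub-words (index 0 = bits 31:0)."""
    count = WLEN // DATA_BITS
    return [(wdr_value >> (DATA_BITS * i)) & 0xFFFFFFFF for i in range(count)]


def stored_bits(data_width):
    """Bits stored for a value of data_width bits, encoded per 32b word."""
    return (data_width // DATA_BITS) * CODE_BITS
```

Here stored_bits(32) gives 39 for a GPR and stored_bits(256) gives 312 for a WDR, matching the figures in the text.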

The register files can consume data protected with the Integrity Protection Code, or add it on demand. Whenever possible the Integrity Protection Code is preserved from its source and written directly to the register files without recalculation, in particular in the following cases:

  • Data coming from the data memory (DMEM) through the load-store unit to a GPR or WDR.
  • Data copied between WDRs using the BN.MOV or BN.MOVR instructions.
  • Data conditionally copied between WDRs using the BN.SEL instruction.
  • Data copied between the ACC and MOD WSRs and a WDR. (TODO: Not yet implemented.)
  • Data copied between any of the MOD0 to MOD7 CSRs and a GPR. (TODO: Not yet implemented.)

In all other cases the register files add the Integrity Protection Code to the incoming data before storing the data word.

The integrity protection bits are checked on every read from the register files, even if the integrity protection is not removed from the data.

Detected integrity violations in a register file raise a fatal reg_error.

Data Memory (DMEM) Integrity Protection

OTBN’s data memory is 256b wide, but allows for 32b word accesses. To facilitate such accesses, all integrity protection in the data memory is done on a 32b word granularity.

All data entering or leaving the data memory block is protected with the Integrity Protection Code; this code is not re-computed within the memory block.

Before being stored in SRAM, the data word with the attached Integrity Protection Code, as well as the address are scrambled according to the memory scrambling algorithm. The scrambling is reversed on a read.

The ephemeral memory scrambling key and the nonce are provided by the OTP block. They are set once when the OTBN block is reset, and changed whenever a secure wipe of the data memory is performed.

The Integrity Protection Code is checked on every memory read, even though the code remains attached to the data. A further check must be performed when the data is consumed. Detected integrity violations in the data memory raise a fatal dmem_error.

Instruction Memory (IMEM) Integrity Protection

All data entering or leaving the instruction memory block is protected with the Integrity Protection Code; this code is not re-computed within the memory block.

Before being stored in SRAM, the instruction word with the attached Integrity Protection Code, as well as the address, are scrambled according to the memory scrambling algorithm. The scrambling is reversed on a read.

The ephemeral memory scrambling key and the nonce are provided by the OTP block. They are set once when the OTBN block is reset, and changed whenever a secure wipe of the instruction memory is performed.

The Integrity Protection Code is checked on every memory read, even though the code remains attached to the data. A further check must be performed when the data is consumed. Detected integrity violations in the instruction memory raise a fatal imem_error.

Memory Load Integrity

As well as the integrity protection discussed above for the memories and bus interface, OTBN has a second layer of integrity checking to allow a host processor to ensure that a program has been loaded correctly. This is visible through the LOAD_CHECKSUM register. The register exposes a cumulative CRC checksum which is updated on every write to either memory.

This is intended as a light-weight way to implement a more efficient “write and read back” check. It isn’t a cryptographically secure MAC, so cannot spot an attacker who can completely control the bus. However, in this case the attacker would be equally able to control responses from OTBN, so any such check could be subverted.

The CRC used is the 32-bit CRC-32-IEEE checksum. This standard choice of generating polynomial makes it compatible with other tooling, such as the POSIX cksum utility [POSIX18]. The stream over which the checksum is computed is the stream of writes that have been seen since the last write to LOAD_CHECKSUM. Each write is treated as a 48b value, {imem, idx, wdata}. Here, imem is a single bit flag which is one for writes to IMEM and zero for writes to DMEM. The idx value is the index of the word within the memory, zero extended from 10b to 15b. Finally, wdata is the 32b word that was written.

The host processor can also write to the register. Typically, this will be to reset the value to 32'hffffffff, the traditional starting value for a 32-bit CRC.

To use this functionality, the host processor should set LOAD_CHECKSUM to a known value (traditionally, 32'hffffffff). Next, it should write the program to be loaded to OTBN’s IMEM and DMEM over the bus. Finally, it should read back the value of LOAD_CHECKSUM and compare it with an expected value.
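The host-side computation of the expected value can be sketched as follows. This is only an illustration under stated assumptions: the 48b value is fed to the CRC least significant byte first, and `running` is the running value as used by Python's `zlib.crc32`, which may not be identical to the raw value held in LOAD_CHECKSUM. Neither point is specified in this section, so both should be checked against the hardware. The function names and the example words are hypothetical.

```python
import zlib


def pack_write(is_imem: bool, idx: int, wdata: int) -> int:
    """Build the 48b value {imem, idx, wdata} for one memory write."""
    assert 0 <= idx < (1 << 10)      # word index is 10b before zero extension
    assert 0 <= wdata < (1 << 32)
    # Bit 47: imem flag, bits 46:32: zero-extended index, bits 31:0: data.
    return (int(is_imem) << 47) | (idx << 32) | wdata


def fold_writes(writes, running=0):
    """Fold (is_imem, idx, wdata) tuples into a CRC-32-IEEE value.

    Assumption: each 48b value is serialised as 6 bytes, little-endian.
    """
    for is_imem, idx, wdata in writes:
        data = pack_write(is_imem, idx, wdata).to_bytes(6, "little")
        running = zlib.crc32(data, running)
    return running


# Arbitrary example words: two IMEM writes followed by one DMEM write.
program = [(True, 0, 0x00000013), (True, 1, 0x00000073), (False, 0, 0xDEADBEEF)]
expected = fold_writes(program)
print(f"expected checksum: {expected:#010x}")
```

Because the checksum is cumulative, the writes must be folded in the same order in which they are issued on the bus.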

Secure Wipe

Applications running on OTBN may store sensitive data in the internal registers or the memory. In order to prevent an untrusted application from reading any leftover data, OTBN provides the secure wipe operation. This operation can be applied to:

  • The data memory
  • The instruction memory
  • The internal state

The three forms of secure wipe can be triggered in different ways.

A secure wipe of either the instruction or the data memory can be triggered from host software by issuing a SEC_WIPE_DMEM or SEC_WIPE_IMEM command.

A secure wipe of instruction memory, data memory, and all internal state is performed automatically when handling a fatal error.

A secure wipe of the internal state only is triggered automatically when OTBN ends the software execution, either successfully, or unsuccessfully due to a recoverable error.

Data Memory (DMEM) Secure Wipe

The wiping is performed by securely replacing the memory scrambling key, making all data stored in the memory unusable. The key replacement is a two-step process:

  • Overwrite the 128b key of the memory scrambling primitive with randomness from URND. This action takes a single cycle.
  • Request new scrambling parameters from OTP. The request takes multiple cycles to complete.

Host software can initiate a data memory secure wipe by issuing the SEC_WIPE_DMEM command.

Instruction Memory (IMEM) Secure Wipe

The wiping is performed by securely replacing the memory scrambling key, making all instructions stored in the memory unusable. The key replacement is a two-step process:

  • Overwrite the 128b key of the memory scrambling primitive with randomness from URND. This action takes a single cycle.
  • Request new scrambling parameters from OTP. The request takes multiple cycles to complete.

Host software can initiate an instruction memory secure wipe by issuing the SEC_WIPE_IMEM command.

Internal State Secure Wipe

OTBN provides a mechanism to securely wipe all internal state, excluding the instruction and data memories.

The following state is wiped:

  • Register files: GPRs and WDRs
  • The accumulator register (also accessible through the ACC WSR)
  • Flags (accessible through the FG0, FG1, and FLAGS CSRs)
  • The modulus (accessible through the MOD0 to MOD7 CSRs and the MOD WSR)

The wiping procedure is a two-step process:

  • Overwrite the state with randomness from URND.
  • Overwrite the state with zeros.

Loop and call stack pointers are reset.

Host software cannot explicitly trigger an internal secure wipe; it is performed automatically at the end of an EXECUTE operation.

Running applications on OTBN

OTBN is a specialized coprocessor which is used from the host CPU. This section describes how to interact with OTBN from the host CPU to execute an existing OTBN application. The section Writing OTBN applications describes how to write such applications.

High-level operation sequence

The high-level sequence by which the host processor should use OTBN is as follows.

  1. Optional: Initialise LOAD_CHECKSUM.
  2. Write the OTBN application binary to IMEM, starting at address 0.
  3. Optional: Write constants and input arguments, as mandated by the calling convention of the loaded application, to the half of DMEM accessible through the DMEM window.
  4. Optional: Read back LOAD_CHECKSUM and perform an integrity check.
  5. Start the operation on OTBN by issuing the EXECUTE command. Now neither data nor instruction memory may be accessed from the host CPU. After it has been started the OTBN application runs to completion without further interaction with the host.
  6. Wait for the operation to complete (see below). As soon as the OTBN operation has completed the data and instruction memories can be accessed again from the host CPU.
  7. Check if the operation was successful by reading the ERR_BITS register.
  8. Optional: Retrieve results by reading DMEM, as mandated by the calling convention of the loaded application.

OTBN applications are run to completion. The host CPU can determine if an application has completed by either polling STATUS or listening for an interrupt.

  • To poll for a completed operation, software should repeatedly read the STATUS register. The operation is complete if STATUS is IDLE or LOCKED; otherwise the operation is still in progress. When STATUS has become LOCKED, a fatal error has occurred and OTBN must be reset before it can perform further operations.
  • Alternatively, software can listen for the done interrupt to determine if the operation has completed. The standard sequence of working with interrupts has to be followed, i.e. the interrupt has to be enabled, an interrupt service routine has to be registered, etc. The DIF contains helpers to do so conveniently.
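The polling variant of the sequence above can be sketched as follows. The register offsets and the command and status encodings are taken from the register table; everything else is hypothetical. In particular, `FakeOtbn`, `read32`, `write32` and `run_app` are illustrative names, and the fake device only imitates the STATUS transitions so that the sketch can run without hardware. The optional LOAD_CHECKSUM steps (1 and 4) are omitted.

```python
CMD, STATUS, ERR_BITS = 0x10, 0x18, 0x1C
IMEM_BASE, DMEM_BASE = 0x4000, 0x8000
CMD_EXECUTE = 0xD8
STATUS_IDLE, STATUS_BUSY_EXECUTE, STATUS_LOCKED = 0x00, 0x01, 0xFF


class FakeOtbn:
    """Hypothetical stand-in for the bus interface; not a hardware model."""

    def __init__(self):
        self.regs = {STATUS: STATUS_IDLE, ERR_BITS: 0}
        self.busy_polls = 0

    def write32(self, offset, value):
        if offset == CMD:
            if value == CMD_EXECUTE and self.regs[STATUS] == STATUS_IDLE:
                self.regs[STATUS] = STATUS_BUSY_EXECUTE
                self.busy_polls = 3      # pretend the program takes three polls
        else:
            self.regs[offset] = value

    def read32(self, offset):
        if offset == STATUS and self.regs[STATUS] == STATUS_BUSY_EXECUTE:
            self.busy_polls -= 1
            if self.busy_polls == 0:
                self.regs[STATUS] = STATUS_IDLE
        return self.regs.get(offset, 0)


def run_app(dev, imem_words, dmem_words, n_results):
    for i, word in enumerate(imem_words):        # step 2: load the program
        dev.write32(IMEM_BASE + 4 * i, word)
    for i, word in enumerate(dmem_words):        # step 3: load the inputs
        dev.write32(DMEM_BASE + 4 * i, word)
    dev.write32(CMD, CMD_EXECUTE)                # step 5: start execution
    while True:                                  # step 6: poll for completion
        status = dev.read32(STATUS)
        if status in (STATUS_IDLE, STATUS_LOCKED):
            break
    if status == STATUS_LOCKED:
        raise RuntimeError("fatal error: OTBN is locked and must be reset")
    err_bits = dev.read32(ERR_BITS)              # step 7: check for errors
    if err_bits != 0:
        raise RuntimeError(f"OTBN reported errors: ERR_BITS={err_bits:#x}")
    # step 8: read back the results
    return [dev.read32(DMEM_BASE + 4 * i) for i in range(n_results)]


dev = FakeOtbn()
results = run_app(dev, imem_words=[0x11, 0x22], dmem_words=[5, 7], n_results=2)
print(results)
```

Real host code would normally add a timeout to the polling loop; it is left out here to keep the sketch short.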

Note: This operation sequence only covers functional aspects. Depending on the application additional steps might be necessary, such as deleting secrets from the memories.

Device Interface Functions (DIFs)

To use this DIF, include the following C header:

#include "sw/device/lib/dif/dif_otbn.h"

This header provides the following device interface functions:

Driver

A higher-level driver for the OTBN block is available at sw/device/lib/runtime/otbn.h (API documentation).

Another driver for OTBN is part of the silicon creator code at sw/device/silicon_creator/lib/drivers/otbn.h.

Register Table

otbn.INTR_STATE @ 0x0

Interrupt State Register

Reset default = 0x0, mask 0x1
Bits  Type  Reset  Name  Description
0     rw1c  0x0    done  OTBN has completed the operation.


otbn.INTR_ENABLE @ 0x4

Interrupt Enable Register

Reset default = 0x0, mask 0x1
Bits  Type  Reset  Name  Description
0     rw    0x0    done  Enable interrupt when INTR_STATE.done is set.


otbn.INTR_TEST @ 0x8

Interrupt Test Register

Reset default = 0x0, mask 0x1
Bits  Type  Reset  Name  Description
0     wo    0x0    done  Write 1 to force INTR_STATE.done to 1.


otbn.ALERT_TEST @ 0xc

Alert Test Register

Reset default = 0x0, mask 0x3
Bits  Type  Reset  Name   Description
0     wo    0x0    fatal  Write 1 to trigger one alert event of this kind.
1     wo    0x0    recov  Write 1 to trigger one alert event of this kind.


otbn.CMD @ 0x10

Command Register

Reset default = 0x0, mask 0xff

A command initiates an OTBN operation. While performing the operation, OTBN is busy; the STATUS register reflects that.

All operations signal their completion by raising the done interrupt; alternatively, software may poll the STATUS register.

Writes are ignored if OTBN is not idle. Unrecognized commands are ignored.

Bits  Type  Reset  Name  Description
7:0   wo    0x0    cmd   The operation to perform.

Value Name Description
0xd8 EXECUTE Starts the execution of the program stored in the instruction memory, starting at address zero.
0xc3 SEC_WIPE_DMEM Securely removes all contents from the data memory.
0x1e SEC_WIPE_IMEM Securely removes all contents from the instruction memory.


otbn.CTRL @ 0x14

Control Register

Reset default = 0x0, mask 0x1
Bits  Type  Reset  Name                 Description
0     rw    0x0    software_errs_fatal  Controls the reaction to software errors. When set, software errors produce fatal errors rather than recoverable errors. Writes are ignored if OTBN is not idle.


otbn.STATUS @ 0x18

Status Register

Reset default = 0x0, mask 0xff
Bits  Type  Reset  Name    Description
7:0   ro    0x0    status  Indicates the current operational state OTBN is in. All BUSY values represent an operation started by a write to the CMD register.

Value Name Description
0x00 IDLE OTBN is idle: it is not performing any action.
0x01 BUSY_EXECUTE OTBN is busy executing software.
0x02 BUSY_SEC_WIPE_DMEM OTBN is busy securely wiping the data memory.
0x03 BUSY_SEC_WIPE_IMEM OTBN is busy securely wiping the instruction memory.
0xFF LOCKED OTBN is locked as reaction to a fatal error, and must be reset to unlock it again. See also the section "Reaction to Fatal Errors".


otbn.ERR_BITS @ 0x1c

Operation Result Register

Reset default = 0x0, mask 0xff003f

Describes the errors detected during an operation.

Refer to the "List of Errors" section for a detailed description of the errors.

Bits  Type  Reset  Name                  Description
0     ro    0x0    bad_data_addr         A BAD_DATA_ADDR error was observed.
1     ro    0x0    bad_insn_addr         A BAD_INSN_ADDR error was observed.
2     ro    0x0    call_stack            A CALL_STACK error was observed.
3     ro    0x0    illegal_insn          An ILLEGAL_INSN error was observed.
4     ro    0x0    loop                  A LOOP error was observed.
5     ro    0x0    key_invalid           A KEY_INVALID error was observed.
15:6                                     Reserved
16    ro    0x0    imem_intg_violation   An IMEM_INTG_VIOLATION error was observed.
17    ro    0x0    dmem_intg_violation   A DMEM_INTG_VIOLATION error was observed.
18    ro    0x0    reg_intg_violation    A REG_INTG_VIOLATION error was observed.
19    ro    0x0    bus_intg_violation    A BUS_INTG_VIOLATION error was observed.
20    ro    0x0    bad_internal_state    A BAD_INTERNAL_STATE error was observed.
21    ro    0x0    illegal_bus_access    An ILLEGAL_BUS_ACCESS error was observed.
22    ro    0x0    lifecycle_escalation  A LIFECYCLE_ESCALATION error was observed.
23    ro    0x0    fatal_software        A FATAL_SOFTWARE error was observed.
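Host software typically turns an ERR_BITS value into a list of error names. The following sketch shows one way to do this using the bit positions from the table above; the names `ERR_BITS_FIELDS`, `ERR_BITS_MASK` and `decode_err_bits` are hypothetical and not part of any OpenTitan API.

```python
ERR_BITS_MASK = 0xFF003F    # mask given for the ERR_BITS register

ERR_BITS_FIELDS = {
    0: "bad_data_addr",
    1: "bad_insn_addr",
    2: "call_stack",
    3: "illegal_insn",
    4: "loop",
    5: "key_invalid",
    16: "imem_intg_violation",
    17: "dmem_intg_violation",
    18: "reg_intg_violation",
    19: "bus_intg_violation",
    20: "bad_internal_state",
    21: "illegal_bus_access",
    22: "lifecycle_escalation",
    23: "fatal_software",
}


def decode_err_bits(value: int) -> list[str]:
    """Return the names of all error bits set in an ERR_BITS value."""
    if value & ~ERR_BITS_MASK:
        raise ValueError(f"reserved bits set in ERR_BITS value {value:#x}")
    return [name for bit, name in sorted(ERR_BITS_FIELDS.items())
            if (value >> bit) & 1]


print(decode_err_bits(0x9))
```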


otbn.FATAL_ALERT_CAUSE @ 0x20

Fatal Alert Cause Register

Reset default = 0x0, mask 0xff

Describes any errors that led to a fatal alert. A fatal error puts OTBN in locked state; the value of this register does not change until OTBN is reset.

Refer to the "List of Errors" section for a detailed description of the errors.

Bits  Type  Reset  Name                  Description
0     ro    0x0    imem_intg_violation   An IMEM_INTG_VIOLATION error was observed.
1     ro    0x0    dmem_intg_violation   A DMEM_INTG_VIOLATION error was observed.
2     ro    0x0    reg_intg_violation    A REG_INTG_VIOLATION error was observed.
3     ro    0x0    bus_intg_violation    A BUS_INTG_VIOLATION error was observed.
4     ro    0x0    bad_internal_state    A BAD_INTERNAL_STATE error was observed.
5     ro    0x0    illegal_bus_access    An ILLEGAL_BUS_ACCESS error was observed.
6     ro    0x0    lifecycle_escalation  A LIFECYCLE_ESCALATION error was observed.
7     ro    0x0    fatal_software        A FATAL_SOFTWARE error was observed.


otbn.INSN_CNT @ 0x24

Instruction Count Register

Reset default = 0x0, mask 0xffffffff

Returns the number of instructions executed in the current or last operation. The counter saturates at 2^32-1 and is reset to 0 at the start of a new operation.

Only the EXECUTE operation counts instructions; for all other operations this register remains at 0. Instructions triggering an error do not count towards the total.

Always reads as 0 if OTBN is locked.

Bits  Type  Reset  Name      Description
31:0  ro    0x0    insn_cnt  The number of executed instructions.


otbn.LOAD_CHECKSUM @ 0x28

A 32-bit CRC checksum of data written to memory

Reset default = 0x0, mask 0xffffffff

See the "Memory Load Integrity" section of the manual for full details.

Bits  Type  Reset  Name      Description
31:0  rw    0x0    checksum  Checksum accumulator


otbn.IMEM @ + 0x4000
1024 item rw window
Byte writes are not supported
32b words (bits 31:0) at offsets +0x4000, +0x4004, ..., +0x4ff8, +0x4ffc

Instruction Memory Access

The instruction memory may only be accessed through this window while OTBN is idle.

If OTBN is busy or locked, read accesses return 0 and write accesses are ignored. If OTBN is busy, any access additionally triggers an ILLEGAL_BUS_ACCESS fatal error.


otbn.DMEM @ + 0x8000
512 item rw window
Byte writes are not supported
32b words (bits 31:0) at offsets +0x8000, +0x8004, ..., +0x87f8, +0x87fc

Data Memory Access

The data memory may only be accessed through this window while OTBN is idle.

If OTBN is busy or locked, read accesses return 0 and write accesses are ignored. If OTBN is busy, any access additionally triggers an ILLEGAL_BUS_ACCESS fatal error.

Note that DMEM is actually 4kiB in size, but only the first 2kiB of the memory is visible through this register interface.
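The byte offset of a word inside either window follows directly from the window base and the 32b word size. The sketch below uses the window bases and item counts listed above; the helper name `window_offset` is hypothetical.

```python
IMEM_BASE, IMEM_WORDS = 0x4000, 1024   # instruction memory window
DMEM_BASE, DMEM_WORDS = 0x8000, 512    # visible part of the data memory


def window_offset(base: int, n_words: int, idx: int) -> int:
    """Return the register-interface byte offset of 32b word `idx`."""
    if not 0 <= idx < n_words:
        raise IndexError(f"word index {idx} is outside the window")
    return base + 4 * idx


print(hex(window_offset(IMEM_BASE, IMEM_WORDS, IMEM_WORDS - 1)))
print(hex(window_offset(DMEM_BASE, DMEM_WORDS, DMEM_WORDS - 1)))
```

The last valid offsets computed this way match the ends of the two windows, +0x4ffc and +0x87fc.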


Writing OTBN applications

OTBN applications are (small) pieces of software written in OTBN assembly. The full instruction set is described in the ISA manual, and example software is available in the sw/otbn directory of the OpenTitan source tree.

A hands-on user guide to develop OTBN software can be found in the section Writing and building software for OTBN.

Toolchain support

OTBN comes with a toolchain consisting of an assembler, a linker, and helper tools such as objdump. The toolchain wraps an RV32 GCC toolchain and supports many of its features.

The following tools are available:

  • otbn-as: The OTBN assembler.
  • otbn-ld: The OTBN linker.
  • otbn-objdump: objdump for OTBN.

Other tools from the RV32 toolchain can be used directly, such as objcopy.

Passing of data between the host CPU and OTBN

Passing data between the host CPU and OTBN is done through the first 2kiB of data memory (DMEM). No standard or required calling convention exists; every application is free to pass data in and out of OTBN in whatever format it finds convenient. All data passing must be done when OTBN is idle; otherwise both the instruction and the data memory are inaccessible from the host CPU.

Returning from an application

The software running on OTBN signals completion by executing the ECALL instruction.

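Once OTBN has executed the ECALL instruction, the following things happen:

  • No further instructions are executed.
  • A secure wipe of the internal state is performed.
  • The operation is marked as complete: the done interrupt is raised and STATUS returns to IDLE.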

The first 2kiB of DMEM can be used to pass data back to the host processor, e.g. a “return value” or an “exit code”. Refer to the section Passing of data between the host CPU and OTBN for more information.

Using hardware loops

OTBN provides two hardware loop instructions: LOOP and LOOPI.

Loop nesting

OTBN permits loop nesting and branches and jumps inside loops. However, it doesn’t have support for early termination of loops: there’s no way to pop an entry from the loop stack without executing the last instruction of the loop the correct number of times. It can also only pop one level of the loop stack per instruction.

To avoid polluting the loop stack and avoid surprising behaviour, the programmer must ensure that:

  • Even if there are branches and jumps within a loop body, the final instruction of the loop body gets executed exactly once per iteration.
  • Nested loops have distinct end addresses.
  • The end instruction of an outer loop is not executed before an inner loop finishes.

OTBN does not detect these conditions being violated, so no error will be signaled should they occur.

(Note indentation in the code examples is for clarity and has no functional impact.)

The following loops are well nested:

LOOP x2, 3
  LOOP x3, 1
    ADDI x4, x4, 1
  # The NOP ensures that the outer and inner loops end on different instructions
  NOP

# Both inner and outer loops call some_fn, which returns to
# the body of the loop
LOOP x2, 5
  JAL x1, some_fn
  LOOP x3, 2
    JAL x1, some_fn
    ADDI x4, x4, 1
  NOP

# Control flow leaves the immediate body of the outer loop but eventually
# returns to it
LOOP x2, 4
  BEQ x4, x5, some_label
branch_back:
  LOOP x3, 1
    ADDI x6, x6, 1
  NOP

some_label:
  ...
  JAL x0, branch_back

The following loops are not well nested:

# Both loops end on the same instruction
LOOP x2, 2
  LOOP x3, 1
    ADDI x4, x4, 1

# Inner loop jumps into outer loop body (executing the outer loop end
# instruction before the inner loop has finished)
LOOP x2, 5
  LOOP x3, 3
    ADDI x4, x4, 1
    BEQ  x4, x5, outer_body
    ADD  x6, x7, x8
outer_body:
  ADDI  x9, x9, -1
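Of the three rules above, only the requirement for distinct end addresses can be checked without following the control flow. The sketch below does that for loops given as (address of the LOOP instruction, body size) pairs, counted in instructions. It assumes that a loop body starts at the instruction after LOOP and spans the given number of instructions, which is how the examples above are laid out. The function names are hypothetical.

```python
from itertools import permutations


def loop_end(loop_addr: int, body_size: int) -> int:
    """Index of the last instruction of the loop body."""
    return loop_addr + body_size


def shared_end_pairs(loops):
    """Return (outer, inner) pairs of nested loops that share an end address."""
    bad = []
    for outer, inner in permutations(loops, 2):
        outer_end, inner_end = loop_end(*outer), loop_end(*inner)
        # The inner LOOP instruction must lie inside the outer loop body.
        nested = outer[0] < inner[0] <= outer_end
        if nested and inner_end == outer_end:
            bad.append((outer, inner))
    return bad


# First well-nested example: LOOP x2, 3 at index 0 and LOOP x3, 1 at index 1.
well_nested = [(0, 3), (1, 1)]
# First badly nested example: LOOP x2, 2 at index 0 and LOOP x3, 1 at index 1.
not_well_nested = [(0, 2), (1, 1)]

print(shared_end_pairs(well_nested))
print(shared_end_pairs(not_well_nested))
```

The other two rules depend on the path taken at run time, so a static check of this kind cannot replace careful review of branches and jumps inside loop bodies.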

Algorithmic Examples: Multiplication with BN.MULQACC

The big number instruction subset of OTBN generally operates on WLEN bit numbers. BN.MULQACC operates with WLEN/4 bit operands (with a full WLEN accumulator). This section outlines two techniques to perform larger multiplies by composing multiple BN.MULQACC instructions.

Multiplying two WLEN/2 numbers with BN.MULQACC

This instruction sequence multiplies the lower half of w0 by the upper half of w0, placing the result in w1.

BN.MULQACC.Z      w0.0, w0.2, 0
BN.MULQACC        w0.0, w0.3, 64
BN.MULQACC        w0.1, w0.2, 64
BN.MULQACC.WO w1, w0.1, w0.3, 128
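The decomposition can be checked against ordinary integer arithmetic with a small software model. The model is a simplification written for this purpose, not the ISA definition: it assumes that the immediate is a left shift in bits applied to the product, that the .Z variant clears the accumulator first, and that .WO writes the full accumulator to the destination. The names `qword` and `mulqacc` are hypothetical.

```python
import random

WLEN = 256
QW = WLEN // 4                  # 64b quarter-word operands
WMASK = (1 << WLEN) - 1


def qword(reg: int, i: int) -> int:
    """Return quarter-word i of a WLEN-bit register value."""
    return (reg >> (QW * i)) & ((1 << QW) - 1)


def mulqacc(acc, a, qa, b, qb, shift, zero=False):
    """Simplified model: acc += (a.qa * b.qb) << shift, truncated to WLEN."""
    if zero:
        acc = 0
    return (acc + ((qword(a, qa) * qword(b, qb)) << shift)) & WMASK


random.seed(0)
w0 = random.getrandbits(WLEN)

acc = mulqacc(0, w0, 0, w0, 2, 0, zero=True)    # BN.MULQACC.Z      w0.0, w0.2, 0
acc = mulqacc(acc, w0, 0, w0, 3, 64)            # BN.MULQACC        w0.0, w0.3, 64
acc = mulqacc(acc, w0, 1, w0, 2, 64)            # BN.MULQACC        w0.1, w0.2, 64
acc = mulqacc(acc, w0, 1, w0, 3, 128)           # BN.MULQACC.WO w1, w0.1, w0.3, 128
w1 = acc                                        # .WO: accumulator written to w1

lower = w0 & ((1 << (WLEN // 2)) - 1)
upper = w0 >> (WLEN // 2)
print(w1 == lower * upper)
```

The four partial products correspond to the schoolbook expansion (a0 + a1·2^64)(b0 + b1·2^64), which is why the two cross terms share the shift of 64.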

Multiplying two WLEN numbers with BN.MULQACC

The shift out functionality can be used to perform larger multiplications without extra adds. The table below shows how two registers w0 and w1 can be multiplied together to give a result in w2 and w3, where a0:a3 = w0.0:w0.3 and b0:b3 = w1.0:w1.3. The result is built up in WLEN/4-bit columns, where c0:c3 = w2.0:w2.3 and d0:d3 = w3.0:w3.3; the sum of a column represents WLEN/4 bits of a destination register. Each product takes up two WLEN/4-bit columns to represent the WLEN/2-bit multiply result. At each instruction, the accumulator holds the sum of that row's product and all products above it, excluding the low-order columns that have already been shifted out.

The outlined technique can be extended to arbitrary bit widths but requires unrolled code with all operands in registers.

Instruction                           Product  Result columns
BN.MULQACC.Z w0.0, w1.0, 0            a0 * b0  c1:c0
BN.MULQACC w0.1, w1.0, 64             a1 * b0  c2:c1
BN.MULQACC.SO w2.l, w0.0, w1.1, 64    a0 * b1  c2:c1
BN.MULQACC w0.2, w1.0, 0              a2 * b0  c3:c2
BN.MULQACC w0.1, w1.1, 0              a1 * b1  c3:c2
BN.MULQACC w0.0, w1.2, 0              a0 * b2  c3:c2
BN.MULQACC w0.3, w1.0, 64             a3 * b0  d0:c3
BN.MULQACC w0.2, w1.1, 64             a2 * b1  d0:c3
BN.MULQACC w0.1, w1.2, 64             a1 * b2  d0:c3
BN.MULQACC.SO w2.u, w0.0, w1.3, 64    a0 * b3  d0:c3
BN.MULQACC w0.3, w1.1, 0              a3 * b1  d1:d0
BN.MULQACC w0.2, w1.2, 0              a2 * b2  d1:d0
BN.MULQACC w0.1, w1.3, 0              a1 * b3  d1:d0
BN.MULQACC w0.3, w1.2, 64             a3 * b2  d2:d1
BN.MULQACC.SO w3.l, w0.2, w1.3, 64    a2 * b3  d2:d1
BN.MULQACC.SO w3.u, w0.3, w1.3, 0     a3 * b3  d3:d2

Code snippets giving examples of 256x256 and 384x384 multiplies can be found in sw/otbn/code-snippets/mul256.s and sw/otbn/code-snippets/mul384.s.
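The 16-instruction sequence for the WLEN by WLEN multiply can likewise be checked with a small software model. As a simplifying assumption, the .SO variant is modelled as writing the low WLEN/2 bits of the accumulator to the named half of the destination and then shifting the accumulator right by WLEN/2 bits; the ISA manual remains the authoritative definition. `SEQUENCE`, `qword` and `mul_wlen` are hypothetical names.

```python
import random

WLEN = 256
QW = WLEN // 4
HALF = WLEN // 2
WMASK = (1 << WLEN) - 1
HALF_MASK = (1 << HALF) - 1

# (quarter-word of w0, quarter-word of w1, shift, shift-out destination)
SEQUENCE = [
    (0, 0, 0, None), (1, 0, 64, None), (0, 1, 64, "w2.l"),
    (2, 0, 0, None), (1, 1, 0, None), (0, 2, 0, None),
    (3, 0, 64, None), (2, 1, 64, None), (1, 2, 64, None), (0, 3, 64, "w2.u"),
    (3, 1, 0, None), (2, 2, 0, None), (1, 3, 0, None),
    (3, 2, 64, None), (2, 3, 64, "w3.l"),
    (3, 3, 0, "w3.u"),
]


def qword(reg: int, i: int) -> int:
    """Return quarter-word i of a WLEN-bit register value."""
    return (reg >> (QW * i)) & ((1 << QW) - 1)


def mul_wlen(w0: int, w1: int):
    """Run the modelled sequence and return (w2, w3)."""
    acc = 0                      # the first instruction uses .Z
    halves = {}
    for qa, qb, shift, dest in SEQUENCE:
        acc = (acc + ((qword(w0, qa) * qword(w1, qb)) << shift)) & WMASK
        if dest is not None:     # .SO: store the low half, then shift it out
            halves[dest] = acc & HALF_MASK
            acc >>= HALF
    w2 = halves["w2.l"] | (halves["w2.u"] << HALF)
    w3 = halves["w3.l"] | (halves["w3.u"] << HALF)
    return w2, w3


random.seed(1)
a, b = random.getrandbits(WLEN), random.getrandbits(WLEN)
w2, w3 = mul_wlen(a, b)
print(((w3 << WLEN) | w2) == a * b)
```

Grouping the partial products by column before each shift-out is what keeps the running sum within the WLEN-bit accumulator.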

References

[CHEN08] L. Chen, “Hsiao-Code Check Matrices and Recursively Balanced Matrices,” arXiv:0803.1217 [cs], Mar. 2008 [Online]. Available: http://arxiv.org/abs/0803.1217

[POSIX18] The Open Group, “cksum” manual. Available: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/cksum.html