OTBN DV document

Goals

  • DV
    • Verify the OTBN processor by running dynamic simulations with a SV/UVM based testbench
    • These simulations are grouped in tests listed in the DV plan below.
    • Close code and functional coverage on the IP and all of its sub-modules
  • FPV
    • Verify TileLink device protocol compliance with an SVA based testbench

Current status

Design features

OTBN, the OpenTitan Big Number accelerator, is a cryptographic accelerator. For detailed information on OTBN design features, see the OTBN HWIP technical specification.

Testbench architecture

The OTBN testbench is based on the CIP testbench architecture. It builds on the dv_utils and csr_utils packages.

Block diagram

OTBN testing makes use of a DPI-based model called otbn_core_model. This is shown in the block diagram. The dotted interfaces in the otbn block are bound in by the model to access internal signals (register file and memory contents).

[Figure: OTBN testbench block diagram]

Top level testbench

The top-level testbench is located at hw/ip/otbn/dv/uvm/tb.sv. This instantiates the OTBN DUT module hw/ip/otbn/rtl/otbn.sv.

OTBN has the following interfaces:

The idle and interrupt signals are modelled with the basic pins_if interface.

As well as instantiating OTBN, the testbench also instantiates an otbn_core_model. This module wraps an ISS (instruction set simulator) subprocess and performs checks to make sure that OTBN behaves the same as the ISS. The model communicates with the testbench through an otbn_model_if interface, which is monitored by the otbn_model_agent, described below.

OTBN model agent

The model agent is instantiated by the testbench to monitor the OTBN model. It is a passive agent (essentially just a monitor): the inputs to the model are set in tb.sv. The monitor for the agent generates transactions when it sees a start signal or a done signal.

The start signal is important because we “cheat” and pull it out of the DUT. To make sure that the processor is starting when we expect, we check start transactions against TL writes in the scoreboard.
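The check described above can be pictured with a small Python model. This is only an illustrative sketch: the real scoreboard is SystemVerilog/UVM code, and the class and method names below (StartChecker, on_tl_start_write, on_model_start) are hypothetical and do not come from the testbench.

```python
class StartChecker:
    """Toy model: every start seen by the model agent must follow a TL write."""

    def __init__(self):
        # Number of TL "start" writes not yet matched by a start transaction.
        self.pending_writes = 0

    def on_tl_start_write(self):
        """Record a TL write that is expected to start the processor."""
        self.pending_writes += 1

    def on_model_start(self):
        """Return True if this start transaction was expected, else False."""
        if self.pending_writes == 0:
            return False
        self.pending_writes -= 1
        return True

    def all_matched(self):
        """True when no TL start write is still waiting for a start."""
        return self.pending_writes == 0
```

A start transaction that arrives with no pending write is reported as unexpected, and a write that never produces a start shows up as unmatched at the end of the test.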

Reference models

The main reference model for OTBN is the instruction set simulator (ISS), which is run as a subprocess by DPI code inside otbn_core_model. This Python-based simulator can be found at hw/ip/otbn/dv/otbnsim.

Stimulus strategy

When testing OTBN, we are careful to distinguish between

  • behaviour that can be triggered by particular instruction streams
  • behaviour that is triggered by particular external stimuli (register writes; surprise resets etc.)

Testing lots of different instruction streams doesn’t really use the UVM machinery, so we have a “pre-DV” phase of testing that generates constrained-random instruction streams (as ELF binaries) and runs a simple block-level simulation on each to check that the RTL matches the model. The idea is that this is much quicker for designers to use to smoke-test proposed changes, and can be run with Verilator, so it doesn’t require an EDA tool licence. This pre-DV phase cannot drive sign-off, but it does use much of the same tooling.

Once we are running full DV tests, we re-use this work, by using the same collection of randomised instruction streams and randomly picking from them for most of the sequences. At the moment, the full DV tests create binaries on the fly by running hw/ip/otbn/dv/uvm/gen-binaries.py. This results in one or more ELF files in a directory, which the simulation then picks from at random.
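As a rough illustration of picking a binary at random from a directory, the following Python sketch chooses one ELF file reproducibly from a seed. It is not the testbench's actual implementation (the selection happens inside the simulation); the function names and the "*.elf" file pattern are assumptions made for the example.

```python
import random
from pathlib import Path


def find_binaries(bin_dir):
    """Return the ELF files in bin_dir (assumed here to end in .elf)."""
    return [str(p) for p in Path(bin_dir).glob("*.elf")]


def choose_binary(candidates, seed):
    """Pick one candidate reproducibly for a given seed."""
    if not candidates:
        raise ValueError("no binaries to choose from")
    # Sort first so the choice does not depend on directory listing order.
    ordered = sorted(candidates)
    return random.Random(seed).choice(ordered)
```

Seeding a private random.Random instance keeps the choice tied to the seed, so a failing run can be reproduced with the same binary.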

The pre-DV testing doesn’t address external stimuli like resets or TileLink-based register accesses. These are driven by specialised test sequences, described below.

Test sequences

The test sequences can be found in hw/ip/otbn/dv/uvm/env/seq_lib. The basic test sequence (otbn_base_vseq) loads the instruction stream from a randomly chosen binary (see above), configures OTBN and then lets it run to completion.

More specialised sequences cover multiple runs, register accesses during operation (which should fail) and memory corruption. They also check the correct operation of the interrupt registers.

Functional coverage

TODO: Functional coverage points are not yet defined.

Self-checking strategy

Scoreboard

Much of the checking for these tests is actually performed in otbn_core_model, which ensures that the RTL and ISS have the same behaviour. However, the scoreboard does have some checks, to ensure that interrupt and idle signals are high at the expected times.

Assertions

Core TLUL protocol assertions are checked by binding the TL-UL protocol checker into the design.

Outputs are also checked for 'X values by assertions in the design RTL. The design RTL contains other assertions defined by the designers, which will be checked in simulation (and won’t have been checked by the pre-DV Verilator simulations).

Finally, the otbn_idle_checker checks that the idle_o output correctly matches the running state that you’d expect, based on writes to the CMD register and responses that will appear in the DONE interrupt.
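The expected relationship between idle_o and the running state can be sketched as a two-state model. This is a simplified Python illustration, not the checker itself (which is written as assertions bound into the design); it ignores cycle-level latency, and the class and method names are hypothetical.

```python
class IdleModel:
    """Toy model of the running state implied by CMD writes and DONE."""

    def __init__(self):
        self.running = False  # the processor is idle after reset

    def on_cmd_start(self):
        """A write to CMD that starts an operation."""
        self.running = True

    def on_done(self):
        """The operation has finished (the DONE interrupt would fire)."""
        self.running = False

    def expected_idle(self):
        """idle_o should be high exactly when nothing is running."""
        return not self.running

    def check(self, idle_o):
        """Return True if the observed idle_o matches the model."""
        return bool(idle_o) == self.expected_idle()
```

For example, after on_cmd_start() the model expects idle_o to be low until on_done() is called.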

Building and running tests

Tests can be run with dvsim.py; see the dvsim.py documentation for details of the tool’s features and command-line arguments. To run a basic smoke test, go to the top of the repository and run:

$ util/dvsim/dvsim.py hw/ip/otbn/dv/uvm/otbn_sim_cfg.hjson -i otbn_smoke

DV plan

Each entry below gives the milestone and testpoint name, then a description, then the tests that cover it (where any are listed).
V1 smoke

Smoke test, running a single fixed binary

This runs the binary from otbn/dv/smoke/smoke_test.s, which is designed to check most of the implemented instructions. The unchanging binary should mean this basic test is particularly appropriate for CI.

otbn_smoke
V1 single_binary

Run a single randomly-chosen binary

This test drives the main bulk of OTBN testing. It picks a random binary from a pre-generated set and runs it, comparing against the model. We'll run this with a large number of seeds and use functional coverage to track when verification of the internals of the core is done.

Sometimes enable the "done" interrupt to check that it and the error interrupt work correctly.

otbn_single
V1 csr_hw_reset

Verify the reset values as indicated in the RAL specification.

  • Write all CSRs with a random value.
  • Apply reset to the DUT as well as the RAL model.
  • Read each CSR and compare it against the reset value. It is mandatory to replicate this test for each reset that affects all or a subset of the CSRs.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
otbn_csr_hw_reset
V1 csr_rw

Verify accessibility of CSRs as indicated in the RAL specification.

  • Loop through each CSR to write it with a random value.
  • Read the CSR back and check for correctness while adhering to its access policies.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
otbn_csr_rw
V1 csr_bit_bash

Verify no aliasing within individual bits of a CSR.

  • Walk a 1 through each CSR by flipping 1 bit at a time.
  • Read the CSR back and check for correctness while adhering to its access policies.
  • This verifies that writing a specific bit within the CSR did not affect any of the other bits.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
otbn_csr_bit_bash
V1 csr_aliasing

Verify no aliasing within the CSR address space.

  • Loop through each CSR to write it with a random value.
  • Shuffle and read ALL CSRs back.
  • All CSRs except for the one that was written in this iteration should read back the previous value.
  • The CSR that was written in this iteration is checked for correctness while adhering to its access policies.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
otbn_csr_aliasing
V1 csr_mem_rw_with_rand_reset

Verify random reset during CSR/memory access.

  • Run the csr_rw sequence to randomly access CSRs.
  • If memory exists, run mem_partial_access in parallel with csr_rw.
  • Randomly issue a reset and then use the hw_reset sequence to check that all CSRs are reset to their default values.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
otbn_csr_mem_rw_with_rand_reset
V1 mem_walk

Verify accessibility of all memories in the design.

  • Run the standard UVM mem walk sequence on all memories in the RAL model.
  • It is mandatory to run this test from all available interfaces the memories are accessible from.
otbn_mem_walk
V1 mem_partial_access

Verify partial-accessibility of all memories in the design.

  • Do partial reads and writes into the memories and verify the outcome for correctness.
  • Also test outstanding accesses on memories.
otbn_mem_partial_access
V2 reset_recovery

Run two binaries, resetting the first at an arbitrary time

Running another binary after a sudden and unexpected reset via the rst_ni signal will check that all state is properly re-initialized after a reset. We'd expect X-propagation checks to catch most problems like this, but an explicit reset sequence also adds the relevant FSM/toggle coverage.

V2 mem_integrity

Inject ECC errors into DMEM and IMEM and expect an alert

V2 back_to_back

Run sequences back-to-back

This runs several sequences back-to-back, without resets between them. This should catch initialisation problems where not all state is cleared between programs when there's no reset.

otbn_multi
V2 stress_all

Run multiple sequences back-to-back while making invalid TL accesses at the same time.

V2 intr_test

Verify the common intr_test CSRs that allow SW to mock-inject interrupts.

  • Enable a random set of interrupts by writing random value(s) to intr_enable CSR(s).
  • Randomly "turn on" interrupts by writing random value(s) to intr_test CSR(s).
  • Read all intr_state CSR(s) back to verify that they reflect the same values as were written to the corresponding intr_test CSR(s).
  • Check the cfg.intr_vif pins to verify that only the interrupts that were enabled and turned on are set.
  • Clear a random set of interrupts by writing a random value to intr_state CSR(s).
  • Repeat the above steps a bunch of times.
otbn_intr_test
V2 tl_d_oob_addr_access

Access an out-of-bounds address and verify the correctness of the response / behavior.

otbn_tl_errors
V2 tl_d_illegal_access

Drive unsupported requests via the TL interface and verify the correctness of the response / behavior. The following error cases are tested:

  • TL-UL protocol error cases
    • Unsupported opcode, e.g. a_opcode isn't Get, PutPartialData or PutFullData
    • Mask isn't all active if opcode = PutFullData
    • Mask isn't in enabled lanes, e.g. a_address = 0x00, a_size = 0, a_mask = 'b0010
    • Mask doesn't align with address, e.g. a_address = 0x01, a_mask = 'b0001
    • Address and size aren't aligned, e.g. a_address = 0x01, a_size != 0
    • Size is over 2.
  • OpenTitan defined error cases
    • Access an unmapped address; expect d_error = 1 when devmode_i == 1
    • Write a CSR with an unaligned address, e.g. a_address[1:0] != 0
    • Write fewer bytes than the CSR's width, e.g. when the CSR is 2 bytes wide, only write 1 byte
    • Write a memory without enabling all lanes (that is, without a_mask = '1) if the memory doesn't support byte-enabled writes
    • Read a WO (write-only) memory
otbn_tl_errors
V2 tl_d_outstanding_access

Drive back-to-back requests without waiting for a response, to ensure there is one transaction outstanding within the TL device. Also verify that there is one outstanding transaction when back-to-back accesses are made to the same address.

otbn_csr_hw_reset
otbn_csr_rw
otbn_csr_aliasing
otbn_same_csr_outstanding
V2 tl_d_partial_access

Access a CSR with one or more bytes of data. For reads, expect the full word value of the CSR to be returned. For writes, the enabled bytes should cover all valid fields of the CSR.

otbn_csr_hw_reset
otbn_csr_rw
otbn_csr_aliasing
otbn_same_csr_outstanding
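
The TL-UL protocol error cases listed under tl_d_illegal_access can be expressed as a small predicate over the A-channel fields. The Python sketch below is a hedged illustration that assumes a 4-byte data bus and the standard TileLink opcode encodings. It is not the protocol checker used in the testbench, and the function name and error strings are hypothetical.

```python
from enum import IntEnum


class AOpcode(IntEnum):
    """TileLink A-channel opcodes supported by TL-UL."""
    PUT_FULL_DATA = 0
    PUT_PARTIAL_DATA = 1
    GET = 4


SUPPORTED_OPCODES = {int(op) for op in AOpcode}
BUS_BYTES = 4  # assumption: 32-bit data bus, so a_mask has 4 bits


def tl_a_channel_errors(opcode, address, size, mask):
    """Return a list of protocol errors for one A-channel request."""
    errors = []
    if opcode not in SUPPORTED_OPCODES:
        errors.append("unsupported opcode")
    if size > 2:
        # Lanes cannot be computed for an oversized access, so stop here.
        errors.append("size over 2")
        return errors
    nbytes = 1 << size
    if address % nbytes != 0:
        errors.append("address not aligned to size")
    # Byte lanes that the access is allowed to touch within the bus word.
    lanes = (((1 << nbytes) - 1) << (address % BUS_BYTES)) & 0xF
    if mask & ~lanes & 0xF:
        errors.append("mask outside enabled lanes")
    if opcode == AOpcode.PUT_FULL_DATA and mask != lanes:
        errors.append("mask not all active for PutFullData")
    return errors
```

With this sketch, the example from the list (a_address = 0x00, a_size = 0, a_mask = 'b0010) is flagged because the mask selects a byte lane outside the single lane enabled by the address and size.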