FLASH_CTRL DV document

Goals

  • DV
    • Verify all flash_ctrl IP features by running dynamic simulations with a SV/UVM based testbench
    • Develop and run all tests based on the testplan below towards closing code and functional coverage on the IP and all of its sub-modules
  • FPV
    • Verify TileLink device protocol compliance with an SVA based testbench

Current status

Design features

For detailed information on flash_ctrl design features, please see the flash_ctrl HWIP technical specification. The design-under-test (DUT) wraps the flash_ctrl IP, flash_phy and the TLUL SRAM adapter that converts the incoming TL accesses from the host (CPU) interface into flash requests. These modules are instantiated and connected to each other and to the rest of the design at the top level. For the IP level DV, we replicate the instantiations and connections in the flash_ctrl_wrapper module maintained in DV, located at hw/ip/flash_ctrl/dv/tb/flash_ctrl_wrapper.sv. In future, we will consider having the wrapper maintained in the RTL area instead.

Testbench architecture

The flash_ctrl UVM DV testbench has been constructed based on the CIP testbench architecture.

Block diagram


Top level testbench

The top level testbench is located at hw/ip/flash_ctrl/dv/tb/tb.sv. It instantiates the flash_ctrl_wrapper. In addition, the testbench instantiates the following interfaces, connects them to the DUT and sets their handles into uvm_config_db:

In future, as the design (and DV) matures, the following interfaces will be instantiated and hooked up to the DUT:

  • Secret key interface from the OTP
  • Interface to the key_mgr
  • Interface from the life cycle manager

Common DV utility components

The following utilities provide generic helper tasks and functions to perform activities that are common across the project:

TL_agent

The flash_ctrl UVM environment instantiates a tl_agent (already handled in the CIP base env), which provides the ability to drive and independently monitor random traffic via the TL host interface into the flash_ctrl device. There is an additional instance of the tl_agent for the host interface to the flash_phy, to directly fetch the contents of the flash memory, bypassing the flash_ctrl.

The tl_agent monitor supplies partial TileLink request packets as well as completed TileLink response packets over the TLM analysis port for further processing within the flash_ctrl scoreboard.

UVM RAL Model

The flash_ctrl RAL model is created with the ralgen FuseSoC generator script automatically when the simulation is at the build stage.

It can be created manually by invoking regtool:

Sequence cfg

An efficient way to develop test sequences is by providing some random variables that are used to configure the DUT / drive stimulus. The random variables are constrained using weights and knobs that can be controlled. These weights and knobs take on a “default” value that will result in the widest exploration of the design state space, when the test sequence is randomized and run as-is. To steer the randomization towards a particular distribution or to achieve interesting combinations of the random variables, the test sequence can be extended to create a specialized variant. In this extended sequence, nothing would need to be done, other than setting those weights and knobs appropriately. This helps increase the likelihood of hitting the design corners that would otherwise be difficult to achieve, while maximizing reuse.

This object aims to provide such run-time controls. An example of such a knob is num_en_mp_regions, which controls how many flash memory protection regions to configure, set to ‘all’ by default.
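
The knob idea can be sketched outside the testbench. The Python model below (used instead of SystemVerilog only so that it runs standalone) shows a configuration object whose defaults give the widest exploration, and a specialized variant that changes nothing but knob values. Apart from num_en_mp_regions, every name here (SeqCfg, NUM_MP_REGIONS = 8, op_weights, erase_heavy_cfg) is a hypothetical stand-in and not taken from the flash_ctrl testbench.

```python
import random
from dataclasses import dataclass, field

# Hypothetical number of memory protection regions; the real value comes from the design.
NUM_MP_REGIONS = 8


@dataclass
class SeqCfg:
    """Toy model of a sequence configuration object holding knobs and weights."""
    # Knob named in the text: how many MP regions to configure ('all' by default).
    num_en_mp_regions: int = NUM_MP_REGIONS
    # Hypothetical weights steering which flash operation gets picked.
    op_weights: dict = field(
        default_factory=lambda: {"read": 1, "program": 1, "erase": 1})

    def pick_op(self, rng: random.Random) -> str:
        """Pick one operation according to the current weights."""
        ops = list(self.op_weights)
        weights = [self.op_weights[op] for op in ops]
        return rng.choices(ops, weights=weights, k=1)[0]


def erase_heavy_cfg() -> SeqCfg:
    """Specialized variant: only the knobs change, the picking logic is reused."""
    return SeqCfg(num_en_mp_regions=2,
                  op_weights={"read": 1, "program": 1, "erase": 8})


rng = random.Random(1)
default_ops = [SeqCfg().pick_op(rng) for _ in range(1000)]
erase_ops = [erase_heavy_cfg().pick_op(rng) for _ in range(1000)]
```

With the default weights the three operations are drawn roughly evenly, while the specialized variant draws erase far more often without any change to pick_op.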

Env cfg

The flash_ctrl_env_cfg environment configuration object provides access to the following elements:

  • Build-time controls to configure the UVM environment composition during the build_phase
  • Downstream agent configuration objects for ease of lookup from any environment component
    • This includes the tl_agent_cfg objects for both TL interfaces
  • All virtual interfaces that connect to the DUT listed above (retrieved from the uvm_config_db)
  • Sequence configuration object described above

All environment components contain a handle to an instance of this class (that was created in the test class via the parent dv_base_test). By housing all of the above, all pertinent information is more easily shared with all environment components.

Stimulus strategy

Test sequences

All test sequences reside in hw/ip/flash_ctrl/dv/env/seq_lib. The flash_ctrl_base_vseq virtual sequence is extended from cip_base_vseq and serves as a starting point. All test sequences are extended from flash_ctrl_base_vseq. It provides commonly used handles, variables, functions and tasks that the test sequences can simply use / call. Some of the most commonly used tasks / functions are as follows:

  • task 1:
  • task 2:

Functional coverage

To ensure high quality constrained random stimulus, it is necessary to develop a functional coverage model. The following covergroups have been developed to prove that the test intent has been adequately met:

  • cg1:
  • cg2:

Self-checking strategy

Scoreboard

The flash_ctrl_scoreboard is primarily used for end to end checking. It creates the following analysis ports to retrieve the data monitored by corresponding interface agents:

  • analysis port1:
  • analysis port2:

Assertions

  • TLUL assertions: The tb/flash_ctrl_bind.sv binds the tlul_assert assertions to the IP to ensure TileLink interface protocol compliance.
  • Unknown checks on DUT outputs: The RTL has assertions to ensure all outputs are initialized to known values after coming out of reset.
  • assert prop 1:
  • assert prop 2:

Global types & methods

All common types and methods defined at the package level can be found in flash_ctrl_env_pkg. Some of them in use are:

[list a few parameters, types & methods; no need to mention all]

Building and running tests

We are using our in-house developed regression tool for building and running our tests and regressions. Please take a look at the link for detailed information on the usage, capabilities, features and known issues. Here’s how to run a smoke test:

$ cd $REPO_TOP
$ ./util/dvsim/dvsim.py hw/ip/flash_ctrl/dv/flash_ctrl_sim_cfg.hjson -i flash_ctrl_smoke

Testplan

Testpoints

Milestone Name Tests Description
V1 smoke flash_ctrl_smoke

Randomly read, program or erase (a page or a bank) a randomized chunk of flash memory. Only the data partition is accessed. No extra features are enabled. Flash memory is invalidated and the targeted chunk is initialized with random data for reads and all 1s for writes. Interrupts are not enabled; completion is ascertained through polling. The success of each operation is verified via backdoor.

V1 smoke_hw flash_ctrl_smoke_hw

Perform a host direct read on a single page of the Data partition. First, the flash memory is initialized with random values, and then it is read directly via the Host interface. Finally, a backdoor read is used to check the read data.

V1 csr_hw_reset flash_ctrl_csr_hw_reset

Verify the reset values as indicated in the RAL specification.

  • Write all CSRs with a random value.
  • Apply reset to the DUT as well as the RAL model.
  • Read each CSR and compare it against the reset value. It is mandatory to replicate this test for each reset that affects all or a subset of the CSRs.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
V1 csr_rw flash_ctrl_csr_rw

Verify accessibility of CSRs as indicated in the RAL specification.

  • Loop through each CSR to write it with a random value.
  • Read the CSR back and check for correctness while adhering to its access policies.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
V1 csr_bit_bash flash_ctrl_csr_bit_bash

Verify no aliasing within individual bits of a CSR.

  • Walk a 1 through each CSR by flipping 1 bit at a time.
  • Read the CSR back and check for correctness while adhering to its access policies.
  • This verifies that writing a specific bit within the CSR did not affect any of the other bits.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
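
The walking-1 check above can be sketched with a toy register model. This is a minimal Python illustration, not the CIP bit bash sequence: the Reg and AliasedReg classes, the rw_mask notion of "access policy" and the deliberately injected aliasing bug are all hypothetical.

```python
class Reg:
    """Toy register: only bits set in rw_mask are writable, others read as 0."""

    def __init__(self, width: int, rw_mask: int):
        self.width = width
        self.rw_mask = rw_mask
        self.value = 0

    def write(self, data: int) -> None:
        self.value = data & self.rw_mask

    def read(self) -> int:
        return self.value


class AliasedReg(Reg):
    """Deliberately broken register: writing bit 3 also sets bit 5."""

    def write(self, data: int) -> None:
        super().write(data)
        if data & (1 << 3):
            self.value |= 1 << 5


def bit_bash(reg: Reg) -> list:
    """Walk a 1 through the register and return the bit positions that mismatch."""
    failures = []
    for bit in range(reg.width):
        pattern = 1 << bit
        reg.write(pattern)
        expected = pattern & reg.rw_mask  # prediction honours the access policy
        if reg.read() != expected:
            failures.append(bit)
    return failures
```

A healthy register reports no failures, whereas the aliased one is caught at the bit whose write disturbs another bit.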
V1 csr_aliasing flash_ctrl_csr_aliasing

Verify no aliasing within the CSR address space.

  • Loop through each CSR to write it with a random value
  • Shuffle and read ALL CSRs back.
  • All CSRs except for the one that was written in this iteration should read back the previous value.
  • The CSR that was written in this iteration is checked for correctness while adhering to its access policies.
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
  • Shuffle the list of CSRs first to remove the effect of ordering.
V1 csr_mem_rw_with_rand_reset flash_ctrl_csr_mem_rw_with_rand_reset

Verify random reset during CSR/memory access.

  • Run csr_rw sequence to randomly access CSRs
  • If memory exists, run mem_partial_access in parallel with csr_rw
  • Randomly issue reset and then use hw_reset sequence to check all CSRs are reset to default value
  • It is mandatory to run this test for all available interfaces the CSRs are accessible from.
V1 mem_walk flash_ctrl_mem_walk

Verify accessibility of all memories in the design.

  • Run the standard UVM mem walk sequence on all memories in the RAL model.
  • It is mandatory to run this test from all available interfaces the memories are accessible from.
V1 mem_partial_access flash_ctrl_mem_partial_access

Verify partial-accessibility of all memories in the design.

  • Do partial reads and writes into the memories and verify the outcome for correctness.
  • Also test outstanding access on memories
V1 shadow_reg_update_error flash_ctrl_shadow_reg_errors

Verify shadowed registers' update error.

  • Randomly pick a shadowed register in the DUT.
  • Write it twice with different values.
  • Verify that the update error alert is triggered and the register value remains unchanged.
  • Verify the update_error status register field is set to 1.
  • Repeat the above steps a bunch of times.
V1 shadow_reg_read_clear_staged_value flash_ctrl_shadow_reg_errors

Verify reading a shadowed register will clear its staged value.

  • Randomly pick a shadowed register in the DUT.
  • Write it once and read it back to clear the staged value.
  • Then write it twice with the same new value (but different from the previous step).
  • Read it back to verify the new value and ensure that the update error alert did not trigger.
  • Verify the update_error status register field remains the same value.
  • Repeat the above steps a bunch of times.
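
The update-error and read-clear behaviour exercised by the two testpoints above can be summarized in a small reference model. This Python sketch is an assumption-laden illustration rather than the RTL behaviour: the ShadowReg class is hypothetical, the update_error flag stands in for both the alert and the status field, and dropping the staged value after a mismatching second write is an assumption.

```python
class ShadowReg:
    """Toy model of the two-write update protocol of a shadowed register."""

    def __init__(self, reset_value: int = 0):
        self.committed = reset_value
        self.staged = None         # value captured by the first write, if any
        self.update_error = False  # stands in for the update error alert/status

    def write(self, data: int) -> None:
        if self.staged is None:
            self.staged = data        # first write: stage only
        elif self.staged == data:
            self.committed = data     # second write matches: commit
            self.staged = None
        else:
            self.update_error = True  # mismatch: flag error, keep old value
            self.staged = None        # assumption: staging restarts afterwards

    def read(self) -> int:
        self.staged = None            # a read clears the staged value
        return self.committed
```

For example, writing 0xA then 0xB leaves the committed value untouched and raises update_error, while a write, a read, and then two identical writes commit the new value without any error.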
V1 shadow_reg_storage_error flash_ctrl_shadow_reg_errors

Verify shadowed registers' storage error.

  • Randomly pick a shadowed register in the DUT.
  • Backdoor write to shadowed or committed flops to create a storage fatal alert.
  • Check if fatal alert continuously fires until reset.
  • Verify that all other frontdoor write attempts are blocked during the storage error.
  • Verify that storage_error status register field is set to 1.
  • Reset the DUT.
  • Read all CSRs to ensure the DUT is properly reset.
  • Repeat the above steps a bunch of times.
V1 shadowed_reset_glitch flash_ctrl_shadow_reg_errors

Verify that toggling the shadowed_rst_n pin can trigger a storage error.

  • Randomly drive shadowed_rst_n pin to low or rst_n pin to low.
  • Check if any registers have been written before the reset. If so, check if the storage error fatal alert is triggered.
  • Check status register.
  • Drive shadowed_rst_n pin or rst_n pin back to high.
  • If fatal alert is triggered, reset the DUT.
  • Read all CSRs to ensure the DUT is properly reset.
  • Repeat the above steps a bunch of times.
V1 shadow_reg_update_error_with_csr_rw flash_ctrl_shadow_reg_errors_with_csr_rw

Run shadow_reg_update_error sequence in parallel with csr_rw sequence.

  • Randomly select one of the above sequences.
  • Apply the csr_rw sequence in parallel but disable csr_access_abort to ensure that all shadowed register writes/reads are executed without aborting.
  • Repeat the above steps a bunch of times.
V2 sw_op flash_ctrl_sw_op

Perform flash protocol controller read, program and erase on a single page of one bank within the Data partition. Finally, perform a read on the same location in order to check that the previous operation completed successfully.

V2 host_read_direct

Perform back-to-back direct reads via Host in order to test bandwidth of hardware host interface. In addition, perform stalls to test pipeline structure. Enable scramble to test pipeline structure.

V2 host_ctrl_hw_if

Perform a read of seed materials via the Hardware Interface of the Flash Protocol Controller. Perform an RMA entry request and check afterwards that software is not granted access. During RMA entry, check via Host Interface direct reads that the content of the flash is wiped out.

V2 host_controller_arb

Perform back-to-back operations via the Software Interface and via the Hardware Interface (key manager, life cycle) in order to test arbitration within the Flash Protocol Controller.

V2 erase_suspend

Perform erase suspend when erase is ongoing and also when erase is not ongoing. Check if request is immediately cleared in case when no erase is ongoing. Check if request is cleared in case when suspend is handled. Read affected bank in order to verify erase suspension feature.

V2 full_memory_access

The entire memory is accessed by the Controller and directly by the Host. In addition, Data partitions can be directly read by Software (Flash controller) and hardware hosts, while Info partitions can be read only by the Flash controller. Test the High Endurance feature.

V2 fifo_eviction

Perform following sequences of operations: read/program/read and read/erase/read in order to test fifo eviction properly. Read should be executed by both Software and Host interface. All combinations should be tested. Covergroup for this hazardous behavior is fifo_evict_cg.

V2 host_arb

Test arbitration within the Flash Physical Controller by reading from both interfaces at the same time. Continuously perform direct reads from the host interface while READ/PROGRAM/ERASE operations from the flash controller are in progress. Perform the parallel operations on addresses in different banks and also in the same bank. Expect that all operations execute successfully.

V2 host_interleave

Perform two read operations at the same time, one via the hardware host and one via the controller. Also perform, at the same time, a read operation via the hardware host and a program operation via the controller. Perform these parallel operations at different addresses and at the same address (i.e. seed material). Expect that all operations execute successfully.

V2 memory_protection

Perform READ/PROGRAM/ERASE operations over protected regions and pages of data and info partitions. Use set and reset values of the corresponding read, program and erase enable bits. Test boundary values of regions. Test overlap of regions, in which the lower region wins arbitration. Covergroup is region_range_cg.
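
The "lower region wins" rule and the boundary cases can be expressed as a small lookup, which is the kind of prediction a checker for this testpoint needs. The Python sketch below is illustrative only: MpRegion, match_region, op_allowed and the default_allowed fallback (standing in for whatever policy applies when no region matches) are hypothetical names, not the testbench's.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MpRegion:
    """Toy memory protection region, expressed in pages."""
    enabled: bool
    base_page: int
    num_pages: int
    read_en: bool
    prog_en: bool
    erase_en: bool

    def contains(self, page: int) -> bool:
        return self.enabled and self.base_page <= page < self.base_page + self.num_pages


def match_region(regions: List[MpRegion], page: int) -> Optional[int]:
    """Return the index of the lowest-numbered enabled region containing the page."""
    for idx, region in enumerate(regions):
        if region.contains(page):
            return idx
    return None


def op_allowed(regions: List[MpRegion], page: int, op: str,
               default_allowed: bool = False) -> bool:
    """Predict whether op ('read', 'program' or 'erase') is permitted on the page."""
    idx = match_region(regions, page)
    if idx is None:
        return default_allowed  # assumption: models the policy outside all regions
    region = regions[idx]
    return {"read": region.read_en,
            "program": region.prog_en,
            "erase": region.erase_en}[op]


# Region 0 covers pages 0..3 (read-only); region 1 covers pages 2..5 (all ops).
regions = [MpRegion(True, 0, 4, True, False, False),
           MpRegion(True, 2, 4, True, True, True)]
```

Pages 2 and 3 lie in both regions and resolve to region 0, so a program there is predicted to be blocked; page 4 is the first page that falls through to region 1.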

V2 all_partitions flash_ctrl_rand_ops

Same as the smoke (sanity) test, except that both data and info partitions are legally accessed. In future, support for multiple info partitions may be added; those will be covered as well.

V2 error_oob

Perform accesses in order to provoke an out-of-bounds error. Test both the Software interface and the Hardware interface. Related covergroup is error_cg.

V2 error_mp

Perform accesses in order to provoke a memory permission error. Test both the Software interface and the Hardware interface. Related covergroup is error_cg.

V2 error_rd

Perform accesses in order to provoke a read data error. Test both the Software interface and the Hardware interface. Related covergroup is error_cg.

V2 error_prog_win

Perform accesses in order to provoke a program resolution error. Test both the Software interface and the Hardware interface. Related covergroup is error_cg.

V2 error_prog_type

Perform accesses in order to provoke a program type error. Test both the Software interface and the Hardware interface. Related covergroup is error_cg.

V2 error_flash_phy

Perform accesses in order to provoke a native flash error. Test both the Software interface and the Hardware interface. Related covergroup is error_cg.

V2 error_lc

Perform accesses in order to provoke life cycle management interface error. Related covergroup is error_cg.

V2 secret_partition

Verify values of secret information partitions. Enabling by life cycle and otp is required. Pages are read upon flash controller initialization.

V2 isolation_partition

Verify values of isolated information partitions. Accessibility is controlled by life cycle and otp.

V2 interrupts

Perform accesses in order to raise all interrupts given in register map. Check behaviour of Interrupt Enable and Status Registers.

V2 ecc

Randomly enable ECC for a randomly selected set of pages. Randomly corrupt a single bit in the memory that is about to be read and ensure that the ECC works: the corrupted bit should be fixed. Randomly corrupt either the data or the ECC bits. Randomly corrupt 2 bits in the same word and ensure that the read results in an error. Ensure that pages with ECC not enabled read back corrupted data without any errors. Verify both types, pre-scramble ECC (integrity ECC, 4-bits) and post-scramble ECC (reliability ECC, 8-bits). Test the status and control ECC bits.
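
The two outcomes this testpoint checks (a single flipped bit is corrected, two flipped bits are reported as an error) are the defining properties of a SECDED code. The Python sketch below demonstrates them on a textbook extended Hamming (8,4) code. It is not the ECC used by flash_ctrl, whose word and check-bit widths differ; the encode, decode and flip helpers are hypothetical and exist only to make the expected behaviour concrete.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits into an 8-bit codeword (index 0 holds overall parity)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d   # data bits at non-power-of-2 positions
    code[1] = code[3] ^ code[5] ^ code[7]    # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]    # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]    # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2              # overall parity over the other 7 bits
    return code


def decode(code: list):
    """Return (nibble, status) with status in {'ok', 'corrected', 'error'}."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2  # 0 when the overall parity is consistent
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with consistent overall parity: two bits flipped.
        return None, "error"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


def flip(code: list, *positions: int) -> list:
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out
```

Flipping any one of the eight bits, data or check bit alike, still decodes to the original nibble, while flipping any two yields the error status, mirroring the "corrupt either the data or the ECC bits" and "corrupt 2 bits in the same word" steps.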

V2 alert_test flash_ctrl_alert_test

Verify common alert_test CSR that allows SW to mock-inject alert requests.

  • Enable a random set of alert requests by writing random value to alert_test CSR.
  • Check each alert_tx.alert_p pin to verify that only the requested alerts are triggered.
  • During alert_handshakes, write alert_test CSR again to verify that: If alert_test writes to current ongoing alert handshake, the alert_test request will be ignored. If alert_test writes to current idle alert handshake, a new alert_handshake should be triggered.
  • Wait for the alert handshakes to finish and verify that the alert_tx.alert_p pins are all set back to 0.
  • Repeat the above steps a bunch of times.
V2 intr_test flash_ctrl_intr_test

Verify common intr_test CSRs that allow SW to mock-inject interrupts.

  • Enable a random set of interrupts by writing random value(s) to intr_enable CSR(s).
  • Randomly "turn on" interrupts by writing random value(s) to intr_test CSR(s).
  • Read all intr_state CSR(s) back to verify that it reflects the same value as what was written to the corresponding intr_test CSR.
  • Check the cfg.intr_vif pins to verify that only the interrupts that were enabled and turned on are set.
  • Clear a random set of interrupts by writing a random value to intr_state CSR(s).
  • Repeat the above steps a bunch of times.
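
The expected values in the steps above follow from a simple relationship between the three CSRs, sketched here in Python. The IntrModel class is hypothetical, and the model assumes that intr_test ORs bits into intr_state, that intr_state is write-1-to-clear, and that each interrupt pin is intr_state AND intr_enable; the interrupt count used in the example is arbitrary.

```python
class IntrModel:
    """Toy model of intr_enable / intr_test / intr_state and the interrupt pins."""

    def __init__(self, num_intr: int):
        self.mask = (1 << num_intr) - 1
        self.intr_enable = 0
        self.intr_state = 0

    def write_enable(self, data: int) -> None:
        self.intr_enable = data & self.mask

    def write_test(self, data: int) -> None:
        # Assumption: each 1 written to intr_test sets the matching state bit.
        self.intr_state |= data & self.mask

    def write_state(self, data: int) -> None:
        # Assumption: intr_state is write-1-to-clear.
        self.intr_state &= ~data & self.mask

    def pins(self) -> int:
        # Only interrupts that are both pending and enabled reach the pins.
        return self.intr_state & self.intr_enable


model = IntrModel(num_intr=6)   # arbitrary interrupt count for illustration
model.write_enable(0b101010)
model.write_test(0b001111)
```

After these two writes intr_state holds 0b001111 and the pins show 0b001010; clearing bit 1 through intr_state leaves only bit 3 asserted on the pins.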
V2 tl_d_oob_addr_access flash_ctrl_tl_errors

Access out of bounds address and verify correctness of response / behavior

V2 tl_d_illegal_access flash_ctrl_tl_errors

Drive unsupported requests via the TL interface and verify correctness of response / behavior. The error cases below are tested based on the TLUL spec (hw/ip/tlul/doc/_index.md#explicit-error-cases):

  • TL-UL protocol error cases
    • invalid opcode
    • some mask bits not set when opcode is PutFullData
    • mask does not match the transfer size, e.g. a_address = 0x00, a_size = 0, a_mask = 'b0010
    • mask and address misaligned, e.g. a_address = 0x01, a_mask = 'b0001
    • address and size aren't aligned, e.g. a_address = 0x01, a_size != 0
    • size is greater than 2
  • OpenTitan defined error cases
    • access unmapped address, expect d_error = 1 when devmode_i == 1
    • write a CSR with unaligned address, e.g. a_address[1:0] != 0
    • write a CSR less than its width, e.g. when CSR is 2 bytes wide, only write 1 byte
    • write a memory with a_mask != '1 when it doesn't support partial accesses
    • read a WO (write-only) memory
    • write a RO (read-only) memory
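
The TL-UL protocol error cases listed above can be written down as a request classifier, which is handy when predicting d_error for randomly generated illegal requests. The Python sketch below assumes a 32-bit data bus (4 byte lanes) and the standard TL-UL opcodes (PutFullData = 0, PutPartialData = 1, Get = 4); the function name and error strings are hypothetical, and the OpenTitan defined cases are not modelled.

```python
VALID_OPCODES = {0, 1, 4}   # PutFullData, PutPartialData, Get
PUT_FULL_DATA = 0


def tl_a_chan_errors(opcode: int, address: int, size: int, mask: int) -> list:
    """Return the protocol violations found in one A-channel request."""
    errors = []
    if opcode not in VALID_OPCODES:
        errors.append("invalid opcode")
    if size > 2:
        # More than 4 bytes cannot fit on the assumed 32-bit bus.
        errors.append("size greater than 2")
        return errors
    num_bytes = 1 << size
    if address % num_bytes != 0:
        # The active byte lanes are ill-defined for a misaligned request.
        errors.append("address not aligned to size")
        return errors
    # Byte lanes the transfer may touch within the 32-bit word.
    lane_mask = ((1 << num_bytes) - 1) << (address & 0x3)
    if mask & ~lane_mask & 0xF:
        errors.append("mask does not match address/size")
    if opcode == PUT_FULL_DATA and (mask & lane_mask) != lane_mask:
        errors.append("PutFullData with mask bits not set")
    return errors
```

Applied to the examples in the list, a_address = 0x00 with a_size = 0 and a_mask = 'b0010, and a_address = 0x01 with a_mask = 'b0001, are both flagged as mask mismatches, while a_address = 0x01 with a non-zero a_size is flagged as misaligned.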
V2 tl_d_outstanding_access flash_ctrl_csr_hw_reset
flash_ctrl_csr_rw
flash_ctrl_csr_aliasing
flash_ctrl_same_csr_outstanding

Drive back-to-back requests without waiting for a response to ensure there is one transaction outstanding within the TL device. Also, verify one outstanding transaction when back-to-back accesses are made to the same address.

V2 tl_d_partial_access flash_ctrl_csr_hw_reset
flash_ctrl_csr_rw
flash_ctrl_csr_aliasing
flash_ctrl_same_csr_outstanding

Access a CSR with one or more bytes of data. For reads, expect the full word value of the CSR to be returned. For writes, the enabled bytes should cover all valid fields of the CSR.

V2S tl_intg_err flash_ctrl_tl_intg_err

Verify that the data integrity check violation generates an alert.

Randomly inject errors on the control, data, or the ECC bits during CSR accesses. Verify that this triggers the correct fatal alert.

V3 scramble

Enable scrambling, along with randomized scramble keys. Program a fresh chunk of memory and read back (both, via controller and host) to ensure data integrity. On program, verify via backdoor scrambling was done on the raw data correctly. When reading via host, read the same memory via host multiple times back-to-back and ensure the timing is correct (subsequent reads should be faster). When scrambling is not enabled, ensure that the raw data is written and read back.

V3 robustness

Enable full randomization in order to fully stress DUT. Perform illegal accesses in order to gain robustness.

V3 stress_all_with_rand_reset flash_ctrl_stress_all_with_rand_reset

This test runs 3 parallel threads - stress_all, tl_errors and random reset. After reset is asserted, the test will read and check all valid CSR registers.

Covergroups

Name Description
control_cg

Covers that all operations READ/PROGRAM/ERASE/UNKNOWN have been tested. Covers that the ERASE operation is performed on a page and on an entire bank. Covers data and info partition selection. Covers types of informational partitions. Covers whether a request for erase suspension occurred. Covers the High Endurance feature. Covers the scramble feature. Covers the ECC feature. All valid combinations of the above will also be crossed.

error_cg

Covers following error scenarios given in Flash error code register:

  • oob_err: The supplied address ADDR is invalid and out of bounds.
  • mp_err: Flash access has encountered an access permission error.
  • rd_err: Flash read has an uncorrectable data error.
  • prog_win_err: Flash program has a window resolution error.
  • prog_type_err: Flash program selected unavailable type.
  • flash_phy_err: The flash access encountered a native flash error.

Covers the following error scenarios given in the Flash fault status register:

  • oob_err: The flash hardware interface supplied an out of bound value.
  • mp_err: The flash hardware interface encountered a memory permission error.
  • rd_err: The flash hardware interface encountered a read data error.
  • prog_win_err: The flash hardware interface encountered a program resolution error.
  • prog_type_err: The flash hardware interface encountered a program type error.
  • flash_phy_err: The flash hardware interface encountered a native flash error.
  • lcmgr_err: The life cycle management interface has encountered a fatal error.
fifo_evict_cg

Covers that all possible combinations for following sequences of operations READ/PROGRAM/READ and READ/ERASE/READ are executed. Software Interface can perform all three operations READ/PROGRAM/ERASE while Host Interface can perform direct READ.

flash_words_len_cg

Covers the number of flash words for READ/PROGRAM operations. The minimum tested number of words is 0 and the maximum length is 2^12-1. Covers that an acceptable distribution of lengths has been seen, including corner cases (length 0 and length 2^12-1).

msgfifo_level_cg

Covers that all possible fifo statuses generate interrupts for operations READ/PROGRAM. Covers both boundary values 0 and 31. Also covers acceptable distributions within ranges.

region_range_cg

Covers all possible values of region base and region size, both of which are given in pages. Covers that an acceptable distribution of values has been seen, including corner cases (value 0 and the maximum value for which it makes sense to test).

tl_errors_cg

Cover the following error cases on TL-UL bus:

  • TL-UL protocol error cases.
  • OpenTitan defined error cases, refer to testpoint tl_d_illegal_access.
tl_intg_err_cg

Cover all kinds of integrity errors (command, data or both) and cover number of error bits on each integrity check.

Cover the kinds of integrity errors with byte-enabled writes on memory, if applicable: some memories store the integrity values. When there is a subword write, the design re-calculates the integrity with the full word data and updates the integrity in the memory. This coverage ensures that a memory byte write has been issued and that the related design logic has been verified.