OTBN, the OpenTitan Big Number accelerator, is a cryptographic accelerator. For detailed information on OTBN design features, see the [OTBN HWIP technical specification]({{< relref ".." >}}).
The OTBN testbench is based on the [CIP testbench architecture]({{< relref "hw/dv/sv/cip_lib/doc" >}}). It builds on the [dv_utils]({{< relref "hw/dv/sv/dv_utils/README.md" >}}) and [csr_utils]({{< relref "hw/dv/sv/csr_utils/README.md" >}}) packages.
OTBN testing makes use of a DPI-based model called `otbn_core_model`. This is shown in the block diagram. The dotted interfaces in the `otbn` block are bound in by the model to access internal signals (register file and memory contents).
The top-level testbench is located at `hw/ip/otbn/dv/uvm/tb.sv`. This instantiates the OTBN DUT module `hw/ip/otbn/rtl/otbn.sv`.
OTBN has the following interfaces:
idle_o
The idle and interrupt signals are modelled with the basic [`pins_if`]({{< relref "hw/dv/sv/common_ifs#pins_if" >}}) interface.
As well as instantiating OTBN, the testbench also instantiates an `otbn_core_model`. This module wraps an ISS (instruction set simulator) subprocess and performs checks to make sure that OTBN behaves the same as the ISS. The model communicates with the testbench through an `otbn_model_if` interface, which is monitored by the `otbn_model_agent`, described below.
The model agent is instantiated by the testbench to monitor the OTBN model. It is a passive agent (essentially just a monitor): the inputs to the model are set in `tb.sv`. The monitor for the agent generates transactions when it sees a start signal or a done signal.
The start signal is important because we “cheat” and pull it out of the DUT. To make sure that the processor is starting when we expect, we check start transactions against TL writes in the scoreboard.
The main reference model for OTBN is the instruction set simulator (ISS), which is run as a subprocess by DPI code inside `otbn_core_model`. This Python-based simulator can be found at `hw/ip/otbn/dv/otbnsim`.
When testing OTBN, we are careful to distinguish between
Testing lots of different instruction streams doesn't really use the UVM machinery, so we have a “pre-DV” phase of testing that generates constrained-random instruction streams (as ELF binaries) and runs a simple block-level simulation on each to check that the RTL matches the model. The idea is that this is much quicker for designers to use to smoke-test proposed changes, and can be run with Verilator, so it doesn't require an EDA tool licence. This pre-DV phase cannot drive sign-off, but it does use much of the same tooling.
Once we are running full DV tests, we re-use this work, by using the same collection of randomised instruction streams and randomly picking from them for most of the sequences. At the moment, the full DV tests create binaries on the fly by running `hw/ip/otbn/dv/uvm/gen-binaries.py`. This results in one or more ELF files in a directory, which the simulation then picks from at random.
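As a rough illustration of that last step, the following Python sketch picks one binary at random from a directory. The function name `pick_binary`, the `*.elf` file pattern and the seeded `random.Random` are assumptions made for this example; they are not taken from `gen-binaries.py` or from the testbench.

```python
import random
from pathlib import Path


def pick_binary(bin_dir: Path, rng: random.Random) -> Path:
    """Return one ELF file chosen uniformly at random from bin_dir.

    Hypothetical helper: the "*.elf" naming convention is an assumption.
    """
    # Sort first so the choice depends only on the seed, not on the
    # order in which the filesystem happens to list the directory.
    candidates = sorted(bin_dir.glob("*.elf"))
    if not candidates:
        raise FileNotFoundError(f"no ELF files found in {bin_dir}")
    return rng.choice(candidates)
```

Passing in a seeded generator keeps the selection reproducible, which is useful when a failing run needs to be repeated with the same binary.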
The pre-DV testing doesn't address external stimuli like resets or TileLink-based register accesses. These are driven by specialised test sequences, described below.
The test sequences can be found in `hw/ip/otbn/dv/uvm/env/seq_lib`. The basic test sequence (`otbn_base_vseq`) loads the instruction stream from a randomly chosen binary (see above), configures OTBN and then lets it run to completion.
More specialized sequences include things like multiple runs, register accesses during operation (which should fail) and memory corruption. We also check things like the correct operation of the interrupt registers.
We distinguish between architectural and micro-architectural functional coverage. The idea is that the points that go into architectural coverage are those that a DV engineer could derive by reading the block specification. The points that go into micro-architectural coverage are those that require knowledge of the block's micro-architecture. Some of these will come from DV engineers; others from the block's designers. These two views are complementary and will probably duplicate coverage points. For example, an architectural coverage point might be “the processor executed `ADDI` and the result overflowed”. This might overlap with something like “the `overflow` signal in the ALU was true when adding”.
The [call stack]({{< relref ".#call-stack" >}}) is exposed as a special register behind `x1`. It has a bounded depth of 8 elements. We expect to see the following events:
x1
All four of these events should be crossed with the three states of the call stack: empty, partially full, and full.
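The cross described above can be sketched in Python as follows. This is only an illustration: the function names are invented for the example, and the event labels passed to `cross_bins` are placeholders rather than the names used in the coverage code.

```python
from itertools import product

CALL_STACK_DEPTH = 8  # bounded depth stated in the text

STATES = ("empty", "partially full", "full")


def call_stack_state(num_entries: int) -> str:
    """Classify the call stack by how many entries it currently holds."""
    if not 0 <= num_entries <= CALL_STACK_DEPTH:
        raise ValueError(f"invalid call stack occupancy: {num_entries}")
    if num_entries == 0:
        return "empty"
    if num_entries == CALL_STACK_DEPTH:
        return "full"
    return "partially full"


def cross_bins(events):
    """Return every (event, state) pair that the cross should observe."""
    return list(product(events, STATES))
```

With four events and three states, `cross_bins` yields twelve bins.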
The [loop stack]({{< relref ".#loop-stack" >}}) is accessed by executing `LOOP` and `LOOPI` instructions. Important events for it are tracked at those instructions, rather than separately.
Each flag in each flag group should be set to one from zero by some instruction. Similarly, each flag in each flag group should be cleared to zero from one by some instruction.
As a processor, many of OTBN's coverage points are described in terms of instructions being executed. Because OTBN doesn't have a complicated multi-stage pipeline or any real exception handling, we don't track much temporal information (such as sequences of instructions).
As well as instruction-specific coverage points detailed below, we include a requirement that each instruction is executed at least once.
For any instruction with one or more immediate fields, we require “toggle coverage” for those fields. That is, we expect to see execution with each bit of each immediate field being zero and one. We also expect to see each field with values `'0` and `'1` (all zeros and all ones). If the field is treated as a signed number, we also expect to see it with the extremal values for its range (just the MSB set, for the most negative value; all but the MSB set, for the most positive value).
The code to track this is split by encoding schema in `otbn_env_cov`. Each instruction listed below will specify its encoding schema. Each encoding schema then has its own covergroup. Rather than tracking toggle coverage as described above, we just track extremal values in a coverpoint. This also implies toggle coverage for both signed and unsigned fields. For unsigned fields of width `N`, the extremal values are `0` and `(1 << N) - 1`, represented by the bits `'0` and `'1` respectively. For signed fields of width `N+1`, the extremal values are `-(1 << N)` and `(1 << N) - 1`. These are represented by `{1'b1, {N{1'b0}}}` and `{1'b0, {N{1'b1}}}`: again, these toggle all the bits. For example, `beq` uses the `B` schema, which then maps to the `enc_b_cg` covergroup. This encoding schema's `OFF` field is tracked with the `off_cp` coverpoint. Finally, the relevant cross is called `off_cross`.
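The claim that the two extremal values toggle every bit can be checked with a short Python sketch. The helper names below are invented for this illustration and do not correspond to anything in `otbn_env_cov`.

```python
from typing import Tuple


def extremal_values(width: int, signed: bool) -> Tuple[int, int]:
    """Return the (minimum, maximum) value of a field that is `width` bits wide."""
    if width < 1:
        raise ValueError("width must be at least 1")
    if signed:
        n = width - 1  # a signed field of width N+1 has N magnitude bits
        return -(1 << n), (1 << n) - 1
    return 0, (1 << width) - 1


def encode(value: int, width: int) -> int:
    """Return the two's complement bit pattern of `value` in `width` bits."""
    return value & ((1 << width) - 1)


def toggles_all_bits(width: int, signed: bool) -> bool:
    """Check that the two extremal encodings differ in every bit position."""
    lo, hi = (encode(v, width) for v in extremal_values(width, signed))
    return (lo ^ hi) == (1 << width) - 1
```

For a signed 12-bit field, `extremal_values(12, True)` gives `(-2048, 2047)`, which encode to `0x800` and `0x7ff`. Each bit is one in exactly one of the two patterns, so seeing both values implies toggle coverage.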
For any instruction that reads from or writes to a GPR, we expect to see that operand equal to `x0`, `x1` and an arbitrary register in the range `x2 .. x31`. We don't have any particular coverage requirements for WDRs (since all of them work essentially the same).
As for immediates, the code to track this is split by encoding schema in `otbn_env_cov`. Each register field gets a coverpoint with the same name, defined with the `DEF_GPR_CP` helper macro. If the encoding schema has more than one instruction, the coverpoint is then crossed with the mnemonic, using the `DEF_MNEM_CROSS` helper macro. For example, `add` is in the `enc_bnr_cg` covergroup. This encoding schema's `GRD` field is tracked with the `grd_cp` coverpoint. Finally, the relevant cross is called `grd_cross`.
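The three GPR bins can be modelled with a small Python sketch. The names `gpr_bin` and `GprCoverage` are invented for this example; the real coverpoints are generated by the macros named above.

```python
def gpr_bin(index: int) -> str:
    """Map a GPR index to one of the three bins described in the text."""
    if not 0 <= index <= 31:
        raise ValueError(f"invalid GPR index: {index}")
    if index == 0:
        return "x0"
    if index == 1:
        return "x1"
    return "x2..x31"


class GprCoverage:
    """Record which of the three bins have been seen for one register field."""

    BINS = frozenset({"x0", "x1", "x2..x31"})

    def __init__(self) -> None:
        self.seen = set()

    def sample(self, index: int) -> None:
        self.seen.add(gpr_bin(index))

    def is_complete(self) -> bool:
        return self.seen == self.BINS
```

Any single register in the range `x2 .. x31` is enough to fill the third bin, matching the “arbitrary register” wording.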
For any source GPR or WDR, we require “toggle coverage” for its value. For example, `ADD` reads from its `grs1` operand. We want to see each of the 32 bits of that operand set and unset (giving 64 coverage points). Similarly, `BN.ADD` reads from its `wrs1` operand. We want to see each of the 256 bits of that operand set and unset (giving 512 coverage points).
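A minimal Python model of toggle coverage, assuming nothing about how the covergroups are actually implemented, shows where the counts of 64 and 512 come from. The class name `ToggleCoverage` is hypothetical.

```python
class ToggleCoverage:
    """Track, for each bit of a value, whether it has been seen as 0 and as 1."""

    def __init__(self, width: int) -> None:
        self.width = width
        self.mask = (1 << width) - 1
        self.seen_one = 0   # bit i is set once bit i has been observed as 1
        self.seen_zero = 0  # bit i is set once bit i has been observed as 0

    def sample(self, value: int) -> None:
        value &= self.mask
        self.seen_one |= value
        self.seen_zero |= ~value & self.mask

    @property
    def num_points(self) -> int:
        # Two coverage points (set and unset) per bit.
        return 2 * self.width

    def hits(self) -> int:
        return bin(self.seen_one).count("1") + bin(self.seen_zero).count("1")

    def is_complete(self) -> bool:
        return self.seen_one == self.mask and self.seen_zero == self.mask
```

A 32-bit GPR operand gives `ToggleCoverage(32).num_points == 64`, and a 256-bit WDR operand gives 512.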
Again, the code to track this is split by encoding schema in `otbn_env_cov`. The trace interface takes a copy of GPR and WDR read data. The relevant register read data are then passed to the encoding schema's covergroup in the `on_insn` method. To avoid extremely repetitive code, the actual coverpoints and crosses are defined with the help of macros. The coverpoints are named with the base-2 expansion of the bit in question. For example, the cross in the `enc_bnr_cg` that tracks whether we've seen both values of bit 12 for the `grs1` operand is called `grs1_01100_cross` (since 12 is `5'b01100`).
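The naming rule can be reproduced in a few lines of Python. The function `bit_cross_name` is invented for this illustration, and the assumption that the index is padded to the minimum number of bits needed for the operand width is inferred from the single 32-bit example in the text.

```python
def bit_cross_name(operand: str, bit: int, width: int) -> str:
    """Build the cross name for one bit of an operand that is `width` bits wide.

    Assumption: the bit index is zero-padded to the number of binary digits
    needed to index any bit of the operand (5 digits for a 32-bit GPR).
    """
    if not 0 <= bit < width:
        raise ValueError(f"bit {bit} is out of range for width {width}")
    index_digits = max(1, (width - 1).bit_length())
    return f"{operand}_{bit:0{index_digits}b}_cross"
```

For example, `bit_cross_name("grs1", 12, 32)` gives `grs1_01100_cross`, matching the name quoted above.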
If an instruction can generate flag changes, we expect to see each flag that the instruction can change being both set and cleared by the instruction. This needn't be crossed with the two flag groups (that's tracked separately in the “Flags” block above). For example, `BN.ADD` can write to each of the flags `C`, `M`, `L` and `Z`. This paragraph implies eight coverage points (four flags times two values) for that instruction.
Again, the code to track this is split by encoding schema in `otbn_env_cov`. The trace interface takes a copy of flag write data. It doesn't bother storing the flag write flags, since these are implied by the instruction anyway. There is a coverpoint tracking both values for each of the flags that can be written. This is then crossed with the instruction mnemonic. For example, the coverpoint for the C flag (bit zero) in the `bnaf` encoding used by `BN.ADD` is called `flags_00_cp`. Some instructions only write the `M`, `L` and `Z` flags. These are found in the `bna`, `bnan`, `bnaqs` and `bnaqw` encoding groups. For these instructions, we only track bits `1`, `2` and `3` of the flags structure.
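The following Python sketch lists the flag coverpoints for a given set of writable flags. It rests on two assumptions: that `M`, `L` and `Z` occupy bits 1, 2 and 3 in that order (the text only pins `C` to bit zero), and that the other coverpoints follow the same naming pattern as `flags_00_cp`. The helper names are invented for the example.

```python
# Assumed bit positions within the flags structure.
FLAG_BITS = {"C": 0, "M": 1, "L": 2, "Z": 3}


def flag_coverpoints(written_flags):
    """Return coverpoint names for the flags an instruction can write."""
    ordered = sorted(set(written_flags), key=FLAG_BITS.get)
    return [f"flags_{FLAG_BITS[flag]:02b}_cp" for flag in ordered]


def num_flag_points(written_flags) -> int:
    """Each writable flag contributes two points: set and cleared."""
    return 2 * len(set(written_flags))
```

For `BN.ADD`, which can write all four flags, `num_flag_points("CMLZ")` is 8, in line with the count given earlier. An instruction that only writes `M`, `L` and `Z` gets three coverpoints and six points.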
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_addsub_cg` (also used for `SUB`).
`sign_a_sign_b_cross`.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_addi_cg`.
`sign_cross`.
This instruction uses the `U` encoding schema, with covergroup `enc_u_cg`. There are no further coverage points.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_addsub_cg`.
`sign_a_sign_b_cross`.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_sll_cg`.
`nz_by_z_cp`.
`0x1f` which leaves the top bit set. Tracked as `shift15_cp`.
This instruction uses the `Is` encoding schema, with covergroup `enc_is_cg`. The instruction-specific covergroup is `insn_slli_cg`.
`nz_by_z_cp`.
`0x1f` which leaves the top bit set. Tracked as `shift15_cp`.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_srl_cg`.
`nz_by_z_cp`.
`0x1f` which leaves the bottom bit set. Tracked as `shift15_cp`. Note that this point also checks that we're performing a logical, rather than arithmetic, right shift.
This instruction uses the `Is` encoding schema, with covergroup `enc_is_cg`. The instruction-specific covergroup is `insn_srli_cg`.
`nz_by_z_cp`.
`0x1f` which leaves the bottom bit set. Tracked as `shift15_cp`. Note that this point also checks that we're performing a logical, rather than arithmetic, right shift.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_sra_cg`.
`nz_by_z_cp`.
`0x1f` which leaves the bottom bit set. Tracked as `shift15_cp`. Note that this point also checks that we're performing an arithmetic, rather than logical, right shift.
This instruction uses the `Is` encoding schema, with covergroup `enc_is_cg`. The instruction-specific covergroup is `insn_srai_cg`.
`nz_by_z_cp`.
`0x1f` which leaves the bottom bit set. Tracked as `shift15_cp`. Note that this point also checks that we're performing an arithmetic, rather than logical, right shift.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_log_binop_cg` (shared with other logical binary operations).
`x0` (to ensure we're not just AND'ing things with zero). Tracked as `write_data_XXXXX_cross`, where `XXXXX` is the base-2 representation of the bit being checked.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_log_binop_cg` (shared with other logical binary operations).
`x0` (to ensure we're not just AND'ing things with zero). Tracked as `write_data_XXXXX_cross`, where `XXXXX` is the base-2 representation of the bit being checked.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_log_binop_cg` (shared with other logical binary operations).
`x0` (to ensure we're not just OR'ing things with `'1`). Tracked as `write_data_XXXXX_cross`, where `XXXXX` is the base-2 representation of the bit being checked.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_log_binop_cg` (shared with other logical binary operations).
`x0` (to ensure we're not just OR'ing things with `'1`). Tracked as `write_data_XXXXX_cross`, where `XXXXX` is the base-2 representation of the bit being checked.
This instruction uses the `R` encoding schema, with covergroup `enc_r_cg`. The instruction-specific covergroup is `insn_log_binop_cg` (shared with other logical binary operations).
`x0` (to ensure we're not just XOR'ing things with zero). Tracked as `write_data_XXXXX_cross`, where `XXXXX` is the base-2 representation of the bit being checked.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_log_binop_cg` (shared with other logical binary operations).
`x0` (to ensure we're not just XOR'ing things with zero). Tracked as `write_data_XXXXX_cross`, where `XXXXX` is the base-2 representation of the bit being checked.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_xw_cg` (shared with `SW`).
`<grs1>` is above the top of memory and a negative `<offset>` brings the load address in range. Tracked as `oob_base_neg_off_cp`.
`<grs1>` is negative and a positive `<offset>` brings the load address in range. Tracked as `neg_base_pos_off_cp`.
`addr0_cp`.
`top_addr_cp`.
`oob_addr_cp`.
`barely_oob_addr_cp`.
`grs1` and `offset`. Tracked as `align_cross`.
This instruction uses the `S` encoding schema, with covergroup `enc_s_cg`. The instruction-specific covergroup is `insn_xw_cg` (shared with `LW`).
`<grs1>` is above the top of memory and a negative `<offset>` brings the store address in range. Tracked as `oob_base_neg_off_cp`.
`<grs1>` is negative and a positive `<offset>` brings the store address in range. Tracked as `neg_base_pos_off_cp`.
`addr0_cp`.
`top_addr_cp`.
`oob_addr_cp`.
`barely_oob_addr_cp`.
`grs1` and `offset`. Tracked as `align_cross`.
This instruction uses the `B` encoding schema, with covergroup `enc_b_cg`. The instruction-specific covergroup is `insn_bxx_cg` (shared with `BNE`).
All points should be crossed with branch taken / branch not taken.
`eq_dir_cross`.
`eq_offset_align_cross`.
`eq_oob_cross`.
`eq_neg_cross`.
The “branch to current address” item is problematic if we want to take the branch. Probably we need some tests with short timeouts to handle this properly.
This instruction uses the `B` encoding schema, with covergroup `enc_b_cg`. The instruction-specific covergroup is `insn_bxx_cg` (shared with `BEQ`).
All points should be crossed with branch taken / branch not taken.
`eq_dir_cross`.
`eq_offset_align_cross`.
`eq_oob_cross`.
`eq_neg_cross`.
The “branch to current address” item is problematic if we want to take the branch. Probably we need some tests with short timeouts to handle this properly.
This instruction uses the `J` encoding schema, with covergroup `enc_j_cg`. The instruction-specific covergroup is `insn_jal_cg`.
`dir_cp`.
`offset_align_cp`.
`oob_cp`.
`neg_cp`.
`from_top_cp`.
Note that the “jump to current address” item won't be a problem to test since it will quickly overflow the call stack.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_jalr_cg`.
`off_dir_cp`.
`<offset>` aligns. Pairs of base address / offset alignments tracked in `align_cross`.
`<offset>`. Tracked as `pos_wrap_cp`.
`<offset>` to give a valid target. Tracked as `sub_cp`.
`neg_wrap_cp`.
`oob_cp`.
`self_cp`.
`from_top_cp`.
Note that the “jump to current address” item won't be a problem to test since it will quickly over- or underflow the call stack, provided `<grd>` and `<grs1>` aren't both `x1`.
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_csrrs_cg`.
`bits_to_set` to each valid CSR.
These points are tracked with `csr_cross` in `insn_csrrs_cg`. It crosses `csr_cp` (which tracks each valid CSR, plus an invalid CSR) with `bits_to_set_cp` (which tracks whether `bits_to_set` is nonzero).
This instruction uses the `I` encoding schema, with covergroup `enc_i_cg`. The instruction-specific covergroup is `insn_csrrw_cg`.
`<grd>` other than `x0`.
`<grd>` equal to `x0`.
These points are tracked with `csr_cross` in `insn_csrrw_cg`. It crosses `csr_cp` (which tracks each valid CSR, plus an invalid CSR) with `grd_cp_to_set_cp` (which tracks whether `grd` is equal to `x0`).
This instruction uses the `I` encoding schema, but with every field set to a fixed value. Encoding-level coverpoints are tracked in covergroup `enc_ecall_cg`.
No special coverage points for this instruction.
This instruction uses the `loop` encoding schema, with covergroup `enc_loop_cg`.
`'1` (the maximal value)
This instruction uses the `loopi` encoding schema, with covergroup `enc_loopi_cg`.
This instruction uses the `bnaf` encoding schema, with covergroup `enc_bna_cg`.
`wrs2` whose top bit is set
This instruction uses the `bnaf` encoding schema, with covergroup `enc_bna_cg`.
`wrs2` whose top bit is set
This instruction uses the `bnai` encoding schema, with covergroup `enc_bnai_cg`.
No special coverage.
This instruction uses the `bnam` encoding schema, with covergroup `enc_bnam_cg`.
`MOD` (zero and all ones)
`MOD`) when `MOD` is nonzero.
`MOD`) when `MOD` is nonzero.
`MOD`.
`MOD` `2^256-1`, crossed with whether the subtraction of `MOD` results in a value that will wrap.
This instruction uses the `bnaq` encoding schema, with covergroup `enc_bnaq_cg`.
`wrs1_qwsel` with `wrs2_qwsel` to make sure they are applied to the right inputs
This instruction uses the `bnaq` encoding schema, with an extra field not present in `bn.mulqacc`. Encoding-level coverpoints are tracked in covergroup `enc_bnaqw_cg`.
`wrs1_qwsel` with `wrs2_qwsel` to make sure they are applied to the right inputs
This instruction uses the `bnaq` encoding schema, with an extra field not present in `bn.mulqacc`. Encoding-level coverpoints are tracked in covergroup `enc_bnaqs_cg`.
`wrs1_qwsel` with `wrs2_qwsel` to make sure they are applied to the right inputs
`wrd_hwsel`, since the flag changes are different in the two modes.
This instruction uses the `bnaf` encoding schema, with covergroup `enc_bna_cg`.
`wrs2` whose top bit is set
This instruction uses the `bnaf` encoding schema, with covergroup `enc_bna_cg`.
`wrs2` whose top bit is set
This instruction uses the `bnai` encoding schema, with covergroup `enc_bnai_cg`.
No special coverage.
This instruction uses the `bnam` encoding schema, with covergroup `enc_bnam_cg`.
`MOD` (zero and all ones)
`MOD` (so `MOD` is not added).
`MOD` (so `MOD` is added).
`MOD` (so `MOD` is added, but the top bit is still set)
`-MOD`.
This instruction uses the `bna` encoding schema, with covergroup `enc_bna_cg`.
This instruction uses the `bna` encoding schema, with covergroup `enc_bna_cg`.
This instruction uses the `bnan` encoding schema, with covergroup `enc_bnan_cg`.
This instruction uses the `bna` encoding schema, with covergroup `enc_bna_cg`.
This instruction uses the `bnr` encoding schema, with covergroup `enc_bnr_cg`.
No special coverage.
This instruction uses the `bns` encoding schema, with covergroup `enc_bns_cg`.
This instruction uses the `bnc` encoding schema, with covergroup `enc_bnc_cg`.
`wrs2` whose top bit is set
This instruction uses the `bnc` encoding schema, with covergroup `enc_bnc_cg`.
`wrs2` whose top bit is set
This instruction uses the `bnxid` encoding schema, with covergroup `enc_bnxid_cg`.
`grs1` is above the top of memory and a negative `offset` brings the load address in range.
`grs1` is negative and a positive `offset` brings the load address in range.
`grd` greater than 31, giving an illegal instruction error
`grd` with `grd_inc`
`grs1` with `grd_inc`
This instruction uses the `bnxid` encoding schema, with covergroup `enc_bnxid_cg`.
`grs1` is above the top of memory and a negative `offset` brings the load address in range.
`grs1` is negative and a positive `offset` brings the load address in range.
`grd` greater than 31, giving an illegal instruction error
`grs2` with `grs2_inc`
`grs1` with `grd_inc`
This instruction uses the `bnmov` encoding schema, with covergroup `enc_bnmov_cg`.
No special coverage otherwise.
This instruction uses the `bnmovr` encoding schema, with covergroup `enc_bnmovr_cg`.
`grd` is greater than 31 with whether the register value at `grs` is greater than 31
This instruction uses the `bnwcsr` encoding schema, with covergroup `enc_wcsr_cg`.
This instruction uses the `bnwcsr` encoding schema, with covergroup `enc_wcsr_cg`.
Much of the checking for these tests is actually performed in `otbn_core_model`, which ensures that the RTL and ISS have the same behaviour. However, the scoreboard does have some checks, to ensure that interrupt and idle signals are high at the expected times.
Core TLUL protocol assertions are checked by binding the [TL-UL protocol checker]({{< relref "hw/ip/tlul/doc/TlulProtocolChecker.md" >}}) into the design.
Outputs are also checked for `'X` values by assertions in the design RTL. The design RTL contains other assertions defined by the designers, which will be checked in simulation (and won't have been checked by the pre-DV Verilator simulations).
Finally, the `otbn_idle_checker` checks that the `idle_o` output correctly matches the running state that you'd expect, based on writes to the `CMD` register and responses that will appear in the `DONE` interrupt.
Tests can be run with [`dvsim.py`]({{< relref "hw/dv/tools/README.md" >}}). The link gives details of the tool's features and command line arguments. To run a basic smoke test, go to the top of the repository and run:
$ util/dvsim/dvsim.py hw/ip/otbn/dv/uvm/otbn_sim_cfg.hjson -i otbn_smoke
{{< incGenFromIpDesc "../../data/otbn_testplan.hjson" "testplan" >}}