Update lowrisc_ibex to lowRISC/ibex@6b9165f
Update code from upstream repository
https://github.com/lowRISC/ibex.git to revision
6b9165fa66b49534226acfcb739f2c252be4853c
* [doc] Update READMEs with best CoreMark results (Greg Chadwick)
* [sw] Enable choice of -march= string for CoreMark (Greg Chadwick)
* Value passed to UVM set_timeout is calculated as 1000000000 based
on a 1ns/1ps timescale. A precompiled UVM library, however, may have
been compiled with a different timescale, depending on the compile
options used or the tool's default timescale (UVM does not set a
timescale in its code). In our case the precompiled UVM timescale
is 1ps/1ps, so UVM receives 1000000000 in set_timeout but
interprets it as ps. As a result the timeout is 1000 times smaller
than expected, which is why we are getting timeouts. There is no
perfect solution. One option is to recompile UVM with
-timescale 1ns/1ps (or whatever you use for your design). (Dawid
Zimonczyk)
* update readme for Riviera-PRO (Dawid Zimonczyk)
* correct wrong assignment to enum (Dawid Zimonczyk)
* Lint: Fix some line length warnings (Philipp Wagner)
* ibex_counter: Use always_ff (Philipp Wagner)
* Enforce lint of simple system in CI (Philipp Wagner)
* Specify data type for all parameters in simple_system (Philipp
Wagner)
* Clarifications to the README of the simple system (Philipp Wagner)
* Only include necessary LFSR primitive (Philipp Wagner)
* Update lowrisc_ip to lowRISC/opentitan@ebf4663b (Philipp Wagner)
* [dv/ibex] Add two new interrupt/debug tests (Udi)
* [doc] Fix spelling of CoreMark (Pirmin Vogel)
* Handle --help properly in simple_system top-level (Rupert Swarbrick)
* Update lowrisc_ip to lowRISC/opentitan@9ac4f9c8 (Rupert Swarbrick)
* Use the Xilinx primitives for the Arty board (Philipp Wagner)
* [doc] Clarify that the supported version of the B extension is a
draft (Pirmin Vogel)
* Clean up Verilator sections in core files (Philipp Wagner)
* Fix and waive Verilator lint errors in tb_cs_registers (Philipp
Wagner)
* Remove lowrisc:prim:clock_gating from shared core collections
(Philipp Wagner)
* Add lint for ibex_simple_system to CI (Philipp Wagner)
* ibex_simple_system: Add lint target (Philipp Wagner)
* Simplify lint targets (Philipp Wagner)
* Remove unrelated files from lint in ibex_core_tracing (Philipp
Wagner)
* Add dependency on prim_clock_gating (Philipp Wagner)
* Fix Ibex description in core file (Philipp Wagner)
* icache: Depend on prim_assert (Philipp Wagner)
* Fix SRAM initialisation for fpga/artya example (Rupert Swarbrick)
* Drop SRAM_INIT_FILE from ibex_riscv_compliance.core (Rupert
Swarbrick)
* Get simple_system working for VCS (Rupert Swarbrick)
* Pass MemInitFile parameter from our ram_*p wrappers (Rupert
Swarbrick)
* Update lowrisc_ip to lowRISC/opentitan@976d9b9c (Philipp Wagner)
* Icache: It's not a draft any more (Philipp Wagner)
* Remove outdated documentation (Philipp Wagner)
* CI: Show exact command to run Verilator lint (Philipp Wagner)
* CI: Enable Verible lint for all configs (Philipp Wagner)
* CI: use the new binary name of Verible (Philipp Wagner)
* Add a waiver file for Verible lint (Philipp Wagner)
* Fix Verible lint issues (Philipp Wagner)
* Add some formal cover properties for ICache (Rupert Swarbrick)
* Add an ICacheECC parameter to ICache formal flow (Rupert Swarbrick)
* Formal protocol checking for icache <-> core interface (Rupert
Swarbrick)
* A simple formal flow for the ICache based on SymbiYosys (Rupert
Swarbrick)
* Move riscv-formal code into formal/riscv-formal (Rupert Swarbrick)
* [doc] Add bitmanip spec to introduction page (Philipp Wagner)
* [CI] Update Verible version (Philipp Wagner)
* Update lowrisc_ip to lowRISC/opentitan@5cae0cf1 (Rupert Swarbrick)
* [bitmanip] Optimizations and Parametrization (ganoam)
* [rtl] Fix icache xprop issue (Tom Roberts)
* Prevent writing CSR_SECURESEED to get the seed of dummy instruction
(Xiang Wang)
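
The UVM set_timeout entry above comes down to a unit mismatch, which a
quick calculation can confirm. The sketch below is only an illustration:
the helper name and the unit table are hypothetical and not part of Ibex
or UVM. Only the raw value 1000000000 and the two timescales are taken
from the entry itself.

```python
# Time units expressed in picoseconds so the arithmetic stays exact.
# (Hypothetical table for illustration; not taken from any tool.)
UNIT_IN_PS = {"1ns": 1000, "1ps": 1}


def timeout_in_ps(raw_value: int, time_unit: str) -> int:
    """Interpret a unitless timeout value under the given time unit."""
    return raw_value * UNIT_IN_PS[time_unit]


RAW_TIMEOUT = 1_000_000_000  # value passed to set_timeout, per the entry above

intended = timeout_in_ps(RAW_TIMEOUT, "1ns")  # testbench assumes 1ns/1ps
actual = timeout_in_ps(RAW_TIMEOUT, "1ps")    # precompiled UVM uses 1ps/1ps

print(f"intended timeout: {intended} ps")
print(f"actual timeout:   {actual} ps")
print(f"shortfall factor: {intended // actual}")
```

Under these assumptions the intended timeout is one second of simulated
time while the precompiled library applies one millisecond, which matches
the factor of 1000 described in the entry.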
Signed-off-by: Michael Schaffner <msf@opentitan.org>
diff --git a/hw/vendor/lowrisc_ibex.lock.hjson b/hw/vendor/lowrisc_ibex.lock.hjson
index 5b6afb8..443ebf6 100644
--- a/hw/vendor/lowrisc_ibex.lock.hjson
+++ b/hw/vendor/lowrisc_ibex.lock.hjson
@@ -9,6 +9,6 @@
upstream:
{
url: https://github.com/lowRISC/ibex.git
- rev: ae547c8d3010d86ef7afdee900d461616c3cb414
+ rev: 6b9165fa66b49534226acfcb739f2c252be4853c
}
}
diff --git a/hw/vendor/lowrisc_ibex/README.md b/hw/vendor/lowrisc_ibex/README.md
index e840e7b..7cdbbdb 100644
--- a/hw/vendor/lowrisc_ibex/README.md
+++ b/hw/vendor/lowrisc_ibex/README.md
@@ -20,18 +20,20 @@
The table below indicates performance, area and verification status for a few selected configurations.
These are configurations on which lowRISC is focusing for performance evaluation and design verification (see [supported configs](ibex_configs.yaml)).
-| Config | "small" | "maxperf" | "maxperf-pmp-bm" |
+| Config | "small" | "maxperf" | "maxperf-pmp-bmfull" |
| ------ | ------- | --------- | ---------------- |
| Features | RV32IMC, 3 cycle mult | RV32IMC, 1 cycle mult, Branch target ALU, Writeback stage | RV32IMCB, 1 cycle mult, Branch target ALU, Writeback stage, 16 PMP regions |
-| Performance (Coremark/MHz) | 2.44 | 3.09 | 3.09 |
+| Performance (CoreMark/MHz) | 2.47 | 3.13 | 3.05 |
| Area - Yosys (kGE) | 33.15 | 39.03 | 63.32 |
| Area - Commercial (estimated kGE) | ~27 | ~31 | ~50 |
| Verification status | Green | Amber | Amber |
Notes:
-* Performance numbers are based on Cormark running on the Ibex Simple System [platform](examples/simple_system/README.md).
- Note that Coremark was compiled without support for the B extension.
+* Performance numbers are based on CoreMark running on the Ibex Simple System [platform](examples/simple_system/README.md).
+ Note that different ISAs (use of B and C extensions) give the best results for different configurations.
+ See the [Benchmarks README](examples/sw/benchmarks/README.md) for more information.
+ The "maxperf-pmp-bmfull" configuration sets a `SpecBranch` parameter in `ibex_core.sv`; this helps timing but has a small negative performance impact.
* Yosys synthesis area numbers are based on the Ibex basic synthesis [flow](syn/README.md).
* Commercial synthesis area numbers are a rough estimate of what might be achievable with a commercial synthesis flow and technology library.
* Verification status is a rough guide to the overall maturity of a particular configuration.
diff --git a/hw/vendor/lowrisc_ibex/azure-pipelines.yml b/hw/vendor/lowrisc_ibex/azure-pipelines.yml
index db7212a..3968949 100644
--- a/hw/vendor/lowrisc_ibex/azure-pipelines.yml
+++ b/hw/vendor/lowrisc_ibex/azure-pipelines.yml
@@ -9,7 +9,7 @@
VERILATOR_VERSION: 4.032
RISCV_TOOLCHAIN_TAR_VERSION: 20200323-1
RISCV_COMPLIANCE_GIT_VERSION: 844c6660ef3f0d9b96957991109dfd80cc4938e2
- VERIBLE_VERSION: v0.0-266-g9e55307
+ VERIBLE_VERSION: v0.0-459-g4d6929e
trigger:
batch: true
@@ -97,23 +97,12 @@
fusesoc --version
verilator --version
riscv32-unknown-elf-gcc --version
- verilog_lint --version
+ verible-verilog-lint --version
displayName: Display environment
- # Verible lint/format is experimental so only run on default config for now,
+ # Verible format is experimental so only run on default config for now,
# will eventually become part of the per-config CI
- bash: |
- fusesoc --cores-root . run --no-export --target=lint --tool=veriblelint lowrisc:ibex:ibex_core_tracing
- if [ $? != 0 ]; then
- echo -n "##vso[task.logissue type=error]"
- echo "Verilog lint with Verible failed. Run 'fusesoc --cores-root . run --target=lint --tool=veriblelint lowrisc:ibex:ibex_core_tracing' to check and fix all errors."
- echo "This flow is currently experimental and failures can be ignored."
- exit 1
- fi
- continueOnError: true
- displayName: Lint Verilog source files with Verible (experimental)
-
- - bash: |
fusesoc --cores-root . run --no-export --target=format --tool=veribleformat lowrisc:ibex:ibex_core_tracing
if [ $? != 0 ]; then
echo -n "##vso[task.logissue type=error]"
@@ -159,4 +148,23 @@
ibex_configs:
- small
- experimental-maxperf-pmp
- - experimental-maxperf-pmp-bm
+ - experimental-maxperf-pmp-bmfull
+
+ # Run lint on simple system
+ - bash: |
+ fusesoc --cores-root . run --target=lint --tool=verilator lowrisc:ibex:ibex_simple_system
+ if [ $? != 0 ]; then
+ echo -n "##vso[task.logissue type=error]"
+ echo "Verilog lint with Verilator failed. Run 'fusesoc --cores-root . run --target=lint --tool=verilator lowrisc:ibex:ibex_simple_system' to check and fix all errors."
+ exit 1
+ fi
+ displayName: Run Verilator lint on simple system
+
+ - bash: |
+ fusesoc --cores-root . run --target=lint --tool=veriblelint lowrisc:ibex:ibex_simple_system
+ if [ $? != 0 ]; then
+ echo -n "##vso[task.logissue type=error]"
+ echo "Verilog lint with Verible failed. Run 'fusesoc --cores-root . run --target=lint --tool=veriblelint lowrisc:ibex:ibex_simple_system' to check and fix all errors."
+ exit 1
+ fi
+ displayName: Run Verible lint on simple system
diff --git a/hw/vendor/lowrisc_ibex/ci/ibex-rtl-ci-steps.yml b/hw/vendor/lowrisc_ibex/ci/ibex-rtl-ci-steps.yml
index 71bb8d2..f4cbcd3 100644
--- a/hw/vendor/lowrisc_ibex/ci/ibex-rtl-ci-steps.yml
+++ b/hw/vendor/lowrisc_ibex/ci/ibex-rtl-ci-steps.yml
@@ -17,12 +17,21 @@
fusesoc --cores-root . run --target=lint lowrisc:ibex:ibex_core_tracing $IBEX_CONFIG_OPTS
if [ $? != 0 ]; then
echo -n "##vso[task.logissue type=error]"
- echo "Verilog lint failed. Run 'fusesoc --cores-root . run --target=lint lowrisc:ibex:ibex_core_tracing' to check and fix all errors."
+ echo "Verilog lint failed. Run 'fusesoc --cores-root . run --target=lint --tool=verilator lowrisc:ibex:ibex_core_tracing $IBEX_CONFIG_OPTS' to check and fix all errors."
exit 1
fi
displayName: Lint Verilog source files with Verilator for ${{ config }}
- bash: |
+ fusesoc --cores-root . run --target=lint --tool=veriblelint lowrisc:ibex:ibex_core_tracing $IBEX_CONFIG_OPTS
+ if [ $? != 0 ]; then
+ echo -n "##vso[task.logissue type=error]"
+ echo "Verilog lint failed. Run 'fusesoc --cores-root . run --target=lint --tool=veriblelint lowrisc:ibex:ibex_core_tracing $IBEX_CONFIG_OPTS' to check and fix all errors."
+ exit 1
+ fi
+ displayName: Lint Verilog source files with Verible Verilog Lint for ${{ config }}
+
+ - bash: |
# Build simulation model of Ibex
fusesoc --cores-root=. run --target=sim --setup --build lowrisc:ibex:ibex_riscv_compliance $IBEX_CONFIG_OPTS
if [ $? != 0 ]; then
diff --git a/hw/vendor/lowrisc_ibex/doc/getting_started.rst b/hw/vendor/lowrisc_ibex/doc/getting_started.rst
index afe9ac1..21c5939 100644
--- a/hw/vendor/lowrisc_ibex/doc/getting_started.rst
+++ b/hw/vendor/lowrisc_ibex/doc/getting_started.rst
@@ -12,17 +12,3 @@
Depending on the target technology, either the implementation in ``ibex_register_file_ff.sv`` or the one in ``ibex_register_file_latch.sv`` should be selected.
For more information about the two register file implementations and their trade-offs, check out :ref:`register-file`.
-Clock Gating Cell
------------------
-
-Ibex requires clock gating cells.
-This cells are usually specific to the selected target technology and thus not provided as part of the RTL design.
-It is assumed that the clock gating cell is wrapped in a module called ``prim_clock_gating`` that has the following ports:
-
-* ``clk_i``: Clock Input
-* ``en_i``: Clock Enable Input
-* ``test_en_i``: Test Enable Input (activates the clock even though ``en_i`` is not set)
-* ``clk_o``: Gated Clock Output
-
-Inside Ibex, clock gating cells are used both in ``ibex_core.sv`` and ``ibex_register_file_latch.sv``.
-For more information on the expected behavior of the clock gating cell when using the latch-based register file check out :ref:`register-file`.
diff --git a/hw/vendor/lowrisc_ibex/doc/instruction_decode_execute.rst b/hw/vendor/lowrisc_ibex/doc/instruction_decode_execute.rst
index 00e090f..92cf24b 100644
--- a/hw/vendor/lowrisc_ibex/doc/instruction_decode_execute.rst
+++ b/hw/vendor/lowrisc_ibex/doc/instruction_decode_execute.rst
@@ -64,10 +64,43 @@
* It computes memory addresses for loads and stores with a Reg + Imm calculation
* The LSU uses it to increment addresses when performing two accesses to handle an unaligned access
-Support for the RISC-V Bitmanipulation Extension (Document Version 0.92, November 8, 2019) is enabled via the parameter ``RV32B``.
-This feature is *EXPERIMENTAL* and the details of its impact are not yet documented here.
-Currently the Zbb, Zbs, Zbp, Zbe, Zbf, Zbc, Zbr and Zbt sub-extensions are implemented.
-The rotate instructions `ror` and `rol` (Zbb), ternary instructions `cmov`, `cmix`, `fsl` and `fsr` as well as cyclic redundancy checks `crc32[c]` (Zbr) are completed in 2 cycles. All remaining instructions complete in one cycle.
+Bit Manipulation Extension
+ Support for the `RISC-V Bit Manipulation Extension (draft version 0.92 from November 8, 2019) <https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf>`_ is optional. [#B_draft]_
+ It can be enabled via the enumerated parameter ``RV32B`` defined in :file:`rtl/ibex_pkg.sv`.
+
+ There are two versions of the bit manipulation extension available:
+ The balanced implementation comprises a set of sub-extensions aiming for good benefits at a reasonable area overhead.
+ The full implementation comprises all 32-bit instructions defined in the extension.
+ The following table lists the implemented instructions in each version.
+ Multi-cycle instructions are completed in 2 cycles.
+ All remaining instructions complete in a single cycle.
+
+ +---------------------------------+---------------+--------------------------+
+ | Z-Extension | Version | Multi-Cycle Instructions |
+ +=================================+===============+==========================+
+ | Zbb (Base) | Balanced/Full | rol, ror[i] |
+ +---------------------------------+---------------+--------------------------+
+ | Zbs (Single-bit) | Balanced/Full | None |
+ +---------------------------------+---------------+--------------------------+
+ | Zbp (Permutation) | Full | None |
+ +---------------------------------+---------------+--------------------------+
+ | Zbe (Bit extract/deposit) | Full | All |
+ +---------------------------------+---------------+--------------------------+
+ | Zbf (Bit-field place) | Balanced/Full | All |
+ +---------------------------------+---------------+--------------------------+
+ | Zbc (Carry-less multiply) | Full | None |
+ +---------------------------------+---------------+--------------------------+
+ | Zbr (CRC) | Full | All |
+ +---------------------------------+---------------+--------------------------+
+ | Zbt (Ternary) | Balanced/Full | All |
+ +---------------------------------+---------------+--------------------------+
+ | Zb_tmp (Temporary) [#B_zb_tmp]_ | Balanced/Full | None |
+ +---------------------------------+---------------+--------------------------+
+
+ The implementation of the B-extension comes with an area overhead of 1.8 to 3.0 kGE for the balanced version and 6.0 to 8.7 kGE for the full version.
+ That corresponds to an approximate percentage increase in area of 9 to 14 % and 25 to 30 % for the balanced and full versions respectively.
+ The ranges correspond to synthesis results generated using relaxed and maximum frequency targets respectively.
+ The designs have been synthesized using Synopsys Design Compiler targeting TSMC 65 nm technology.
.. _mult-div:
@@ -129,3 +162,14 @@
The Load-Store Unit (LSU) interfaces with main memory to perform load and store operations.
See :ref:`load-store-unit` for more details.
+
+.. rubric:: Footnotes
+
+.. [#B_draft] Ibex fully implements draft version 0.92 of the RISC-V Bit Manipulation Extension.
+ This extension may change before being ratified as a standard by the RISC-V Foundation.
+ Ibex will be updated to match future versions of the specification.
+ Prior to ratification this may involve backwards incompatible changes.
+ Additionally, neither GCC nor Clang has committed to maintaining support upstream for unratified versions of the specification.
+
+.. [#B_zb_tmp] The sign-extend instructions `sext.b/sext.h` are defined but not unambiguously categorized in draft version 0.92 of the extension.
+ Temporarily, they have been assigned a separate Z-extension (Zb_tmp) both in Ibex and the RISCV-DV random instruction generator used to verify the bit manipulation instructions in Ibex.
diff --git a/hw/vendor/lowrisc_ibex/doc/integration.rst b/hw/vendor/lowrisc_ibex/doc/integration.rst
index d410bbf..f7233d3 100644
--- a/hw/vendor/lowrisc_ibex/doc/integration.rst
+++ b/hw/vendor/lowrisc_ibex/doc/integration.rst
@@ -12,21 +12,21 @@
.. code-block:: verilog
ibex_core #(
- .PMPEnable ( 0 ),
- .PMPGranularity ( 0 ),
- .PMPNumRegions ( 4 ),
- .MHPMCounterNum ( 0 ),
- .MHPMCounterWidth ( 40 ),
- .RV32E ( 0 ),
- .RV32M ( 1 ),
- .RV32B ( 0 ),
- .MultiplierImplementation ( "fast" ),
- .ICache ( 0 ),
- .ICacheECC ( 0 ),
- .SecureIbex ( 0 ),
- .DbgTriggerEn ( 0 ),
- .DmHaltAddr ( 32'h1A110800 ),
- .DmExceptionAddr ( 32'h1A110808 )
+ .PMPEnable ( 0 ),
+ .PMPGranularity ( 0 ),
+ .PMPNumRegions ( 4 ),
+ .MHPMCounterNum ( 0 ),
+ .MHPMCounterWidth ( 40 ),
+ .RV32E ( 0 ),
+ .RV32M ( 1 ),
+ .RV32B ( ibex_pkg::RV32BNone ),
+ .MultiplierImplementation ( "fast" ),
+ .ICache ( 0 ),
+ .ICacheECC ( 0 ),
+ .SecureIbex ( 0 ),
+ .DbgTriggerEn ( 0 ),
+ .DmHaltAddr ( 32'h1A110800 ),
+ .DmExceptionAddr ( 32'h1A110808 )
) u_core (
// Clock and reset
.clk_i (),
@@ -74,55 +74,55 @@
Parameters
----------
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| Name | Type/Range | Default | Description |
-+==============================+=============+============+=================================================================+
-| ``PMPEnable`` | bit | 0 | Enable PMP support |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``PMPGranularity`` | int (0..31) | 0 | Minimum granularity of PMP address matching |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``PMPNumRegions`` | int (1..16) | 4 | Number implemented PMP regions (ignored if PMPEnable == 0) |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``MHPMCounterNum`` | int (0..10) | 0 | Number of performance monitor event counters |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``MHPMCounterWidth`` | int (64..1) | 40 | Bit width of performance monitor event counters |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``RV32E`` | bit | 0 | RV32E mode enable (16 integer registers only) |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``RV32M`` | bit | 1 | M(ultiply) extension enable |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``RV32B`` | bit | 0 | *EXPERIMENTAL* - B(itmanipulation) extension enable: |
-| | | | Currently supported Z-extensions: Zbb (base), Zbs (single-bit) |
-| | | | Zbp (bit permutation), Zbe (bit extract/deposit), |
-| | | | Zbf (bit-field place) Zbc (carry-less multiplication) |
-| | | | Zbr (cyclic redundancy check) and Zbt (ternary) |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``BranchTargetALU`` | bit | 0 | *EXPERIMENTAL* - Enables branch target ALU removing a stall |
-| | | | cycle from taken branches |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``WritebackStage`` | bit | 0 | *EXPERIMENTAL* - Enables third pipeline stage (writeback) |
-| | | | improving performance of loads and stores |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``MultiplierImplementation`` | string | "fast" | Multiplicator type: |
-| | | | "slow": multi-cycle slow, |
-| | | | "fast": multi-cycle fast, |
-| | | | "single-cycle": single-cycle |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``ICache`` | bit | 0 | *EXPERIMENTAL* Enable instruction cache instead of prefetch |
-| | | | buffer |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``ICacheECC`` | bit | 0 | *EXPERIMENTAL* Enable SECDED ECC protection in ICache (if |
-| | | | ICache == 1) |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``SecureIbex`` | bit | 0 | *EXPERIMENTAL* Enable various additional features targeting |
-| | | | secure code execution. |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``DbgTriggerEn`` | bit | 0 | Enable debug trigger support (one trigger only) |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``DmHaltAddr`` | int | 0x1A110800 | Address to jump to when entering Debug Mode |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
-| ``DmExceptionAddr`` | int | 0x1A110808 | Address to jump to when an exception occurs while in Debug Mode |
-+------------------------------+-------------+------------+-----------------------------------------------------------------+
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| Name | Type/Range | Default | Description |
++==============================+===================+============+=================================================================+
+| ``PMPEnable`` | bit | 0 | Enable PMP support |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``PMPGranularity`` | int (0..31) | 0 | Minimum granularity of PMP address matching |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``PMPNumRegions`` | int (1..16) | 4 | Number implemented PMP regions (ignored if PMPEnable == 0) |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``MHPMCounterNum`` | int (0..10) | 0 | Number of performance monitor event counters |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``MHPMCounterWidth`` | int (64..1) | 40 | Bit width of performance monitor event counters |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``RV32E`` | bit | 0 | RV32E mode enable (16 integer registers only) |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``RV32M`` | bit | 1 | M(ultiply) extension enable |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``RV32B`` | ibex_pkg::rv32b_e | RV32BNone | *EXPERIMENTAL* - B(itmanipulation) extension select: |
+| | | | "RV32BNone": No B-extension |
+| | | | "RV32BBalanced": Sub-extensions Zbb, Zbs, Zbf and |
+| | | | Zbt |
+| | | | "RV32BFull": All sub-extensions |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``BranchTargetALU`` | bit | 0 | *EXPERIMENTAL* - Enables branch target ALU removing a stall |
+| | | | cycle from taken branches |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``WritebackStage`` | bit | 0 | *EXPERIMENTAL* - Enables third pipeline stage (writeback) |
+| | | | improving performance of loads and stores |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``MultiplierImplementation`` | string | "fast" | Multiplicator type: |
+| | | | "slow": multi-cycle slow, |
+| | | | "fast": multi-cycle fast, |
+| | | | "single-cycle": single-cycle |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``ICache`` | bit | 0 | *EXPERIMENTAL* Enable instruction cache instead of prefetch |
+| | | | buffer |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``ICacheECC`` | bit | 0 | *EXPERIMENTAL* Enable SECDED ECC protection in ICache (if |
+| | | | ICache == 1) |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``SecureIbex`` | bit | 0 | *EXPERIMENTAL* Enable various additional features targeting |
+| | | | secure code execution. |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``DbgTriggerEn`` | bit | 0 | Enable debug trigger support (one trigger only) |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``DmHaltAddr`` | int | 0x1A110800 | Address to jump to when entering Debug Mode |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
+| ``DmExceptionAddr`` | int | 0x1A110808 | Address to jump to when an exception occurs while in Debug Mode |
++------------------------------+-------------------+------------+-----------------------------------------------------------------+
Any parameter marked *EXPERIMENTAL* when enabled is not verified to the same standard as the rest of the Ibex core.
diff --git a/hw/vendor/lowrisc_ibex/doc/introduction.rst b/hw/vendor/lowrisc_ibex/doc/introduction.rst
index e0d8c0c..a914c5a 100644
--- a/hw/vendor/lowrisc_ibex/doc/introduction.rst
+++ b/hw/vendor/lowrisc_ibex/doc/introduction.rst
@@ -21,6 +21,7 @@
* `RISC-V Instruction Set Manual, Volume II: Privileged Architecture, document version 20190608-Base-Ratified (June 8, 2019) <https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMFDQC-and-Priv-v1.11/riscv-privileged-20190608.pdf>`_.
Ibex implements the Machine ISA version 1.11.
* `RISC-V External Debug Support, version 0.13.2 <https://content.riscv.org/wp-content/uploads/2019/03/riscv-debug-release.pdf>`_
+* `RISC-V Bit Manipulation Extension, version 0.92 (draft from November 8, 2019) <https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf>`_
Many features in the RISC-V specification are optional, and Ibex can be parametrized to enable or disable some of them.
@@ -46,6 +47,10 @@
- 2.0
- optional
+ * - **B**: Draft Extension for Bit Manipulation Instructions
+ - 0.92 [#B_draft]_
+ - optional
+
* - **Zicsr**: Control and Status Register Instructions
- 2.0
- always enabled
@@ -110,3 +115,10 @@
----------
1. `Schiavone, Pasquale Davide, et al. "Slow and steady wins the race? A comparison of ultra-low-power RISC-V cores for Internet-of-Things applications." 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS 2017) <https://doi.org/10.1109/PATMOS.2017.8106976>`_
+
+.. rubric:: Footnotes
+
+.. [#B_draft] Note that while Ibex fully implements draft version 0.92 of the RISC-V Bit Manipulation Extension, this extension may change before being ratified as a standard by the RISC-V Foundation.
+ Ibex will be updated to match future versions of the specification.
+ Prior to ratification this may involve backwards incompatible changes.
+ Additionally, neither GCC nor Clang has committed to maintaining support upstream for unratified versions of the specification.
diff --git a/hw/vendor/lowrisc_ibex/examples/fpga/artya7/rtl/top_artya7.sv b/hw/vendor/lowrisc_ibex/examples/fpga/artya7/rtl/top_artya7.sv
index 2722719..6949fd3 100644
--- a/hw/vendor/lowrisc_ibex/examples/fpga/artya7/rtl/top_artya7.sv
+++ b/hw/vendor/lowrisc_ibex/examples/fpga/artya7/rtl/top_artya7.sv
@@ -8,9 +8,10 @@
output [3:0] LED
);
- parameter int MEM_SIZE = 64 * 1024; // 64 kB
- parameter logic [31:0] MEM_START = 32'h00000000;
- parameter logic [31:0] MEM_MASK = MEM_SIZE-1;
+ parameter int MEM_SIZE = 64 * 1024; // 64 kB
+ parameter logic [31:0] MEM_START = 32'h00000000;
+ parameter logic [31:0] MEM_MASK = MEM_SIZE-1;
+ parameter SRAMInitFile = "";
logic clk_sys, rst_sys_n;
@@ -104,7 +105,8 @@
// SRAM block for instruction and data storage
ram_1p #(
- .Depth(MEM_SIZE / 4)
+ .Depth(MEM_SIZE / 4),
+ .MemInitFile(SRAMInitFile)
) u_ram (
.clk_i ( clk_sys ),
.rst_ni ( rst_sys_n ),
diff --git a/hw/vendor/lowrisc_ibex/examples/fpga/artya7/top_artya7.core b/hw/vendor/lowrisc_ibex/examples/fpga/artya7/top_artya7.core
index 6b903b4..cfa1a33 100644
--- a/hw/vendor/lowrisc_ibex/examples/fpga/artya7/top_artya7.core
+++ b/hw/vendor/lowrisc_ibex/examples/fpga/artya7/top_artya7.core
@@ -21,15 +21,15 @@
parameters:
# XXX: This parameter needs to be absolute, or relative to the *.runs/synth_1
# directory. It's best to pass it as absolute path when invoking fusesoc, e.g.
- # --SRAM_INIT_FILE=$PWD/sw/led/led.vmem
+ # --SRAMInitFile=$PWD/sw/led/led.vmem
# XXX: The VMEM file should be added to the sources of the Vivado project to
# make the Vivado dependency tracking work. However this requires changes to
# fusesoc first.
- SRAM_INIT_FILE:
+ SRAMInitFile:
datatype: str
description: SRAM initialization file in vmem hex format
default: "../../../../../examples/sw/led/led.vmem"
- paramtype: vlogdefine
+ paramtype: vlogparam
FPGA_XILINX:
datatype: str
@@ -37,6 +37,12 @@
default: 1
paramtype: vlogdefine
+ # For value definition, please see ip/prim/rtl/prim_pkg.sv
+ PRIM_DEFAULT_IMPL:
+ datatype: str
+ paramtype: vlogdefine
+ description: Primitives implementation to use, e.g. "prim_pkg::ImplGeneric".
+
targets:
synth:
default_tool: vivado
@@ -45,8 +51,9 @@
- files_constraints
toplevel: top_artya7
parameters:
- - SRAM_INIT_FILE
+ - SRAMInitFile
- FPGA_XILINX
+ - PRIM_DEFAULT_IMPL=prim_pkg::ImplXilinx
tools:
vivado:
part: "xc7a100tcsg324-1" # Default to Arty A7-100
diff --git a/hw/vendor/lowrisc_ibex/examples/simple_system/README.md b/hw/vendor/lowrisc_ibex/examples/simple_system/README.md
index 9a921b9..2eb2944 100644
--- a/hw/vendor/lowrisc_ibex/examples/simple_system/README.md
+++ b/hw/vendor/lowrisc_ibex/examples/simple_system/README.md
@@ -14,9 +14,13 @@
* [Verilator](https://www.veripool.org/wiki/verilator)
Note Linux package managers may have Verilator but often a very old version
that is not suitable. It is recommended Verilator is built from source.
-* [FuseSoC](https://github.com/olofk/fusesoc)
+* The Python dependencies of this repository.
+ Install them with `pip3 install -U -r python3-requirements.txt` from the
+ repository root.
* RISC-V Compiler Toolchain - lowRISC provides a pre-built GCC based toolchain
<https://github.com/lowRISC/lowrisc-toolchains/releases>
+* libelf and its development libraries.
+ On Debian/Ubuntu, install it by running `apt-get install libelf-dev`.
## Building Simulation
@@ -37,9 +41,9 @@
make -C examples/sw/simple_system/hello_test
```
-This should create the file
-`examples/sw/simple_system/hello_test/hello_test.vmem` which is the memory
-initialisation file used to run the `hello_test` program.
+The compiled program is available at
+`examples/sw/simple_system/hello_test/hello_test.elf`. The same directory also
+contains a Verilog memory file (vmem file) to be used with some simulators.
To build new software make a copy of the `hello_test` directory named as desired.
Look inside the Makefile for further instructions.
@@ -53,11 +57,11 @@
Having built the simulator and software, from the Ibex repository root run:
```
-./build/lowrisc_ibex_ibex_simple_system_0/sim-verilator/Vibex_simple_system [-t] --meminit=ram,<sw_vmem_file>
+./build/lowrisc_ibex_ibex_simple_system_0/sim-verilator/Vibex_simple_system [-t] --meminit=ram,<sw_elf_file>
```
-`<sw_vmem_file>` should be a path to a Verilog memory (vmem) file, or an ELF
-file built as described above. Use
+`<sw_elf_file>` should be a path to an ELF file (or alternatively a vmem file)
+built as described above. Use
`./examples/sw/simple_system/hello_test/hello_test.elf` to run the `hello_test`
binary.
@@ -91,7 +95,7 @@
The simulator produces several output files
* `ibex_simple_system.log` - The ASCII output written via the output peripheral
-* `ibex_simple_system_pcount.csv` - A csv of the performance counters
+* `ibex_simple_system_pcount.csv` - A CSV of the performance counters
* `trace_core_00000000.log` - An instruction trace of execution
## Simulating with Synopsys VCS
@@ -99,7 +103,7 @@
Similar to the Verilator flow the Simple System simulator binary can be built using:
```
-fusesoc --cores-root=. run --target=sim --tool=vcs --setup --build lowrisc:ibex:ibex_simple_system --RV32M=1 --RV32E=0 --SRAM_INIT_FILE=`<sw_vmem_file>`
+fusesoc --cores-root=. run --target=sim --tool=vcs --setup --build lowrisc:ibex:ibex_simple_system --RV32M=1 --RV32E=0 --SRAMInitFile=<sw_vmem_file>
```
`<sw_vmem_file>` should be a path to a vmem file built as described above, use
@@ -119,7 +123,7 @@
To build and run Simple System run:
```
-fusesoc --cores-root=. run --target=sim --tool=rivierapro lowrisc:ibex:ibex_simple_system --RV32M=1 --RV32E=0 --SRAM_INIT_FILE=<sw_vmem_file>
+fusesoc --cores-root=. run --target=sim --tool=rivierapro lowrisc:ibex:ibex_simple_system --RV32M=1 --RV32E=0 --SRAMInitFile=\"$(readlink -f <sw_vmem_file>)\"
```
`<sw_vmem_file>` should be a path to a vmem file built as described above, use
@@ -137,4 +141,3 @@
| 0x30008 | RISC-V timer `mtimecmp` register |
| 0x3000C | RISC-V timer `mtimecmph` register |
| 0x100000 – 0x1FFFFF | 1 MB memory for instruction and data. Execution starts at 0x100080, exception handler base is 0x100000 |
-
diff --git a/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.cc b/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.cc
index bcad891..99b8d65 100644
--- a/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.cc
+++ b/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.cc
@@ -21,11 +21,19 @@
"ram", "TOP.ibex_simple_system.u_ram.u_ram.gen_generic.u_impl_generic");
simctrl.RegisterExtension(&memutil);
+ bool exit_app = false;
+ int ret_code = simctrl.ParseCommandArgs(argc, argv, exit_app);
+ if (exit_app) {
+ return ret_code;
+ }
+
std::cout << "Simulation of Ibex" << std::endl
<< "==================" << std::endl
<< std::endl;
- if (simctrl.Exec(argc, argv)) {
+ simctrl.RunSimulation();
+
+ if (!simctrl.WasSimulationSuccessful()) {
return 1;
}
@@ -33,9 +41,7 @@
// doesn't know the scope itself. Could be moved to ibex_pcount_string, but
// would require a way to set the scope name from here, similar to MemUtil.
svSetScope(svGetScopeFromName("TOP.ibex_simple_system"));
- // TODO: Exec can return with "true" (e.g. with `-h`), but that does not mean
- // `RunSimulation()` was executed. The folllowing values will not be useful
- // in this case.
+
std::cout << "\nPerformance Counters" << std::endl
<< "====================" << std::endl;
std::cout << ibex_pcount_string(false);
diff --git a/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.core b/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.core
index 46db539..1f56cb8 100644
--- a/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.core
+++ b/hw/vendor/lowrisc_ibex/examples/simple_system/ibex_simple_system.core
@@ -5,23 +5,23 @@
name: "lowrisc:ibex:ibex_simple_system"
description: "Generic simple system for running binaries on ibex using verilator"
filesets:
- files_sim_verilator:
+ files_sim:
+ depend:
+ - lowrisc:ibex:ibex_core_tracing
+ - lowrisc:ibex:sim_shared
+ files:
+ - rtl/ibex_simple_system.sv
+ file_type: systemVerilogSource
+
+ files_verilator:
depend:
- lowrisc:dv_verilator:memutil_verilator
- lowrisc:dv_verilator:simutil_verilator
- - lowrisc:ibex:ibex_core_tracing
- - lowrisc:ibex:sim_shared
- lowrisc:dv_verilator:ibex_pcounts
files:
- - rtl/ibex_simple_system.sv
- ibex_simple_system.cc: { file_type: cppSource }
- file_type: systemVerilogSource
-
- files_verilator_waiver:
- files:
- lint/verilator_waiver.vlt: {file_type: vlt}
-
parameters:
RV32M:
datatype: int
@@ -36,14 +36,14 @@
description: "Enable the E ISA extension (reduced register set) [0/1]"
RV32B:
- datatype: int
- paramtype: vlogparam
- default: 0
- description: "Enable the B ISA extension (bit manipulation EXPERIMENTAL) [0/1]"
-
- SRAM_INIT_FILE:
datatype: str
+ default: ibex_pkg::RV32BNone
paramtype: vlogdefine
+ description: "Bitmanip implementation parameter enum. See ibex_pkg.sv (EXPERIMENTAL)"
+
+ SRAMInitFile:
+ datatype: str
+ paramtype: vlogparam
description: "Path to a vmem file to initialize the RAM with"
MultiplierImplementation:
@@ -85,13 +85,9 @@
targets:
default: &default_target
filesets:
- - tool_verilator ? (files_verilator_waiver)
- - files_sim_verilator
+ - tool_verilator ? (files_verilator)
+ - files_sim
toplevel: ibex_simple_system
-
- sim:
- <<: *default_target
- default_tool: verilator
parameters:
- RV32M
- RV32E
@@ -102,7 +98,23 @@
- PMPEnable
- PMPGranularity
- PMPNumRegions
- - SRAM_INIT_FILE
+ - SRAMInitFile
+
+ lint:
+ <<: *default_target
+ default_tool: verilator
+ tools:
+ verilator:
+ mode: lint-only
+ verilator_options:
+ - "-Wall"
+ # RAM primitives wider than 64bit (required for ECC) fail to build in
+ # Verilator without increasing the unroll count (see Verilator#1266)
+ - "--unroll-count 72"
+
+ sim:
+ <<: *default_target
+ default_tool: verilator
tools:
vcs:
vcs_options:
@@ -111,26 +123,16 @@
verilator:
mode: cc
verilator_options:
-# Disabling tracing reduces compile times by multiple times, but doesn't have a
-# huge influence on runtime performance. (Based on early observations.)
+ # Disabling tracing reduces compile times but doesn't have a
+ # huge influence on runtime performance.
- '--trace'
- '--trace-fst' # this requires -DVM_TRACE_FMT_FST in CFLAGS below!
- '--trace-structs'
- '--trace-params'
- '--trace-max-array 1024'
-# compiler flags
-#
-# -O
-# Optimization levels have a large impact on the runtime performance of the
-# simulation model. -O2 and -O3 are pretty similar, -Os is slower than -O2/-O3
- - '-CFLAGS "-std=c++11 -Wall -DVM_TRACE_FMT_FST -DTOPLEVEL_NAME=ibex_simple_system -g -O0"'
+ - '-CFLAGS "-std=c++11 -Wall -DVM_TRACE_FMT_FST -DTOPLEVEL_NAME=ibex_simple_system -g"'
- '-LDFLAGS "-pthread -lutil -lelf"'
- "-Wall"
- - "-Wno-PINCONNECTEMPTY"
- # XXX: Cleanup all warnings and remove this option
- # (or make it more fine-grained at least)
- - "-Wno-fatal"
# RAM primitives wider than 64bit (required for ECC) fail to build in
# Verilator without increasing the unroll count (see Verilator#1266)
- "--unroll-count 72"
-
diff --git a/hw/vendor/lowrisc_ibex/examples/simple_system/rtl/ibex_simple_system.sv b/hw/vendor/lowrisc_ibex/examples/simple_system/rtl/ibex_simple_system.sv
index a37bd99..06291ab 100644
--- a/hw/vendor/lowrisc_ibex/examples/simple_system/rtl/ibex_simple_system.sv
+++ b/hw/vendor/lowrisc_ibex/examples/simple_system/rtl/ibex_simple_system.sv
@@ -2,6 +2,10 @@
// Licensed under the Apache License, Version 2.0, see LICENSE for details.
// SPDX-License-Identifier: Apache-2.0
+`ifndef RV32B
+ `define RV32B ibex_pkg::RV32BNone
+`endif
+
/**
* Ibex simple system
*
@@ -19,15 +23,16 @@
input IO_RST_N
);
- parameter bit PMPEnable = 1'b0;
- parameter int unsigned PMPGranularity = 0;
- parameter int unsigned PMPNumRegions = 4;
- parameter bit RV32E = 1'b0;
- parameter bit RV32M = 1'b1;
- parameter bit RV32B = 1'b0;
- parameter bit BranchTargetALU = 1'b0;
- parameter bit WritebackStage = 1'b0;
- parameter MultiplierImplementation = "fast";
+ parameter bit PMPEnable = 1'b0;
+ parameter int unsigned PMPGranularity = 0;
+ parameter int unsigned PMPNumRegions = 4;
+ parameter bit RV32E = 1'b0;
+ parameter bit RV32M = 1'b1;
+ parameter ibex_pkg::rv32b_e RV32B = `RV32B;
+ parameter bit BranchTargetALU = 1'b0;
+ parameter bit WritebackStage = 1'b0;
+ parameter MultiplierImplementation = "fast";
+ parameter SRAMInitFile = "";
logic clk_sys = 1'b0, rst_sys_n;
@@ -41,8 +46,8 @@
Timer
} bus_device_e;
- localparam NrDevices = 3;
- localparam NrHosts = 1;
+ localparam int NrDevices = 3;
+ localparam int NrHosts = 1;
// interrupts
logic timer_irq;
@@ -194,7 +199,8 @@
// SRAM block for instruction and data storage
ram_2p #(
- .Depth(1024*1024/4)
+ .Depth(1024*1024/4),
+ .MemInitFile(SRAMInitFile)
) u_ram (
.clk_i (clk_sys),
.rst_ni (rst_sys_n),
diff --git a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/README.md b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/README.md
index 2d9ecca..1ed303c 100644
--- a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/README.md
+++ b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/README.md
@@ -16,23 +16,23 @@
See examples/simple_system/README.md for full details.
-## Coremark
+## CoreMark
-Coremark (https://www.eembc.org/coremark/ https://github.com/eembc/coremark) is
+CoreMark (https://www.eembc.org/coremark/ https://github.com/eembc/coremark) is
an industry standard benchmark with results available for a wide variety of
systems.
-The Coremark source is vendored into the Ibex repository at
-`vendor/eembc_coremark`. Support structure and a makefile to build Coremark for
+The CoreMark source is vendored into the Ibex repository at
+`vendor/eembc_coremark`. Support structure and a makefile to build CoreMark for
running on simple system is found in `examples/sw/benchmarks/coremark`.
-To build Coremark:
+To build CoreMark:
```
make -C ./examples/sw/benchmarks/coremark/
```
-To run Coremark (after building a suitable simulator binary, see above):
+To run CoreMark (after building a suitable simulator binary, see above):
```
build/lowrisc_ibex_ibex_simple_system_0/sim-verilator/Vibex_simple_system --meminit=ram,examples/sw/benchmarks/coremark/coremark.elf
@@ -41,7 +41,7 @@
The simulator outputs the performance counter values observed for the benchmark
(the counts do not include anything from pre or post benchmark loops).
-Coremark should output (to `ibex_simple_system.log`) something like the
+CoreMark should output (to `ibex_simple_system.log`) something like the
following:
```
@@ -62,30 +62,56 @@
Correct operation validated. See README.md for run and reporting rules.
```
-A Coremark score is given as the number of iterations executed per second. The
-Coremark binary is hard-coded to execute 10 iterations (see
+### Choice of ISA string
+
+Different ISAs (to choose different RISC-V ISA extensions) can be selected by
+passing the desired ISA string into `RV_ISA` when invoking make.
+
+```
+make -C ./examples/sw/benchmarks/coremark clean
+make -C ./examples/sw/benchmarks/coremark RV_ISA=rv32imc
+```
+
+This will build CoreMark using the 'C' extension (compressed instructions).
+
+When changing `RV_ISA`, you must clean out any old build with `make clean` and
+rebuild.
+
+The following ISA strings give the best performance for the Ibex configurations
+listed in the README:
+
+| Config | Best ISA |
+|----------------------|----------|
+| "small" | rv32im |
+| "maxperf" | rv32im |
+| "maxperf-pmp-bmfull" | rv32imcb |
+
+### CoreMark score
+
+A CoreMark score is given as the number of iterations executed per second. The
+CoreMark binary is hard-coded to execute 10 iterations (see
`examples/sw/benchmarks/coremark/Makefile` if you wish to alter this). To obtain
-a useful Coremark score from the simulation you need to choose a clock speed the
+a useful CoreMark score from the simulation you need to choose a clock speed the
Ibex implementation you are interested in would run at, e.g. 100 MHz, taking
the above example:
* 10 iterations take 4244465 clock cycles
* So at 100 MHz Ibex would execute (100 * 10^6) / (4244465 / 10) = 235.6
Iterations in 1 second.
-* Coremark (at 100 MHz) is 235.6
+* CoreMark (at 100 MHz) is 235.6
-Coremark/MHz is often used instead of a raw Coremark score. The example above
-gives a Coremark/MHz of 2.36 (235.6 / 100 rounded to 2 decimal places).
+CoreMark/MHz is often used instead of a raw CoreMark score. The example above
+gives a CoreMark/MHz of 2.36 (235.6 / 100 rounded to 2 decimal places).
-To directly produce Coremark/MHz from the number of iterations (I) and total
+To directly produce CoreMark/MHz from the number of iterations (I) and total
ticks (T) use the following formula:
```
-Coremark/MHz = (10 ^ 6) * I / T
+CoreMark/MHz = (10 ^ 6) * I / T
```
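As a worked check of the formula above, the following Python sketch recomputes the example figures (10 iterations in 4244465 clock cycles). It is only an illustration: the function names are hypothetical and nothing like it ships in the Ibex repository.

```python
def coremark_per_mhz(iterations, total_ticks):
    """CoreMark/MHz = 10^6 * I / T, matching the formula above."""
    return 1_000_000 * iterations / total_ticks


def coremark_score(iterations, total_ticks, clock_hz):
    """Iterations per second at an assumed clock frequency in Hz."""
    cycles_per_iteration = total_ticks / iterations
    return clock_hz / cycles_per_iteration


if __name__ == "__main__":
    # Figures taken from the example run: 10 iterations, 4244465 cycles.
    print(f"CoreMark/MHz: {coremark_per_mhz(10, 4244465):.2f}")
    print(f"CoreMark at 100 MHz: {coremark_score(10, 4244465, 100e6):.1f}")
```

Both functions describe the same measurement; dividing the score at a given clock by that clock in MHz gives the CoreMark/MHz figure.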
-Note that `core_main.c` from Coremark has had a minor modification to prevent it
+Note that `core_main.c` from CoreMark has had a minor modification to prevent it
from reporting an error if it executes for less than 10 seconds. This violates
the run reporting rules (though does not affect benchmark execution). It is
trivial to restore `core_main.c` to the version supplied by EEMBC in the
-Coremark repository if an official result is desired.
+CoreMark repository if an official result is desired.
diff --git a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.c b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.c
index 353c31b..e32431d 100644
--- a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.c
+++ b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.c
@@ -160,7 +160,7 @@
Test for some common mistakes.
*/
void portable_init(core_portable *p, int *argc, char *argv[]) {
- ee_printf("Ibex coremark platform init...\n");
+ ee_printf("Ibex CoreMark platform init...\n");
if (sizeof(ee_ptr_int) != sizeof(ee_u8 *)) {
ee_printf(
"ERROR! Please define ee_ptr_int to a type that holds a pointer!\n");
@@ -181,7 +181,7 @@
coremark_mhz = (1000000.0f * (float)ITERATIONS) / elapsed;
- ee_printf("Coremark / MHz: %f\n", coremark_mhz);
+ ee_printf("CoreMark / MHz: %f\n", coremark_mhz);
p->portable_id = 0;
}
diff --git a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.h b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.h
index 894b880..4117de4 100644
--- a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.h
+++ b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.h
@@ -76,7 +76,7 @@
*Important* :
ee_ptr_int needs to be the data type used to hold pointers, otherwise
- coremark may fail!!!
+ CoreMark may fail!!!
*/
typedef signed short ee_s16;
typedef unsigned short ee_u16;
diff --git a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.mak b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.mak
index 97dc316..6f685b9 100755
--- a/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.mak
+++ b/hw/vendor/lowrisc_ibex/examples/sw/benchmarks/coremark/ibex/core_portme.mak
@@ -4,6 +4,8 @@
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
+RV_ISA = rv32im
+
OUTFILES = $(OPATH)coremark.dis $(OPATH)coremark.map
NAME = coremark
@@ -27,7 +29,7 @@
AS = riscv32-unknown-elf-as
# Flag : CFLAGS
# Use this flag to define compiler options. Note, you can add compiler options from the command line using XCFLAGS="other flags"
-PORT_CFLAGS = -g -march=rv32imc -mabi=ilp32 -static -mcmodel=medlow -mtune=sifive-3-series \
+PORT_CFLAGS = -g -march=$(RV_ISA) -mabi=ilp32 -static -mcmodel=medlow -mtune=sifive-3-series \
-O3 -falign-functions=16 -funroll-all-loops \
-finline-functions -falign-jumps=4 \
-nostdlib -nostartfiles -ffreestanding -mstrict-align \
diff --git a/hw/vendor/lowrisc_ibex/formal/icache/Makefile b/hw/vendor/lowrisc_ibex/formal/icache/Makefile
new file mode 100644
index 0000000..90edfe4
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/formal/icache/Makefile
@@ -0,0 +1,35 @@
+# Copyright lowRISC contributors.
+# Licensed under the Apache License, Version 2.0, see LICENSE for details.
+# SPDX-License-Identifier: Apache-2.0
+
+# A simple wrapper around fusesoc to make it a bit easier to run the formal flow
+
+# Whether to use ECC (0 for disabled; 1 for enabled)
+ECC := 0
+
+core-name := lowrisc:fpv:ibex_icache_fpv
+vlnv := $(subst :,_,$(core-name))
+build-root := $(abspath ../../build/$(vlnv))
+
+fusesoc-params := --ICacheECC=$(ECC)
+
+# Since we have a hacky hook that runs sv2v in place on fusesoc's
+# copied source files, we have to generate different build roots for
+# the two flavours (otherwise bad things will happen if you run make
+# -j2)
+mk-build-root = $(abspath ../../build/$(vlnv)/$(1))
+
+mk-fusesoc-cmd = \
+ fusesoc --cores-root=../.. \
+ run --build-root=$(call mk-build-root,$(1)) \
+ --target=$(1) \
+ $(core-name) $(fusesoc-params)
+
+.PHONY: all prove cover
+all: prove cover
+
+prove cover:
+ $(call mk-fusesoc-cmd,$@)
+
+lint:
+ mypy --strict sv2v_in_place.py
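The wrapper's command construction can be modelled in Python to show why each target gets its own build root. This is an illustrative sketch only, not part of the repository: the helper name `fusesoc_cmd` and its arguments are assumptions, while the core name, paths and flags mirror the Makefile variables above.

```python
import os

# Mirrors `core-name` in the Makefile above.
CORE_NAME = "lowrisc:fpv:ibex_icache_fpv"


def fusesoc_cmd(target, ecc=0, repo_root="../.."):
    """Build the fusesoc argument list for one formal target (hypothetical helper)."""
    # vlnv := $(subst :,_,$(core-name))
    vlnv = CORE_NAME.replace(":", "_")
    # mk-build-root: one build directory per target, so that running
    # `prove` and `cover` in parallel never share copied sources.
    build_root = os.path.abspath(os.path.join(repo_root, "build", vlnv, target))
    return [
        "fusesoc", f"--cores-root={repo_root}",
        "run", f"--build-root={build_root}",
        f"--target={target}",
        CORE_NAME, f"--ICacheECC={ecc}",
    ]


if __name__ == "__main__":
    for target in ("prove", "cover"):
        print(" ".join(fusesoc_cmd(target)))
```

Because the target name is the last path component of the build root, the in-place sv2v hook for one flavour cannot touch the sources copied for the other.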
diff --git a/hw/vendor/lowrisc_ibex/formal/icache/formal_tb.sv b/hw/vendor/lowrisc_ibex/formal/icache/formal_tb.sv
new file mode 100644
index 0000000..7d8bcc6
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/formal/icache/formal_tb.sv
@@ -0,0 +1,700 @@
+// Copyright lowRISC contributors.
+// Licensed under the Apache License, Version 2.0, see LICENSE for details.
+// SPDX-License-Identifier: Apache-2.0
+
+// A formal testbench for the ICache. This gets bound into the actual ICache DUT.
+
+`include "prim_assert.sv"
+
+// A macro to emulate |-> (a syntax that Yosys doesn't currently support).
+`define IMPLIES(a, b) ((b) || (!(a)))
+
+`define IS_ONE_HOT(expr, width) \
+ !((expr) & ((expr) - {{(width)-1{1'b0}}, 1'b1}))
+
+module formal_tb #(
+ // DUT parameters
+ parameter int unsigned BusWidth = 32,
+ parameter int unsigned CacheSizeBytes = 4*1024,
+ parameter bit ICacheECC = 1'b0,
+ parameter int unsigned LineSize = 64,
+ parameter int unsigned NumWays = 2,
+ parameter bit SpecRequest = 1'b0,
+ parameter bit BranchCache = 1'b0,
+
+ // Internal parameters / localparams
+ parameter int unsigned ADDR_W = 32,
+ parameter int unsigned NUM_FB = 4,
+ parameter int unsigned LINE_W = 3,
+ parameter int unsigned BUS_BYTES = BusWidth/8,
+ parameter int unsigned BUS_W = $clog2(BUS_BYTES),
+ parameter int unsigned LINE_BEATS = 2,
+ parameter int unsigned LINE_BEATS_W = 1
+) (
+ // Top-level ports
+ input logic clk_i,
+ input logic rst_ni,
+ input logic req_i,
+ input logic branch_i,
+ input logic branch_spec_i,
+ input logic [31:0] addr_i,
+ input logic ready_i,
+ input logic valid_o,
+ input logic [31:0] rdata_o,
+ input logic [31:0] addr_o,
+ input logic err_o,
+ input logic err_plus2_o,
+ input logic instr_req_o,
+ input logic instr_gnt_i,
+ input logic [31:0] instr_addr_o,
+ input logic [BusWidth-1:0] instr_rdata_i,
+ input logic instr_err_i,
+ input logic instr_pmp_err_i,
+ input logic instr_rvalid_i,
+ input logic icache_enable_i,
+ input logic icache_inval_i,
+ input logic busy_o,
+
+ // Internal signals
+ input logic [ADDR_W-1:0] prefetch_addr_q,
+ input logic [NUM_FB-1:0][NUM_FB-1:0] fill_older_q,
+ input logic [NUM_FB-1:0] fill_busy_q,
+ input logic [NUM_FB-1:0] fill_stale_q,
+ input logic [NUM_FB-1:0] fill_hit_q,
+ input logic [NUM_FB-1:0][LINE_BEATS_W:0] fill_ext_cnt_q,
+ input logic [NUM_FB-1:0] fill_ext_hold_q,
+ input logic [NUM_FB-1:0] fill_ext_done,
+ input logic [NUM_FB-1:0][LINE_BEATS_W:0] fill_rvd_cnt_q,
+ input logic [NUM_FB-1:0] fill_rvd_done,
+ input logic [NUM_FB-1:0][LINE_BEATS_W:0] fill_out_cnt_q,
+ input logic [NUM_FB-1:0] fill_out_done,
+ input logic [NUM_FB-1:0] fill_ext_req,
+ input logic [NUM_FB-1:0] fill_rvd_exp,
+ input logic [NUM_FB-1:0] fill_data_sel,
+ input logic [NUM_FB-1:0] fill_data_reg,
+ input logic [NUM_FB-1:0][LINE_BEATS_W-1:0] fill_ext_off,
+ input logic [NUM_FB-1:0][LINE_BEATS_W:0] fill_rvd_beat,
+ input logic [NUM_FB-1:0] fill_out_arb,
+ input logic [NUM_FB-1:0] fill_rvd_arb,
+ input logic [NUM_FB-1:0][LINE_BEATS-1:0] fill_err_q,
+ input logic skid_valid_q,
+
+ input logic [NUM_FB-1:0][ADDR_W-1:0] packed_fill_addr_q
+);
+
+ logic [ADDR_W-1:0] line_step;
+ assign line_step = {{ADDR_W-LINE_W-1{1'b0}},1'b1,{LINE_W{1'b0}}};
+
+ // We are bound into the DUT. This means we don't control the clock and reset directly, but we
+ // still want to constrain rst_ni to reset the module at the start of time (for one cycle) and
+ // then stay high.
+ //
+ // Note that having a cycle with rst_ni low at the start of time means that we can safely use
+ // $past, $rose and $fell in calls to `ASSERT without any need for an "f_past_valid signal": they
+ // will only be evaluated from cycle 2 onwards.
+ logic [1:0] f_startup_count = 2'd0;
+ always_ff @(posedge clk_i) begin : reset_assertion
+ f_startup_count <= f_startup_count + ((f_startup_count == 2'd3) ? 2'd0 : 2'd1);
+
+ // Assume that rst_ni is low for the first cycle and high after that.
+ assume (~((f_startup_count == 2'd0) ^ ~rst_ni));
+
+ // There is a feed-through path from branch_i to req_o which isn't squashed when in reset. Assume
+ // that branch_i isn't asserted when in reset.
+ assume (`IMPLIES(!rst_ni, !branch_i));
+ end
+
+ // Several of the protocol checks are only valid when there is a valid address. This is false
+ // after reset. It becomes true after any branch after reset and then false again on any returned
+ // error (because the straight-line address depends on the presumably-bogus rdata).
+ logic f_addr_valid;
+ always_ff @(posedge clk_i or negedge rst_ni) begin
+ if (!rst_ni) begin
+ f_addr_valid <= 1'b0;
+ end else begin
+ if (branch_i) begin
+ f_addr_valid <= 1'b1;
+ end else if (valid_o & ready_i & err_o) begin
+ f_addr_valid <= 1'b0;
+ end
+ end
+ end
+
+ // Protocol assumptions
+ //
+ // These are assumptions based on the top-level ports. They somewhat mirror the assertions in the
+ // simulation-based protocol checkers (see code in dv/uvm/icache/dv).
+
+ // Assumptions about the core
+ //
+ // The branch address must always be even
+ `ASSUME(even_address, `IMPLIES(branch_i, ~addr_i[0]))
+ // The branch_spec signal must be driven if branch is
+ `ASSUME(gate_bs, `IMPLIES(branch_i, branch_spec_i))
+ // Ready will not be asserted when req_i is low
+ `ASSUME(ready_implies_req_i, `IMPLIES(ready_i, req_i))
+
+ // Assumptions about the instruction bus
+ //
+ // The instruction bus is an in-order pipelined slave. We make requests with instr_req_o. If a
+ // request is granted (with instr_gnt_i), then the request has gone out. Sometime in the future,
+ // it will come back, but we can make other requests in the meantime. Requests are answered (in
+ // order) by the bus asserting instr_rvalid_i.
+ //
+ // Check that we won't get any spurious responses, using a simple counter of the requests on the
+ // bus.
+ logic [31:0] f_reqs_on_bus;
+ always_ff @(posedge clk_i or negedge rst_ni) begin
+ if (!rst_ni) begin
+ f_reqs_on_bus <= 32'd0;
+ end else begin
+ if (instr_req_o & (instr_gnt_i & ~instr_pmp_err_i) & ~instr_rvalid_i)
+ f_reqs_on_bus <= f_reqs_on_bus + 32'd1;
+ else if (~(instr_req_o & instr_gnt_i & ~instr_pmp_err_i) & instr_rvalid_i)
+ f_reqs_on_bus <= f_reqs_on_bus - 32'd1;
+ end
+ end
+ `ASSUME(no_rvalid_without_pending_req, `IMPLIES(instr_rvalid_i, f_reqs_on_bus != 32'd0))
+
+ // Assume the bus doesn't grant a request which would make the counter wrap (needed for inductive
+ // proofs). Since the counter allows (1<<32)-1 requests, this shouldn't be an unreasonable
+ // assumption!
+ `ASSUME(no_gnt_when_bus_full, ~instr_gnt_i | ~&f_reqs_on_bus);
+
+
+ // Top-level assertions
+ //
+ // This section contains the assertions that prove the properties we care about. All should be
+ // about observable signals (so shouldn't contain any references to anything that isn't exposed as
+ // an input port).
+
+ // REQ stays high until GNT
+ //
+ // If instr_req_o goes high, we won't drive it low again until instr_gnt_i or instr_pmp_err_i is
+ // high (the latter signals that the outgoing request got squashed, so we can forget about it).
+ //
+ // Read this as "a negedge of instr_req_o implies that the transaction was granted or squashed on
+ // the previous cycle".
+ `ASSERT(req_to_gnt,
+ `IMPLIES($fell(instr_req_o), $past(instr_gnt_i | instr_pmp_err_i)))
+
+ // ADDR stability
+ //
+ // If instr_req_o goes high, the address at instr_addr_o will stay constant until the request is
+ // squashed or granted. The encoding below says "either the address is stable, the request has
+ // been squashed, we've had a grant or this is a new request".
+ `ASSERT(req_addr_stable,
+ $stable(instr_addr_o) | $past(instr_gnt_i | instr_pmp_err_i | ~instr_req_o))
+
+ // VALID until READY
+ //
+ // The handshake isn't quite standard, because the core can cancel it by signalling branch_i,
+ // redirecting the icache somewhere else. So we ask that once valid_o goes high, it won't be
+ // de-asserted unless either ready_i was high on the previous cycle (in which case, the icache
+ // sent an instruction to the core) or branch_i goes high.
+ //
+ // We also have no requirements on the valid/ready handshake if the address is unknown
+ // (!f_addr_valid).
+ `ASSERT(vld_to_rdy,
+ `IMPLIES(f_addr_valid & $fell(valid_o), $past(branch_i | ready_i)))
+
+ // ADDR stability
+ `ASSERT(addr_stable,
+ `IMPLIES(f_addr_valid & $past(valid_o & ~(ready_i | branch_i)), $stable(addr_o)))
+
+ // ERR stability
+ `ASSERT(err_stable,
+ `IMPLIES(f_addr_valid & $past(valid_o & ~(ready_i | branch_i)), $stable(err_o)))
+ `ASSERT(err_plus2_stable,
+ `IMPLIES(f_addr_valid & $past(valid_o & err_o & ~(ready_i | branch_i)), $stable(err_plus2_o)))
+
+ // ERR_PLUS2 implies uncompressed
+ `ASSERT(err_plus_2_implies_uncompressed,
+ `IMPLIES(valid_o & err_o & err_plus2_o, rdata_o[1:0] == 2'b11))
+
+ // RDATA stability
+ //
+ // If valid_o is true and err_o is false, the bottom 16 bits of rdata_o will stay constant until
+ // the core takes the data by asserting ready_i, or until the core branches or de-asserts req_i.
+ `ASSERT(rdata_stable_lo,
+ `IMPLIES(f_addr_valid & ~err_o & $past(valid_o & ~(ready_i | branch_i)),
+ $stable(rdata_o[15:0])))
+ `ASSERT(rdata_stable_hi,
+ `IMPLIES(f_addr_valid & ~err_o &
+ $past(valid_o & ~(ready_i | branch_i)) & (rdata_o[1:0] == 2'b11),
+ $stable(rdata_o[31:16])))
+
+ // Formal coverage points
+ //
+ // See a good result returned by the cache
+ `COVER(fetch_good_result, f_addr_valid & valid_o & ready_i & ~branch_i & ~err_o)
+
+ // See a bad result returned by the cache
+ `COVER(fetch_bad_result, f_addr_valid & valid_o & ready_i & ~branch_i & err_o)
+
+ // See a bad result for the upper word returned by the cache
+ `COVER(fetch_bad_result_2, f_addr_valid & valid_o & ready_i & ~branch_i & err_o & err_plus2_o)
+
+ // See 8 back-to-back fetches ("full throughput")
+ logic [31:0] f_b2b_counter;
+ always_ff @(posedge clk_i or negedge rst_ni) begin
+ if (!rst_ni) begin
+ f_b2b_counter <= 32'd0;
+ end else begin
+ if (valid_o & ready_i & ~branch_i) begin
+ f_b2b_counter <= f_b2b_counter + 32'd1;
+ end else begin
+ f_b2b_counter <= 32'd0;
+ end
+ end
+ end
+ `COVER(back_to_back_fetches, f_b2b_counter == 8);
+
+ // Internal (induction) assertions
+ //
+ // Code below this line can refer to internal signals of the DUT. The assertions shouldn't be
+ // needed for BMC checks, but will be required to constrain the state space used for k-induction.
+
+ for (genvar fb = 0; fb < NUM_FB; fb++) begin : g_fb_older_asserts
+ // If fill buffer i is busy then fill_older_q[i][j] means that fill buffer j has an
+ // outstanding request which started before us (and should take precedence). We should check
+ // that this only happens if buffer j is indeed busy.
+ //
+ // fill_busy_q[i] -> fill_older_q[i][j] -> fill_busy_q[j]
+ //
+ // which we can encode as
+ //
+ // (fill_older_q[i][j] -> fill_busy_q[j]) | ~fill_busy_q[i]
+ // = (fill_busy_q[j] | ~fill_older_q[i][j]) | ~fill_busy_q[i]
+ //
+ // Grouping by j, we can rewrite this as:
+ `ASSERT(older_is_busy, &(fill_busy_q | ~fill_older_q[fb]) | ~fill_busy_q[fb])
+
+ // No fill buffer should ever think that it's older than itself
+ `ASSERT(older_anti_refl, !fill_older_q[fb][fb])
+
+ // The "older" relation should be anti-symmetric (a fill buffer can't be both older than, and
+ // younger than, another). This takes NUM_FB*(NUM_FB-1)/2 assertions, comparing each pair of
+ // buffers. Here, we do this by looping over the indices below fb.
+ //
+ // If I and J both think the other is older, then fill_older_q[I][J] and fill_older_q[J][I] will
+ // both be true. Check that doesn't happen.
+ for (genvar fb2 = 0; fb2 < fb; fb2++) begin : g_older_anti_symm_asserts
+ `ASSERT(older_anti_symm, ~(fill_older_q[fb][fb2] & fill_older_q[fb2][fb]))
+ end
+
+ // The older relation should be transitive (if i is older than j and j is older than k, then i
+ // is older than k). That is:
+ //
+ // (fill_busy_q[i] & fill_older_q[i][j]) ->
+ // (fill_busy_q[j] & fill_older_q[j][k]) ->
+ // (fill_busy_q[i] & fill_older_q[i][k])
+ //
+ // Note that the second fill_busy_q[i] holds trivially and fill_busy_q[j] holds because of
+ // older_is_busy, so this can be rewritten as:
+ //
+ // fill_busy_q[i] & fill_older_q[i][j] -> fill_older_q[j][k] -> fill_older_q[i][k]
+ //
+ // Converting A->B->C into (A&B)->C and then rewriting A->B as B|~A, this is equivalent to
+ //
+ // (fill_older_q[i][k] | ~fill_older_q[j][k]) | ~(fill_busy_q[i] & fill_older_q[i][j])
+ //
+ // Looping over i and j, we can simplify this as
+ //
+ // &(fill_older_q[i] | ~fill_older_q[j]) | ~(fill_busy_q[i] & fill_older_q[i][j])
+ //
+ for (genvar fb2 = 0; fb2 < NUM_FB; fb2++) begin : g_older_transitive_asserts
+ `ASSERT(older_transitive,
+ (&(fill_older_q[fb] | ~fill_older_q[fb2]) |
+ ~(fill_busy_q[fb] & fill_older_q[fb][fb2])))
+ end
+
+ // The older relation should be total. This is a bit finicky because of fill buffers that aren't
+ // currently busy. Specifically, we want
+ //
+ // i != j -> fill_busy_q[i] -> fill_busy_q[j] -> (fill_older_q[i][j] | fill_older_q[j][i])
+ //
+ for (genvar fb2 = 0; fb2 < fb; fb2++) begin : g_older_total_asserts
+ `ASSERT(older_total,
+ `IMPLIES(fill_busy_q[fb] & fill_busy_q[fb2],
+ fill_older_q[fb][fb2] | fill_older_q[fb2][fb]))
+ end
+ end
+
+ // Assertions about fill-buffer counters
+ for (genvar fb = 0; fb < NUM_FB; fb++) begin : g_fb_counter_asserts
+
+ // We should never have fill_ext_hold_q[fb] if fill_ext_cnt_q[fb] == LINE_BEATS (because we
+ // shouldn't have made a request after we filled up).
+ `ASSERT(no_fill_ext_hold_when_full,
+ `IMPLIES(fill_ext_hold_q[fb],
+ fill_ext_cnt_q[fb] < LINE_BEATS[LINE_BEATS_W:0]))
+
+ // Each fill buffer is supposed to make at most LINE_BEATS requests (once we've filled the
+ // buffer, we shouldn't be asking for more).
+ `ASSERT(no_fill_ext_req_when_full,
+ `IMPLIES(fill_ext_req[fb],
+ (fill_ext_cnt_q[fb] < LINE_BEATS[LINE_BEATS_W:0])))
+
+ for (genvar fb2 = 0; fb2 < NUM_FB; fb2++) begin : g_older_counter_asserts
+ // Because we make requests from the oldest fill buffer first, a fill buffer should only have
+ // made any requests if every older fill buffer is done.
+ `ASSERT(older_ext_ordering,
+ `IMPLIES((fill_busy_q[fb] &&
+ (fill_ext_cnt_q[fb] != '0) &&
+ fill_older_q[fb][fb2] &&
+ fill_busy_q[fb2]),
+ fill_ext_done[fb2]))
+
+ // Similarly, if J is older than I then we should see fill_rvd_done[J] before
+ // fill_rvd_cnt_q[I] is nonzero.
+ `ASSERT(older_rvd_ordering,
+ `IMPLIES((fill_busy_q[fb] &&
+ (fill_rvd_cnt_q[fb] != '0) &&
+ fill_older_q[fb][fb2] &&
+ fill_busy_q[fb2]),
+ fill_rvd_done[fb2]))
+ end
+
+ // Tying together f_reqs_on_bus (the testbench request tracking) with the outstanding request
+ // count implied by fill_ext_cnt_q and fill_rvd_cnt_q.
+ //
+ // We expect the number of outstanding requests to be the sum of
+ //
+ // fill_ext_cnt_q[fb] - fill_rvd_cnt_q[fb]
+ //
+ // over all busy fill buffers.
+ logic [31:0] f_rvd_wo_ext_cnt;
+ always_comb begin
+ f_rvd_wo_ext_cnt = 32'd0;
+ for (int i = 0; i < NUM_FB; i++) begin
+ if (fill_busy_q[i])
+ f_rvd_wo_ext_cnt += {{32-(LINE_BEATS_W+1){1'b0}}, fill_ext_cnt_q[i] - fill_rvd_cnt_q[i]};
+ end
+ end
+ `ASSERT(rvd_minus_ext_cnt, f_rvd_wo_ext_cnt == f_reqs_on_bus);
+
+ // We have to make a request before we get a response, so no fill buffer should have more
+ // responses than requests.
+ `ASSERT(fill_rvd_le_ext, fill_rvd_cnt_q[fb] <= fill_ext_cnt_q[fb])
+
+ // When data comes back from the instruction bus, it will be assigned to the oldest fill buffer
+ // that still expects to receive some. This is correct because we always make requests from the
+ // oldest fill buffer (and the instruction bus answers in-order).
+
+ // We should never expect to receive beats of data unless fill_rvd_cnt_q is less than
+ // fill_ext_cnt_q. Note that fill_rvd_exp can be true in this situation, but fill_rvd_arb
+ // shouldn't be.
+ `ASSERT(rvd_arb_implies_ext_ahead,
+ `IMPLIES(fill_rvd_arb[fb], fill_rvd_cnt_q[fb] < fill_ext_cnt_q[fb]))
+
+ // Similarly, each fill buffer expects to receive at most LINE_BEATS responses
+ `ASSERT(no_fill_rvd_exp_when_full,
+ `IMPLIES(fill_rvd_exp[fb], fill_rvd_cnt_q[fb] < LINE_BEATS[LINE_BEATS_W:0]))
+
+ // There are several signals per fb which must be at most equal to LINE_BEATS, but they are
+ // stored with $clog2(LINE_BEATS) + 1 bits, so the signals can represent much bigger numbers.
+`define ASSERT_MAX_LINE_BEATS(name) \
+ `ASSERT(name``_max, name[fb] <= LINE_BEATS[LINE_BEATS_W:0])
+
+ `ASSERT_MAX_LINE_BEATS(fill_ext_cnt_q)
+ `ASSERT_MAX_LINE_BEATS(fill_rvd_cnt_q)
+ `ASSERT_MAX_LINE_BEATS(fill_out_cnt_q)
+
+ // All fill-buffer addresses should be half-word aligned. This won't quite be true after an
+ // error (because the prefetch address can get messed up).
+ `ASSERT(fb_addr_hw_aligned,
+ `IMPLIES(f_addr_valid & fill_busy_q[fb] & ~fill_stale_q[fb],
+ ~packed_fill_addr_q[fb][0]))
+
+ // The output counter shouldn't run ahead of the rvd counter unless there was a cache hit or an
+ // error. Note that the received counter counts from zero, whereas the output counter starts at
+ // the first beat to come out. We can adjust by using fill_rvd_beat, which starts at the first
+ // beat (like fill_out_cnt_q).
+ `ASSERT(fill_out_le_rvd,
+ `IMPLIES(fill_busy_q[fb] & ~fill_stale_q[fb] & ~branch_i,
+ fill_hit_q[fb] ||
+ |fill_err_q[fb] ||
+ (fill_out_cnt_q[fb] <= fill_rvd_beat[fb])))
+ end
+
+ // The prefetch address is the next address to prefetch. It should always be at least half-word
+ // aligned (it's populated by addr_i and then gets aligned to line boundaries afterwards)
+ `ASSERT(prefetch_addr_hw_aligned, `IMPLIES(f_addr_valid, ~prefetch_addr_q[0]))
+
+ // Define an analogue of fill_older_q, but only for buffers that are busy, not stale and think
+ // they have more data to return.
+ logic [NUM_FB-1:0] f_has_output;
+ logic [NUM_FB-1:0][NUM_FB-1:0] f_older_with_output, f_younger_with_output;
+ always_comb begin
+ f_has_output = '0;
+ f_older_with_output = '0;
+ for (int i = 0; i < NUM_FB; i++) begin
+ f_has_output[i] = fill_busy_q[i] & ~fill_stale_q[i] & ~fill_out_done[i];
+ end
+ for (int i = 0; i < NUM_FB; i++) begin
+ for (int j = 0; j < NUM_FB; j++) begin
+ f_older_with_output[i][j] = f_has_output[i] & f_has_output[j] & fill_older_q[i][j];
+ f_younger_with_output[j][i] = f_older_with_output[i][j];
+ end
+ end
+ end
+
+ // Find the oldest busy, non-stale fill buffer that doesn't think it's finished returning data.
+ // This is the one that should be outputting data. Grab its index and various associated
+ // addresses. Similarly with the youngest.
+ int unsigned f_oldest_fb, f_youngest_fb;
+ logic [ADDR_W-1:0] f_oldest_fill_addr_q, f_youngest_fill_addr_q;
+ logic [LINE_BEATS_W:0] f_oldest_fill_out_cnt_q;
+ always_comb begin
+ f_oldest_fb = NUM_FB;
+ f_youngest_fb = NUM_FB;
+ f_oldest_fill_addr_q = '0;
+ f_youngest_fill_addr_q = '0;
+ f_oldest_fill_out_cnt_q = '0;
+ for (int i = 0; i < NUM_FB; i++) begin
+ if (f_has_output[i] & ~|(f_older_with_output[i])) begin
+ f_oldest_fb = i;
+ f_oldest_fill_addr_q = packed_fill_addr_q[i];
+ f_oldest_fill_out_cnt_q = fill_out_cnt_q[i];
+ end
+ if (f_has_output[i] & ~|(f_younger_with_output[i])) begin
+ f_youngest_fb = i;
+ f_youngest_fill_addr_q = packed_fill_addr_q[i];
+ end
+ end
+ end
+
+ logic [ADDR_W-1:0] f_oldest_fill_line_start, f_youngest_fill_line_start;
+ assign f_oldest_fill_line_start = {f_oldest_fill_addr_q[ADDR_W-1:LINE_W], {LINE_W{1'b0}}};
+ assign f_youngest_fill_line_start = {f_youngest_fill_addr_q[ADDR_W-1:LINE_W], {LINE_W{1'b0}}};
+
+ // Suppose we have at least one fill buffer with data that needs outputting. Consider the oldest
+ // such fill buffer (f_oldest_fb). Data flows as follows:
+ //
+ // fill buffer -> (skid buffer) -> output
+ //
+ // We always read from a 4-byte chunk of the fill buffer, whose "read address"
+ // (f_oldest_fill_beat_start) is
+ //
+ // f_oldest_fill_line_start + 4 * f_oldest_fill_out_cnt_q
+ //
+ // The interaction with the skid buffer is a little complicated. Here are the possible scenarios,
+ // where the 1st two columns (addr_o, skid_valid_q) determine the other two (f_skidded_addr,
+ // f_skidded_beat_addr).
+ //
+ // | addr_o | skid | notional | line start |
+ // | mod 4 | valid | read addr | + out cnt |
+ // |--------+-------+-----------+------------|
+ // | 0 | 0 | 0 | 0 |
+ // | 0 | 1 | 2 | 0 |
+ // | 2 | 0 | 2 | 0 |
+ // | 2 | 1 | 4 | 4 |
+ //
+ // These checks are ignored if the address is invalid (because of an error) or on the cycle where
+ // a branch comes in.
+ logic [ADDR_W-1:0] f_skidded_addr;
+ logic [ADDR_W-1:0] f_beat_addr;
+ logic [ADDR_W-1:0] f_skidded_beat_addr;
+ assign f_skidded_addr = addr_o + 2 * {{ADDR_W-1{1'b0}}, skid_valid_q};
+ assign f_beat_addr = {addr_o[ADDR_W-1:2], 2'b00};
+ assign f_skidded_beat_addr = {f_skidded_addr[ADDR_W-1:2], 2'b00};
+
+ logic [ADDR_W-1:0] f_oldest_fill_beat_start;
+ assign f_oldest_fill_beat_start = (f_oldest_fill_line_start +
+ {{ADDR_W-LINE_BEATS_W-3{1'b0}},
+ f_oldest_fill_out_cnt_q, 2'b00});
+
+ `ASSERT(oldest_fb_addr,
+ `IMPLIES((f_oldest_fb < NUM_FB) && f_addr_valid && ~branch_i,
+ f_oldest_fill_beat_start == f_skidded_beat_addr))
+
+ // One other property that should hold for the oldest FB (only really relevant for branch targets)
+ // is that fill_addr_q <= f_skidded_addr. This avoids the model getting into a state where
+ // the counters think we fetched the top half of a line first (because fill_addr_q is something
+ // like 0x4) and that we're now reading the lower half, but addr_o is something like 0x2 and
+ // return results trash the output data.
+ `ASSERT(oldest_fb_addr_low,
+ `IMPLIES((f_oldest_fb < NUM_FB) && f_addr_valid && ~branch_i,
+ f_oldest_fill_addr_q <= f_skidded_addr))
+
+ // Track the address of the first beat for each fill buffer and its bottom address
+ logic [NUM_FB-1:0][ADDR_W-1:0] f_fill_beat_addr_q, f_fill_line_addr_q;
+ always_comb begin
+ f_fill_beat_addr_q = '0;
+ f_fill_line_addr_q = '0;
+ for (int i = 0; i < NUM_FB; i++) begin
+ f_fill_beat_addr_q[i] = {packed_fill_addr_q[i][ADDR_W-1:BUS_W], {BUS_W{1'b0}}};
+ f_fill_line_addr_q[i] = {packed_fill_addr_q[i][ADDR_W-1:LINE_W], {LINE_W{1'b0}}};
+ end
+ end
+
+ for (genvar fb = 0; fb < NUM_FB; fb++) begin : g_fb_addr_asserts
+ for (genvar fb2 = 0; fb2 < NUM_FB; fb2++) begin : g_fb_addr_asserts2
+ // We've checked that older is a total ordering on busy fill buffers (see older_total and the
+ // assertions immediately preceding it). That means it's also a total ordering on non-stale
+ // busy fill buffers (taking a subset of a total order doesn't change the fact it's total). We
+ // want to check that if fb I immediately precedes fb J then J's address immediately follows
+ // I's line.
+ //
+ // "fb is older than fb2 and nothing else comes in between" can be phrased as
+ // "fill_older_q[fb2] is exactly equal to fill_older_q[fb] plus the bit for fb" (note that
+ // older_anti_refl implies that won't be set). That would be:
+ //
+ // fill_older_q[fb2] == fill_older_q[fb] | ({{NUM_FB-1{1'b0}}, 1'b1} << fb)
+ //
+ // Since we are only interested in FBs that have more output data to write, we use
+ // f_older_with_output instead of fill_older_q.
+ `ASSERT(chained_fb_addr,
+ `IMPLIES((f_older_with_output[fb2] ==
+ (f_older_with_output[fb] | ({{NUM_FB-1{1'b0}}, 1'b1} << fb))),
+ packed_fill_addr_q[fb2] == f_fill_line_addr_q[fb] + line_step))
+ end
+
+ // If there is an older buffer than this one which has data to be output, this fill buffer's
+ // output count should be zero.
+ `ASSERT(younger_out_cnt_zero, `IMPLIES(|f_older_with_output[fb], fill_out_cnt_q[fb] == '0))
+ end
+
+ // Just as we check chaining between adjacent fill buffers' addresses, we expect prefetch_addr_q
+ // (used for the next fetch) to be the line after the youngest fill buffer.
+ `ASSERT(chained_prefetch_addr,
+ `IMPLIES(f_youngest_fb < NUM_FB,
+ prefetch_addr_q == f_youngest_fill_line_start + line_step))
+
+ // The output address can never be aligned when skid_valid is true. The state machine looks like
+ // this:
+ //
+ // | State | Next states |
+ // | (misaligned; skid_valid) | |
+ // |--------------------------+----------------------------|
+ // | (0; 0) | (0; 0) (uc instr) |
+ // | | (1; 1) (cmp instr) |
+ // | (1; 0) | (0; 0) (cmp instr) |
+ // | | (1; 1) (uc instr 1st half) |
+ // | (1; 1) | (0; 0) (cmp instr) |
+ // | | (1; 1) (uc instr) |
+ //
+ // The 1st two states are possible after branches. There's no arc to (0; 1).
+ `ASSERT(misaligned_when_skid_valid, `IMPLIES(skid_valid_q, addr_o[1]))
+
+ // If no fill buffers are waiting to output data, prefetch_addr_q holds the address that will be
+ // assigned to the next available fill buffer. This should equal the skidded version of addr_o
+ // (both addresses will have recently been set by a branch).
+ `ASSERT(no_fb_prefetch_addr,
+ `IMPLIES(f_addr_valid && (f_oldest_fb == NUM_FB), prefetch_addr_q == f_skidded_addr))
+
+ // fill_out_arb should be 1-hot
+ `ASSERT(fill_out_arb_one_hot, `IS_ONE_HOT(fill_out_arb, NUM_FB))
+
+ // fill_data_sel is not 1-hot, but it is when restricted to busy fill buffers that are not stale
+ // or done.
+ `ASSERT(fill_data_sel_one_hot,
+ `IS_ONE_HOT(fill_data_sel & fill_busy_q & ~fill_out_done & ~fill_stale_q, NUM_FB))
+
+ // fill_data_reg is based off of fill_data_sel and *should* be 1-hot (used for muxing)
+ `ASSERT(fill_data_reg_one_hot, `IS_ONE_HOT(fill_data_reg, NUM_FB))
+
+
+ // We can derive masks of beats that we think we have requested and received. fill_ext_cnt_q[fb]
+ // (fill_rvd_cnt_q[fb]) is the number of beats that we've requested (received). We started at
+ // beat fill_addr_q[fb][LINE_W-1:BUS_W].
+ //
+ // To make it easy to track what's going on, we define auxiliary signals. f_fill_first_beat[fb] is
+ // the index of the first beat to be fetched for this fill buffer. This is non-zero if the fill
+ // buffer comes about from a branch and fill_addr_q[fb] starts after the first beat.
+ //
+ // f_fill_ext_end_beat[fb] (f_fill_rvd_end_beat[fb]) is the index of the first beat that hasn't
+ // been requested (received). This doesn't wrap around, so if LINE_BEATS is 2, we started at beat
+ // 1 and have fetched 2 beats then it will be 3 (not 1).
+ //
+ // With these in hand, you can define f_fill_ext_mask[fb] (f_fill_rvd_mask[fb]), which has a bit
+ // per beat, which is set if the corresponding data has been requested (received). Writing b for
+ // the beat in question, s for the start beat, c for the count, e for the end beat (without
+ // wrapping) and w for the log of the number of beats in a line, the check becomes:
+ //
+ // (c != 0) && ((s <= b) ? (b < e) : (b + (1 << w) < e))
+ // = (c != 0) && (e > ((s <= b) ? b : (b + (1 << w))))
+ // = (c != 0) && (e > (b + ((s <= b) ? 0 : (1 << w))))
+ // = (c != 0) && (e > (b + (((s <= b) ? 0 : 1) << w)))
+ // = (c != 0) && (e > (b + ((s > b) << w)))
+
+ logic [NUM_FB-1:0][LINE_BEATS_W-1:0] f_fill_first_beat;
+ logic [NUM_FB-1:0][LINE_BEATS_W:0] f_fill_ext_end_beat, f_fill_rvd_end_beat;
+ logic [NUM_FB-1:0][LINE_BEATS-1:0] f_fill_ext_mask, f_fill_rvd_mask;
+
+ always_comb begin
+ f_fill_first_beat = '0;
+ f_fill_ext_end_beat = '0;
+ f_fill_ext_mask = '0;
+ f_fill_rvd_end_beat = '0;
+ f_fill_rvd_mask = '0;
+ for (int i = 0; i < NUM_FB; i++) begin
+ f_fill_first_beat[i] = f_fill_beat_addr_q[i][LINE_W-1:BUS_W];
+ f_fill_ext_end_beat[i] = {1'b0, f_fill_first_beat[i]} + fill_ext_cnt_q[i];
+ f_fill_rvd_end_beat[i] = {1'b0, f_fill_first_beat[i]} + fill_rvd_cnt_q[i];
+ for (int b = 0; b < LINE_BEATS; b++) begin
+ f_fill_ext_mask[i][b] = ((|fill_ext_cnt_q[i]) &&
+ (f_fill_ext_end_beat[i] >
+ (b[LINE_BEATS_W:0] +
+ {f_fill_first_beat[i] > b[LINE_BEATS_W-1:0],
+ {LINE_BEATS_W{1'b0}}})));
+ f_fill_rvd_mask[i][b] = (|fill_rvd_cnt_q[i] &&
+ (f_fill_rvd_end_beat[i] >
+ (b[LINE_BEATS_W:0] +
+ {f_fill_first_beat[i] > b[LINE_BEATS_W-1:0],
+ {LINE_BEATS_W{1'b0}}})));
+ end
+ end
+ end
+
+ // Now we have that mask, we can assert that we only have fill errors on beats that we have
+ // fetched. That's not quite true (because of PMP errors). So what we really want to say is:
+ //
+ // If beat b has an error then either we have received data for beat b or we tried to get data
+ // for beat b and the memory request was squashed by a PMP error.
+ //
+ // The former case is easy (bit b should be set in f_fill_rvd_mask). In the latter case,
+ // fill_ext_done will be true, fill_ext_cnt_q will be less than LINE_BEATS, and fill_ext_off (the
+ // next beat to fetch) will equal b. We define explicit masks for the bits allowed in each case.
+ logic [NUM_FB-1:0][LINE_BEATS-1:0] f_rvd_err_mask, f_pmp_err_mask, f_err_mask;
+ always_comb begin
+ f_rvd_err_mask = '0;
+ f_pmp_err_mask = '0;
+ for (int i = 0; i < NUM_FB; i++) begin
+ f_rvd_err_mask[i] = f_fill_rvd_mask[i];
+ for (int b = 0; b < LINE_BEATS; b++) begin
+ f_pmp_err_mask[i][b] = (fill_ext_done[i] &&
+ !fill_ext_cnt_q[i][LINE_BEATS_W] &&
+ (fill_ext_off[i] == b[LINE_BEATS_W-1:0]));
+ end
+ end
+ end
+ assign f_err_mask = f_rvd_err_mask | f_pmp_err_mask;
+
+ for (genvar fb = 0; fb < NUM_FB; fb++) begin : g_fb_error_beat_asserts
+ `ASSERT(err_is_recv_or_pmp,
+ `IMPLIES(fill_busy_q[fb], ~|(fill_err_q[fb] & ~f_err_mask[fb])))
+ end
+
+ // If there is data in the skid buffer, it either came from the previous line (and addr_o is the
+ // top of that line and f_skidded_addr is the start of our line) or it came from this line. In the
+ // latter case, we must have received the data that's in the skid buffer.
+ //
+ // If there is no valid fill buffer at the moment, any skid buffer data must be from the previous
+ // line.
+ `ASSERT(skid_is_rvd_wo_buffer,
+ `IMPLIES((f_oldest_fb == NUM_FB) && f_addr_valid && skid_valid_q,
+ f_skidded_addr[LINE_W-1:0] == '0))
+
+ // If there is a valid fill buffer and the skidded address isn't the start of the line then one
+ // of the following must hold: we have received the beat of data the skid buffer came from, that
+ // beat has an associated error, or we had a cache hit.
+ `ASSERT(skid_is_rvd_with_buffer,
+ `IMPLIES(((f_oldest_fb < NUM_FB) && f_addr_valid &&
+ skid_valid_q && (f_skidded_addr[LINE_W-1:0] != '0)),
+ f_fill_rvd_mask[f_oldest_fb][f_beat_addr[LINE_W-1:BUS_W]] |
+ fill_err_q[f_oldest_fb][f_beat_addr[LINE_W-1:BUS_W]] |
+ fill_hit_q[f_oldest_fb]))
+
+
+endmodule
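The beat-mask simplification derived in the f_fill_ext_mask / f_fill_rvd_mask comment of formal_tb.sv can be sanity-checked by comparing it exhaustively against a direct enumeration of the wrapped window. The Python sketch below is an illustrative model only, not part of the vendored code. The function names are hypothetical, and it assumes s < 2^w and c <= 2^w, as the LINE_BEATS assertions require.

```python
def covered_by_enumeration(b, s, c, w):
    """Beat b is covered if one of the c beats starting at s (wrapping) lands on it."""
    beats = 1 << w
    return any((s + k) % beats == b for k in range(c))


def covered_simplified(b, s, c, w):
    """Final form from the comment: (c != 0) && (e > (b + ((s > b) << w)))."""
    e = s + c  # end beat, deliberately not wrapped
    return c != 0 and e > b + (int(s > b) << w)


def beat_mask(s, c, w):
    """One entry per beat in the line, True if that beat is covered."""
    return [covered_simplified(b, s, c, w) for b in range(1 << w)]


def simplification_holds(max_w=3):
    """Compare both forms for every start beat, count and beat index."""
    for w in range(max_w + 1):
        beats = 1 << w
        for s in range(beats):
            for c in range(beats + 1):
                for b in range(beats):
                    if covered_by_enumeration(b, s, c, w) != covered_simplified(b, s, c, w):
                        return False
    return True
```

With LINE_BEATS = 2 (w = 1) and a start beat of 1, `beat_mask(1, 1, 1)` marks only beat 1 and `beat_mask(1, 2, 1)` marks both beats, which matches the non-wrapping end beat of 3 described in the comment.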
diff --git a/hw/vendor/lowrisc_ibex/formal/icache/formal_tb_frag.svh b/hw/vendor/lowrisc_ibex/formal/icache/formal_tb_frag.svh
new file mode 100644
index 0000000..22886f0
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/formal/icache/formal_tb_frag.svh
@@ -0,0 +1,26 @@
+// Copyright lowRISC contributors.
+// Licensed under the Apache License, Version 2.0, see LICENSE for details.
+// SPDX-License-Identifier: Apache-2.0
+
+// A fragment of SystemVerilog code that is inserted into the ICache. We're using this to emulate
+// missing bind support, so this file should do nothing but instantiate a module.
+//
+// Using a wildcard (.*) for ports allows the testbench to inspect internal signals of the cache.
+
+formal_tb #(
+ .BusWidth (BusWidth),
+ .CacheSizeBytes (CacheSizeBytes),
+ .ICacheECC (ICacheECC),
+ .LineSize (LineSize),
+ .NumWays (NumWays),
+ .SpecRequest (SpecRequest),
+ .BranchCache (BranchCache),
+
+ .ADDR_W (ADDR_W),
+ .NUM_FB (NUM_FB),
+ .LINE_W (LINE_W),
+ .BUS_BYTES (BUS_BYTES),
+ .BUS_W (BUS_W),
+ .LINE_BEATS (LINE_BEATS),
+ .LINE_BEATS_W (LINE_BEATS_W)
+) tb_i (.*);
diff --git a/hw/vendor/lowrisc_ibex/formal/icache/ibex_icache_fpv.core b/hw/vendor/lowrisc_ibex/formal/icache/ibex_icache_fpv.core
new file mode 100644
index 0000000..b1cda08
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/formal/icache/ibex_icache_fpv.core
@@ -0,0 +1,73 @@
+CAPI=2:
+# Copyright lowRISC contributors.
+# Licensed under the Apache License, Version 2.0, see LICENSE for details.
+# SPDX-License-Identifier: Apache-2.0
+
+name: "lowrisc:fpv:ibex_icache_fpv:0.1"
+description: "Formal properties for Ibex ICache"
+
+filesets:
+ all:
+ depend:
+ - lowrisc:ibex:ibex_icache
+ - lowrisc:prim:assert
+ files:
+ - run.sby : {file_type: sbyConfig}
+ - formal_tb_frag.svh : {file_type: systemVerilogSource, is_include_file: true}
+ - formal_tb.sv : {file_type: systemVerilogSource}
+ - sv2v_in_place.py : { copyto: sv2v_in_place.py }
+
+scripts:
+ sv2v_in_place:
+ cmd:
+ - python3
+ - sv2v_in_place.py
+ - --incdir-list=incdirs.txt
+ # A bit of a hack: The primitives directory (vendored from OpenTitan)
+ # contains SystemVerilog code that has proper SVA assertions, using
+ # things like the |-> operator.
+ #
+ # The Yosys-style prim_assert.sv assertions are immediate, rather than
+ # concurrent. Such assertions only allow expressions (not full property
+ # specifiers), which cause a syntax error if you try to use them with
+ # the assertions in the primitives directory.
+ #
+ # Since we don't care about those assertions here, we want to strip
+ # them out. The code that selects an assertion backend in
+ # prim_assert.sv doesn't have an explicit "NO_ASSERTIONS" mode, but
+ # "SYNTHESIS" implies the same thing, so we use that.
+ - --define-if=prim:SYNTHESIS
+ - -DYOSYS
+ - -DFORMAL
+ - -v
+ - files.txt
+
+parameters:
+ ICacheECC:
+ datatype: int
+ default: 0
+ paramtype: vlogparam
+ description: "Enable ECC protection in instruction cache"
+
+targets:
+ prove: &prove
+ parameters:
+ - ICacheECC
+ hooks:
+ pre_build:
+ - sv2v_in_place
+ filesets:
+ - all
+ toplevel: ibex_icache
+ default_tool: symbiyosys
+ tools:
+ symbiyosys:
+ tasknames:
+ - prove
+
+ cover:
+ <<: *prove
+ tools:
+ symbiyosys:
+ tasknames:
+ - cover
diff --git a/hw/vendor/lowrisc_ibex/formal/icache/run.sby b/hw/vendor/lowrisc_ibex/formal/icache/run.sby
new file mode 100644
index 0000000..3505275
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/formal/icache/run.sby
@@ -0,0 +1,27 @@
+# Copyright lowRISC contributors.
+# Licensed under the Apache License, Version 2.0, see LICENSE for details.
+# SPDX-License-Identifier: Apache-2.0
+
+[tasks]
+prove pf
+cover cv
+
+[options]
+pf: mode prove
+pf: depth 3
+
+cv: mode cover
+cv: depth 32
+
+[engines]
+smtbmc boolector
+
+[script]
+read -sv @INPUT@
+
+# Our formal properties are currently just about control logic, which
+# isn't affected by the exact behaviour of the memories in the design.
+# Blackbox them.
+blackbox $abstract\prim_generic_ram_1p
+
+prep -top ibex_icache
diff --git a/hw/vendor/lowrisc_ibex/formal/icache/sv2v_in_place.py b/hw/vendor/lowrisc_ibex/formal/icache/sv2v_in_place.py
new file mode 100644
index 0000000..b8394a9
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/formal/icache/sv2v_in_place.py
@@ -0,0 +1,190 @@
+#!/usr/bin/env python3
+# Copyright lowRISC contributors.
+# Licensed under the Apache License, Version 2.0, see LICENSE for details.
+# SPDX-License-Identifier: Apache-2.0
+
+import argparse
+import logging
+import os
+import re
+import shlex
+import shutil
+import subprocess
+import tempfile
+from typing import List, Pattern, Tuple
+
+
+def read_file_list(path: str) -> List[str]:
+ '''Read in a list of paths from a file, one per line.'''
+ ret = []
+ with open(path) as handle:
+ for line in handle:
+ ret.append(line.strip())
+ return ret
+
+
+def transform_one(sv2v: str,
+ defines: List[str],
+ incdirs: List[str],
+ pkg_paths: List[str],
+ sv_path: str,
+ dst_path: str) -> None:
+ '''Run sv2v on sv_path, writing the converted output to dst_path'''
+ defines_args = ['--define=' + d for d in defines]
+ incdirs_args = ['--incdir=' + d for d in incdirs]
+ paths = pkg_paths + ([] if sv_path in pkg_paths else [sv_path])
+
+ cmd = ([sv2v,
+ # Pass --exclude=assert to tell sv2v not to strip out assertions.
+ # Since the whole point of this flow is to prove assertions, we
+ # need to leave them unscathed!
+ '--exclude=assert'] +
+ defines_args +
+ incdirs_args +
+ paths)
+ logging.info('Running sv2v on {}'.format(sv_path))
+ logging.debug('Command: {}'.format(cmd))
+ with open(dst_path, 'w') as dst_file:
+ proc = subprocess.run(cmd, stdout=dst_file)
+ if proc.returncode != 0:
+ cmd_str = ' '.join([shlex.quote(a) for a in cmd])
+ raise RuntimeError('Failed to run sv2v on {}. '
+ 'Exit code: {}. Full command: {}'
+ .format(sv_path, proc.returncode, cmd_str))
+
+
+def parse_define_if(arg: str) -> Tuple[Pattern[str], str]:
+ '''Handle a --define-if argument'''
+ parts = arg.rsplit(':', 1)
+ if len(parts) != 2:
+ msg = ('The --define-if argument {!r} contains no colon. The correct '
+ 'syntax is "--define-if regex:define".'
+ .format(arg))
+ raise argparse.ArgumentTypeError(msg)
+
+ re_str, define = parts
+ try:
+ return (re.compile(re_str), define)
+ except re.error as err:
+ raise argparse.ArgumentTypeError('The regex for the --define-if '
+ 'argument ({!r}) is malformed: {}.'
+ .format(re_str, err))
+
+
+def transform(sv2v: str,
+ defines: List[str],
+ defines_if: List[Tuple[Pattern[str], str]],
+ incdirs: List[str],
+ pkg_paths: List[str],
+ sv_paths: List[str]) -> None:
+ '''Run sv2v to transform a list of files in-place'''
+ with tempfile.TemporaryDirectory() as tmpdir:
+ # First write each file to a file in a temporary directory, then copy
+ # everything back. We have to do it like this because otherwise we
+ # might trash a file that needs to be included by a later one.
+ dst_paths = []
+ for idx, src_path in enumerate(sv_paths):
+ dst_path = os.path.join(tmpdir, str(idx))
+
+ extra_file_defines = []
+ for regex, define in defines_if:
+ if regex.search(src_path):
+ extra_file_defines.append(define)
+
+ transform_one(sv2v, defines + extra_file_defines,
+ incdirs, pkg_paths, src_path, dst_path)
+ dst_paths.append(dst_path)
+
+ # Now copy everything back, overwriting the original code
+ for dst_path, src_path in zip(dst_paths, sv_paths):
+ shutil.copy(dst_path, src_path)
+
+
+def main() -> int:
+ parser = argparse.ArgumentParser()
+ parser.add_argument('file_list',
+ help=('File containing a list of '
+ 'paths on which to work.'))
+ parser.add_argument('--verbose', '-v', action='store_true',
+ help="Log messages about what we're doing.")
+ parser.add_argument('--define', '-D', action='append', dest='defines',
+ default=[],
+ help='Add a preprocessor define.')
+ parser.add_argument('--define-if', action='append',
+ dest='defines_if', type=parse_define_if, default=[],
+ help=('Add a preprocessor define which applies to '
+ 'specific files. For example '
+ '--define-if=foo:bar would define `bar on any '
+ 'files whose paths contained a match for the '
+ 'regex "foo".'))
+ parser.add_argument('--incdir', '-I', action='append', dest='incdirs',
+ default=[],
+ help='Add an include dir for the preprocessor.')
+ parser.add_argument('--incdir-list',
+ help=('Specify a file containing a list of include '
+ 'directories (which are appended to any defined '
+ 'through the --incdir argument).'))
+ parser.add_argument('--sv2v',
+ default='sv2v',
+ help=("Specify the name or path of the sv2v binary. "
+ "Defaults to 'sv2v'."))
+
+ args = parser.parse_args()
+
+ if args.verbose:
+ logging.basicConfig(level=logging.INFO)
+
+ try:
+ logging.info('Reading file list from {!r}.'.format(args.file_list))
+ paths = read_file_list(args.file_list)
+ except IOError:
+ logging.error('Failed to read file list from {!r}'
+ .format(args.file_list))
+ return 1
+
+ if args.incdir_list is not None:
+ try:
+ logging.info('Reading incdir list from {!r}.'
+ .format(args.incdir_list))
+ args.incdirs += read_file_list(args.incdir_list)
+ except IOError:
+ logging.error('Failed to read incdir list from {!r}'
+ .format(args.incdir_list))
+ return 1
+
+ # Find all .sv or .svh files, splitting out paths ending in "pkg.sv"
+ # specially. We treat these as packages, which are included in each sv2v
+ # conversion.
+ sv_paths = []
+ svh_paths = []
+ pkg_paths = []
+ for path in paths:
+ if os.path.splitext(path)[1] == '.sv':
+ sv_paths.append(path)
+ if os.path.splitext(path)[1] == '.svh':
+ svh_paths.append(path)
+ if path.endswith('pkg.sv'):
+ pkg_paths.append(path)
+
+ logging.info('Running sv2v in-place on {} files ({} packages).'
+ .format(len(sv_paths), len(pkg_paths)))
+
+ try:
+ transform(args.sv2v, args.defines, args.defines_if, args.incdirs,
+ pkg_paths, sv_paths)
+ except RuntimeError as err:
+ logging.error(err)
+ return 1
+
+ # Empty out any remaining .svh files: they should have been included by
+ # this point (sv2v includes a preprocessor).
+ logging.info('Splatting contents of {} .svh files.'.format(len(svh_paths)))
+ for path in svh_paths:
+ with open(path, 'w'):
+ pass
+
+ return 0
+
+
+if __name__ == '__main__':
+ exit(main())
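The per-file define selection that --define-if performs in sv2v_in_place.py can be sketched in isolation as follows. This is a simplified stand-in rather than the script's own code: it raises ValueError instead of argparse.ArgumentTypeError, the helper name `defines_for` is hypothetical, and the file paths are made up for illustration.

```python
import re
from typing import List, Pattern, Tuple


def parse_define_if(arg: str) -> Tuple[Pattern[str], str]:
    """Split 'regex:define' at the last colon, so the regex itself may contain colons."""
    regex, sep, define = arg.rpartition(':')
    if not sep:
        raise ValueError('expected "regex:define", got {!r}'.format(arg))
    return re.compile(regex), define


def defines_for(path: str,
                defines: List[str],
                defines_if: List[Tuple[Pattern[str], str]]) -> List[str]:
    """Global defines, plus every conditional define whose regex matches the path."""
    extra = [define for regex, define in defines_if if regex.search(path)]
    return defines + extra


# Hypothetical paths: one under a primitives directory, one that is not.
conditional = [parse_define_if('prim:SYNTHESIS')]
prim_defines = defines_for('vendor/prim/rtl/prim_lfsr.sv', ['YOSYS', 'FORMAL'], conditional)
core_defines = defines_for('rtl/ibex_icache.sv', ['YOSYS', 'FORMAL'], conditional)
```

Here only the path containing "prim" picks up SYNTHESIS, which is how the core file strips the concurrent assertions from the vendored primitives while leaving the ICache assertions in place.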
diff --git a/hw/vendor/lowrisc_ibex/formal/Makefile b/hw/vendor/lowrisc_ibex/formal/riscv-formal/Makefile
similarity index 100%
rename from hw/vendor/lowrisc_ibex/formal/Makefile
rename to hw/vendor/lowrisc_ibex/formal/riscv-formal/Makefile
diff --git a/hw/vendor/lowrisc_ibex/formal/README.md b/hw/vendor/lowrisc_ibex/formal/riscv-formal/README.md
similarity index 81%
rename from hw/vendor/lowrisc_ibex/formal/README.md
rename to hw/vendor/lowrisc_ibex/formal/riscv-formal/README.md
index b19988e..b1007b5 100644
--- a/hw/vendor/lowrisc_ibex/formal/README.md
+++ b/hw/vendor/lowrisc_ibex/formal/riscv-formal/README.md
@@ -29,8 +29,8 @@
Run the following command from the top level directory of Ibex to create the Verilog source.
```console
-make -C formal
+make -C formal/riscv-formal
```
-This will create a directory *formal/build* which contains an equivalent Verilog file for each SystemVerilog source.
-The single output file *formal/ibex.v* contains the complete Ibex source, which can then be imported by riscv-formal.
+This will create a directory *formal/riscv-formal/build* which contains an equivalent Verilog file for each SystemVerilog source.
+The single output file *formal/riscv-formal/ibex.v* contains the complete Ibex source, which can then be imported by riscv-formal.
diff --git a/hw/vendor/lowrisc_ibex/ibex_configs.yaml b/hw/vendor/lowrisc_ibex/ibex_configs.yaml
index 6929140..ed0913c 100644
--- a/hw/vendor/lowrisc_ibex/ibex_configs.yaml
+++ b/hw/vendor/lowrisc_ibex/ibex_configs.yaml
@@ -10,7 +10,7 @@
small:
RV32E : 0
RV32M : 1
- RV32B : 0
+ RV32B : "ibex_pkg::RV32BNone"
BranchTargetALU : 0
WritebackStage : 0
MultiplierImplementation : "fast"
@@ -28,7 +28,7 @@
experimental-maxperf:
RV32E : 0
RV32M : 1
- RV32B : 0
+ RV32B : "ibex_pkg::RV32BNone"
BranchTargetALU : 1
WritebackStage : 1
MultiplierImplementation : "single-cycle"
@@ -40,7 +40,7 @@
experimental-maxperf-pmp:
RV32E : 0
RV32M : 1
- RV32B : 0
+ RV32B : "ibex_pkg::RV32BNone"
BranchTargetALU : 1
WritebackStage : 1
MultiplierImplementation : "single-cycle"
@@ -48,14 +48,27 @@
PMPGranularity : 0
PMPNumRegions : 16
-# experimental-maxperf-pmp config above with bitmanip extension
-experimental-maxperf-pmp-bm:
+# experimental-maxperf-pmp config above with balanced bitmanip extension
+experimental-maxperf-pmp-bmbalanced:
RV32E : 0
RV32M : 1
- RV32B : 1
+ RV32B : "ibex_pkg::RV32BBalanced"
BranchTargetALU : 1
WritebackStage : 1
MultiplierImplementation : "single-cycle"
PMPEnable : 1
PMPGranularity : 0
PMPNumRegions : 16
+
+# experimental-maxperf-pmp config above with full bitmanip extension
+experimental-maxperf-pmp-bmfull:
+ RV32E : 0
+ RV32M : 1
+ RV32B : "ibex_pkg::RV32BFull"
+ BranchTargetALU : 1
+ WritebackStage : 1
+ MultiplierImplementation : "single-cycle"
+ PMPEnable : 1
+ PMPGranularity : 0
+ PMPNumRegions : 16
+
diff --git a/hw/vendor/lowrisc_ibex/ibex_core.core b/hw/vendor/lowrisc_ibex/ibex_core.core
index 0c45f59..dce96f4 100644
--- a/hw/vendor/lowrisc_ibex/ibex_core.core
+++ b/hw/vendor/lowrisc_ibex/ibex_core.core
@@ -3,15 +3,14 @@
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
name: "lowrisc:ibex:ibex_core:0.1"
-description: "CPU core with 2 stage pipeline implementing the RV32IMC_Zicsr_Zifencei ISA"
+description: "Ibex, a small RV32 CPU core"
filesets:
files_rtl:
depend:
- lowrisc:prim:assert
- # TODO: Only lfsr is needed. Replace with a more specific dependency
- # once available.
- - lowrisc:prim:all
+ - lowrisc:prim:clock_gating
+ - lowrisc:prim:lfsr
- lowrisc:ibex:ibex_pkg
- lowrisc:ibex:ibex_icache
files:
@@ -40,14 +39,14 @@
- rtl/ibex_core.sv
file_type: systemVerilogSource
- files_lint:
- depend:
- - lowrisc:ibex:sim_shared
-
files_lint_verilator:
files:
- lint/verilator_waiver.vlt: {file_type: vlt}
+ files_lint_verible:
+ files:
+ - lint/verible_waiver.vbw: {file_type: veribleLintWaiver}
+
files_check_tool_requirements:
depend:
- lowrisc:tool:check_tool_requirements
@@ -72,9 +71,10 @@
paramtype: vlogparam
RV32B:
- datatype: int
- default: 0
- paramtype: vlogparam
+ datatype: str
+ default: ibex_pkg::RV32BNone
+ paramtype: vlogdefine
+ description: "Bitmanip implementation parameter enum. See ibex_pkg.sv (EXPERIMENTAL)"
MultiplierImplementation:
datatype: str
@@ -131,22 +131,19 @@
description: "Number of PMP regions"
targets:
- default:
+ default: &default_target
filesets:
- tool_verilator ? (files_lint_verilator)
+ - tool_veriblelint ? (files_lint_verible)
- files_rtl
- files_check_tool_requirements
+ toplevel: ibex_core
lint:
- filesets:
- - tool_verilator ? (files_lint_verilator)
- - files_rtl
- - files_lint
- - files_check_tool_requirements
+ <<: *default_target
parameters:
- SYNTHESIS=true
- RVFI=true
default_tool: verilator
- toplevel: ibex_core
tools:
verilator:
mode: lint-only
@@ -155,10 +152,6 @@
# RAM primitives wider than 64bit (required for ECC) fail to build in
# Verilator without increasing the unroll count (see Verilator#1266)
- "--unroll-count 72"
- veriblelint:
- ruleset: default
- rules:
- - "-parameter-name-style"
format:
filesets:
- files_rtl
diff --git a/hw/vendor/lowrisc_ibex/ibex_core_tracing.core b/hw/vendor/lowrisc_ibex/ibex_core_tracing.core
index 619436b..f568cb1 100644
--- a/hw/vendor/lowrisc_ibex/ibex_core_tracing.core
+++ b/hw/vendor/lowrisc_ibex/ibex_core_tracing.core
@@ -3,7 +3,7 @@
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
name: "lowrisc:ibex:ibex_core_tracing:0.1"
-description: "Ibex CPU core with tracing enabled"
+description: "Ibex, a small RV32 CPU core with tracing enabled"
filesets:
files_rtl:
depend:
@@ -13,14 +13,6 @@
- rtl/ibex_core_tracing.sv
file_type: systemVerilogSource
- files_lint:
- depend:
- - lowrisc:ibex:sim_shared
-
- files_lint_verilator:
- files:
- - lint/verilator_waiver.vlt: {file_type: vlt}
-
parameters:
# The tracer uses the RISC-V Formal Interface (RVFI) to collect trace signals.
RVFI:
@@ -43,9 +35,10 @@
paramtype: vlogparam
RV32B:
- datatype: int
- default: 0
- paramtype: vlogparam
+ datatype: str
+ default: ibex_pkg::RV32BNone
+ paramtype: vlogdefine
+ description: "Bitmanip implementation parameter enum. See ibex_pkg.sv (EXPERIMENTAL)"
MultiplierImplementation:
datatype: str
@@ -102,20 +95,15 @@
description: "Number of PMP regions"
targets:
- default:
+ default: &default_target
filesets:
- files_rtl
parameters:
- RVFI=true
+ toplevel: ibex_core_tracing
lint:
- filesets:
- # Note on Verilator waivers:
- # You *must* include the waiver file first, otherwise only global waivers
- # are applied, but not file-specific waivers.
- - tool_verilator ? (files_lint_verilator)
- - files_rtl
- - files_lint
+ <<: *default_target
parameters:
- RVFI=true
- SYNTHESIS=true
@@ -130,7 +118,6 @@
- PMPGranularity
- PMPNumRegions
default_tool: verilator
- toplevel: ibex_core_tracing
tools:
verilator:
mode: lint-only
@@ -139,10 +126,6 @@
# RAM primitives wider than 64bit (required for ECC) fail to build in
# Verilator without increasing the unroll count (see Verilator#1266)
- "--unroll-count 72"
- veriblelint:
- ruleset: default
- rules:
- - "-parameter-name-style"
format:
filesets:
- files_rtl
diff --git a/hw/vendor/lowrisc_ibex/ibex_icache.core b/hw/vendor/lowrisc_ibex/ibex_icache.core
index b3fee9c..c093ba2 100644
--- a/hw/vendor/lowrisc_ibex/ibex_icache.core
+++ b/hw/vendor/lowrisc_ibex/ibex_icache.core
@@ -3,12 +3,13 @@
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
name: "lowrisc:ibex:ibex_icache:0.1"
-description: "IBEX_ICACHE DV sim target"
+description: "Ibex instruction cache"
filesets:
files_rtl:
depend:
- lowrisc:prim:secded
- lowrisc:prim:ram_1p
+ - lowrisc:prim:assert
files:
- rtl/ibex_icache.sv
file_type: systemVerilogSource
diff --git a/hw/vendor/lowrisc_ibex/lint/verible_waiver.vbw b/hw/vendor/lowrisc_ibex/lint/verible_waiver.vbw
new file mode 100644
index 0000000..bc0cfee
--- /dev/null
+++ b/hw/vendor/lowrisc_ibex/lint/verible_waiver.vbw
@@ -0,0 +1 @@
+waive --rule=module-filename --regex=".*" --location="ibex_register_file_.+"
diff --git a/hw/vendor/lowrisc_ibex/lint/verilator_waiver.vlt b/hw/vendor/lowrisc_ibex/lint/verilator_waiver.vlt
index ee041ae..8049d65 100644
--- a/hw/vendor/lowrisc_ibex/lint/verilator_waiver.vlt
+++ b/hw/vendor/lowrisc_ibex/lint/verilator_waiver.vlt
@@ -37,12 +37,18 @@
// cleaner to write all bits even if not all are used
lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'shift_result_ext'[32]*"
-// Signal is not used for RV32B == 0: imd_val_q_i
+// Signal is not used for RV32B == RV32BNone: imd_val_q_i
//
// No ALU multicycle instructions exist to use the intermediate value register,
// if bitmanipulation extension is not enabled.
lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'imd_val_q_i'"
+// Signal is not used for RV32B == RV32BNone: butterfly_result, invbutterfly_result
+//
+// Need to be declared; referenced in unused if-generate block
+lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'butterfly_result'"
+lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'invbutterfly_result'"
+
// Bits of signal are not used: fetch_addr_n[0]
// cleaner to write all bits even if not all are used
lint_off -rule UNUSED -file "*/rtl/ibex_if_stage.sv" -match "*'fetch_addr_n'[0]*"
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_alu.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_alu.sv
index 0f59328..b72f344 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_alu.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_alu.sv
@@ -7,7 +7,7 @@
* Arithmetic logic unit
*/
module ibex_alu #(
- parameter bit RV32B = 1'b0
+ parameter ibex_pkg::rv32b_e RV32B = ibex_pkg::RV32BNone
) (
input ibex_pkg::alu_op_e operator_i,
input logic [31:0] operand_a_i,
@@ -20,9 +20,9 @@
input logic multdiv_sel_i,
- input logic [31:0] imd_val_q_i,
- output logic [31:0] imd_val_d_o,
- output logic imd_val_we_o,
+ input logic [31:0] imd_val_q_i[2],
+ output logic [31:0] imd_val_d_o[2],
+ output logic [1:0] imd_val_we_o,
output logic [31:0] adder_result_o,
output logic [33:0] adder_result_ext_o,
@@ -241,16 +241,16 @@
logic [31:0] bfp_result;
// bfp: shares the shifter structure to compute bfp_mask << bfp_off
- assign bfp_op = RV32B ? (operator_i == ALU_BFP) : 1'b0;
+ assign bfp_op = (RV32B != RV32BNone) ? (operator_i == ALU_BFP) : 1'b0;
assign bfp_len = {~(|operand_b_i[27:24]), operand_b_i[27:24]}; // len = 0 encodes for len = 16
assign bfp_off = operand_b_i[20:16];
- assign bfp_mask = RV32B ? ~(32'hffff_ffff << bfp_len) : '0;
+ assign bfp_mask = (RV32B != RV32BNone) ? ~(32'hffff_ffff << bfp_len) : '0;
for (genvar i=0; i<32; i++) begin : gen_rev_bfp_mask
assign bfp_mask_rev[i] = bfp_mask[31-i];
end
- assign bfp_result =
- RV32B ? (~shift_result & operand_a_i) | ((operand_b_i & bfp_mask) << bfp_off) : '0;
+ assign bfp_result = (RV32B != RV32BNone) ?
+ (~shift_result & operand_a_i) | ((operand_b_i & bfp_mask) << bfp_off) : '0;
// bit shift_amt[5]: word swap bit: only considered for FSL/FSR.
// if set, reverse operations in first and second cycle.
@@ -267,9 +267,8 @@
end
end
-
// single-bit mode: shift
- assign shift_sbmode = RV32B ?
+ assign shift_sbmode = (RV32B != RV32BNone) ?
(operator_i == ALU_SBSET) | (operator_i == ALU_SBCLR) | (operator_i == ALU_SBINV) : 1'b0;
// left shift if this is:
@@ -284,13 +283,13 @@
unique case (operator_i)
ALU_SLL: shift_left = 1'b1;
ALU_SLO,
- ALU_BFP: shift_left = RV32B ? 1'b1 : 1'b0;
- ALU_ROL: shift_left = RV32B ? instr_first_cycle_i : 0;
- ALU_ROR: shift_left = RV32B ? ~instr_first_cycle_i : 0;
- ALU_FSL: shift_left =
- RV32B ? (shift_amt[5] ? ~instr_first_cycle_i : instr_first_cycle_i) : 1'b0;
- ALU_FSR: shift_left =
- RV32B ? (shift_amt[5] ? instr_first_cycle_i : ~instr_first_cycle_i) : 1'b0;
+ ALU_BFP: shift_left = (RV32B != RV32BNone) ? 1'b1 : 1'b0;
+ ALU_ROL: shift_left = (RV32B != RV32BNone) ? instr_first_cycle_i : 0;
+ ALU_ROR: shift_left = (RV32B != RV32BNone) ? ~instr_first_cycle_i : 0;
+ ALU_FSL: shift_left = (RV32B != RV32BNone) ?
+ (shift_amt[5] ? ~instr_first_cycle_i : instr_first_cycle_i) : 1'b0;
+ ALU_FSR: shift_left = (RV32B != RV32BNone) ?
+ (shift_amt[5] ? instr_first_cycle_i : ~instr_first_cycle_i) : 1'b0;
default: shift_left = 1'b0;
endcase
if (shift_sbmode) begin
@@ -298,26 +297,26 @@
end
end
- assign shift_arith = (operator_i == ALU_SRA);
- assign shift_ones = RV32B ? (operator_i == ALU_SLO) | (operator_i == ALU_SRO) : 1'b0;
- assign shift_funnel = RV32B ? (operator_i == ALU_FSL) | (operator_i == ALU_FSR) : 1'b0;
+ assign shift_arith = (operator_i == ALU_SRA);
+ assign shift_ones =
+ (RV32B != RV32BNone) ? (operator_i == ALU_SLO) | (operator_i == ALU_SRO) : 1'b0;
+ assign shift_funnel =
+ (RV32B != RV32BNone) ? (operator_i == ALU_FSL) | (operator_i == ALU_FSR) : 1'b0;
// shifter structure.
always_comb begin
-
// select shifter input
// for bfp, sbmode and shift_left the corresponding bit-reversed input is chosen.
- if (shift_sbmode) begin
- shift_result = 32'h8000_0000; // rev(32'h1)
+ if (RV32B == RV32BNone) begin
+ shift_result = shift_left ? operand_a_rev : operand_a_i;
end else begin
unique case (1'b1)
bfp_op: shift_result = bfp_mask_rev;
- shift_left: shift_result = operand_a_rev;
- default: shift_result = operand_a_i;
+ shift_sbmode: shift_result = 32'h8000_0000;
+ default: shift_result = shift_left ? operand_a_rev : operand_a_i;
endcase
end
-
shift_result_ext =
$signed({shift_ones | (shift_arith & shift_result[31]), shift_result}) >>> shift_amt[4:0];
@@ -350,8 +349,8 @@
// Logic-with-negate OPs (RV32B Ops)
ALU_XNOR,
ALU_ORN,
- ALU_ANDN: bwlogic_op_b_negate = RV32B ? 1'b1 : 1'b0;
- ALU_CMIX: bwlogic_op_b_negate = RV32B ? ~instr_first_cycle_i : 1'b0;
+ ALU_ANDN: bwlogic_op_b_negate = (RV32B != RV32BNone) ? 1'b1 : 1'b0;
+ ALU_CMIX: bwlogic_op_b_negate = (RV32B != RV32BNone) ? ~instr_first_cycle_i : 1'b0;
default: bwlogic_op_b_negate = 1'b0;
endcase
end
@@ -373,19 +372,19 @@
endcase
end
+ logic [5:0] bitcnt_result;
+ logic [31:0] minmax_result;
+ logic [31:0] pack_result;
+ logic [31:0] sext_result;
+ logic [31:0] singlebit_result;
+ logic [31:0] rev_result;
logic [31:0] shuffle_result;
logic [31:0] butterfly_result;
logic [31:0] invbutterfly_result;
-
- logic [31:0] minmax_result;
- logic [5:0] bitcnt_result;
- logic [31:0] pack_result;
- logic [31:0] sext_result;
- logic [31:0] multicycle_result;
- logic [31:0] singlebit_result;
logic [31:0] clmul_result;
+ logic [31:0] multicycle_result;
- if (RV32B) begin : g_alu_rvb
+ if (RV32B != RV32BNone) begin : g_alu_rvb
/////////////////
// Bitcounting //
@@ -404,6 +403,8 @@
logic [31:0] bitcnt_mask_op;
logic [31:0] bitcnt_bit_mask;
logic [ 5:0] bitcnt_partial [32];
+ logic [31:0] bitcnt_partial_lsb_d;
+ logic [31:0] bitcnt_partial_msb_d;
assign bitcnt_ctz = operator_i == ALU_CTZ;
@@ -427,6 +428,8 @@
bitcnt_bit_mask = ~bitcnt_bit_mask;
end
+ assign zbe_op = (operator_i == ALU_BEXT) | (operator_i == ALU_BDEP);
+
always_comb begin
case(1'b1)
zbe_op: bitcnt_bits = operand_b_i;
@@ -518,523 +521,11 @@
end
///////////////
- // Butterfly //
- ///////////////
-
- // The butterfly / inverse butterfly network is shared between bext/bdep (zbe)instructions
- // respectively and grev / gorc instructions (zbp).
- // For bdep, the control bits mask of a local left region is generated by
- // the inverse of a n-bit left rotate and complement upon wrap (LROTC) operation by the number
- // of ones in the deposit bitmask to the right of the segment. n hereby denotes the width
- // of the according segment. The bitmask for a pertaining local right region is equal to the
- // corresponding local left region. Bext uses an analogue inverse process.
- // Consider the following 8-bit example. For details, see Hilewitz et al. "Fast Bit Gather,
- // Bit Scatter and Bit Permuation Instructions for Commodity Microprocessors", (2008).
-
- // 8-bit example: (Hilewitz et al.)
- // Consider the instruction bdep operand_a_i deposit_mask
- // Let operand_a_i = 8'babcd_efgh
- // deposit_mask = 8'b1010_1101
- //
- // control bitmask for stage 1:
- // - number of ones in the right half of the deposit bitmask: 3
- // - width of the segment: 4
- // - control bitmask = ~LROTC(4'b0, 3)[3:0] = 4'b1000
- //
- // control bitmask: c3 c2 c1 c0 c3 c2 c1 c0
- // 1 0 0 0 1 0 0 0
- // <- L -----> <- R ----->
- // operand_a_i a b c d e f g h
- // :\ | | | /: | | |
- // : +|---|--|-+ : | | |
- // :/ | | | \: | | |
- // stage 1 e b c d a f g h
- // <L-> <R-> <L-> <R->
- // control bitmask: c3 c2 c3 c2 c1 c0 c1 c0
- // 1 1 1 1 1 0 1 0
- // :\ :\ /: /: :\ | /: |
- // : +:-+-:+ : : +|-+ : |
- // :/ :/ \: \: :/ | \: |
- // stage 2 c d e b g f a h
- // L R L R L R L R
- // control bitmask: c3 c3 c2 c2 c1 c1 c0 c0
- // 1 1 0 0 1 1 0 0
- // :\/: | | :\/: | |
- // : : | | : : | |
- // :/\: | | :/\: | |
- // stage 3 d c e b f g a h
- // & deposit bitmask: 1 0 1 0 1 1 0 1
- // result: d 0 e 0 f g 0 h
-
- assign zbe_op = (operator_i == ALU_BEXT) | (operator_i == ALU_BDEP);
-
- logic [31:0] butterfly_mask_l[5];
- logic [31:0] butterfly_mask_r[5];
- logic [31:0] butterfly_mask_not[5];
- logic [31:0] lrotc_stage [5]; // left rotate and complement upon wrap
-
- // bext / bdep
- logic [31:0] butterfly_zbe_mask_l[5];
- logic [31:0] butterfly_zbe_mask_r[5];
- logic [31:0] butterfly_zbe_mask_not[5];
-
- // grev / gorc
- logic [31:0] butterfly_zbp_mask_l[5];
- logic [31:0] butterfly_zbp_mask_r[5];
- logic [31:0] butterfly_zbp_mask_not[5];
-
- logic grev_op;
- logic gorc_op;
- logic zbp_op;
-
- // number of bits in local r = 32 / 2**(stage + 1) = 16/2**stage
- `define _N(stg) (16 >> stg)
-
- // bext / bdep control bit generation
- for (genvar stg=0; stg<5; stg++) begin : gen_stage
- // number of segs: 2** stg
- for (genvar seg=0; seg<2**stg; seg++) begin : gen_segment
-
- assign lrotc_stage[stg][2*`_N(stg)*(seg+1)-1 : 2*`_N(stg)*seg] =
- {{`_N(stg){1'b0}},{`_N(stg){1'b1}}} <<
- bitcnt_partial[`_N(stg)*(2*seg+1)-1][$clog2(`_N(stg)):0];
-
- assign butterfly_zbe_mask_l[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)]
- = ~lrotc_stage[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)];
-
- assign butterfly_zbe_mask_r[stg][`_N(stg)*(2*seg+1)-1 : `_N(stg)*(2*seg)]
- = ~lrotc_stage[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)];
-
- assign butterfly_zbe_mask_l[stg][`_N(stg)*(2*seg+1)-1 : `_N(stg)*(2*seg)] = '0;
- assign butterfly_zbe_mask_r[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)] = '0;
- end
- end
- `undef _N
-
- for (genvar stg=0; stg<5; stg++) begin : gen_zbe_mask
- assign butterfly_zbe_mask_not[stg] =
- ~(butterfly_zbe_mask_l[stg] | butterfly_zbe_mask_r[stg]);
- end
-
- // grev / gorc control bit generation
- assign butterfly_zbp_mask_l[0] = shift_amt[4] ? 32'hffff_0000 : 32'h0000_0000;
- assign butterfly_zbp_mask_r[0] = shift_amt[4] ? 32'h0000_ffff : 32'h0000_0000;
- assign butterfly_zbp_mask_not[0] =
- !shift_amt[4] || (shift_amt[4] && gorc_op) ? 32'hffff_ffff : 32'h0000_0000;
-
- assign butterfly_zbp_mask_l[1] = shift_amt[3] ? 32'hff00_ff00 : 32'h0000_0000;
- assign butterfly_zbp_mask_r[1] = shift_amt[3] ? 32'h00ff_00ff : 32'h0000_0000;
- assign butterfly_zbp_mask_not[1] =
- !shift_amt[3] || (shift_amt[3] && gorc_op) ? 32'hffff_ffff : 32'h0000_0000;
-
- assign butterfly_zbp_mask_l[2] = shift_amt[2] ? 32'hf0f0_f0f0 : 32'h0000_0000;
- assign butterfly_zbp_mask_r[2] = shift_amt[2] ? 32'h0f0f_0f0f : 32'h0000_0000;
- assign butterfly_zbp_mask_not[2] =
- !shift_amt[2] || (shift_amt[2] && gorc_op) ? 32'hffff_ffff : 32'h0000_0000;
-
- assign butterfly_zbp_mask_l[3] = shift_amt[1] ? 32'hcccc_cccc : 32'h0000_0000;
- assign butterfly_zbp_mask_r[3] = shift_amt[1] ? 32'h3333_3333 : 32'h0000_0000;
- assign butterfly_zbp_mask_not[3] =
- !shift_amt[1] || (shift_amt[1] && gorc_op) ? 32'hffff_ffff : 32'h0000_0000;
-
- assign butterfly_zbp_mask_l[4] = shift_amt[0] ? 32'haaaa_aaaa : 32'h0000_0000;
- assign butterfly_zbp_mask_r[4] = shift_amt[0] ? 32'h5555_5555 : 32'h0000_0000;
- assign butterfly_zbp_mask_not[4] =
- !shift_amt[0] || (shift_amt[0] && gorc_op) ? 32'hffff_ffff : 32'h0000_0000;
-
- // grev / gorc instructions
- assign grev_op = RV32B ? (operator_i == ALU_GREV) : 1'b0;
- assign gorc_op = RV32B ? (operator_i == ALU_GORC) : 1'b0;
- assign zbp_op = grev_op | gorc_op;
-
- // select set of masks:
- assign butterfly_mask_l = zbp_op ? butterfly_zbp_mask_l : butterfly_zbe_mask_l;
- assign butterfly_mask_r = zbp_op ? butterfly_zbp_mask_r : butterfly_zbe_mask_r;
- assign butterfly_mask_not = zbp_op ? butterfly_zbp_mask_not : butterfly_zbe_mask_not;
-
- always_comb begin
- butterfly_result = operand_a_i;
-
- butterfly_result = butterfly_result & butterfly_mask_not[0] |
- ((butterfly_result & butterfly_mask_l[0]) >> 16)|
- ((butterfly_result & butterfly_mask_r[0]) << 16);
-
- butterfly_result = butterfly_result & butterfly_mask_not[1] |
- ((butterfly_result & butterfly_mask_l[1]) >> 8)|
- ((butterfly_result & butterfly_mask_r[1]) << 8);
-
- butterfly_result = butterfly_result & butterfly_mask_not[2] |
- ((butterfly_result & butterfly_mask_l[2]) >> 4)|
- ((butterfly_result & butterfly_mask_r[2]) << 4);
-
- butterfly_result = butterfly_result & butterfly_mask_not[3] |
- ((butterfly_result & butterfly_mask_l[3]) >> 2)|
- ((butterfly_result & butterfly_mask_r[3]) << 2);
-
- butterfly_result = butterfly_result & butterfly_mask_not[4] |
- ((butterfly_result & butterfly_mask_l[4]) >> 1)|
- ((butterfly_result & butterfly_mask_r[4]) << 1);
-
- if (!zbp_op) begin
- butterfly_result = butterfly_result & operand_b_i;
- end
- end
-
- always_comb begin
- invbutterfly_result = operand_a_i & operand_b_i;
-
- invbutterfly_result = invbutterfly_result & butterfly_mask_not[4] |
- ((invbutterfly_result & butterfly_mask_l[4]) >> 1)|
- ((invbutterfly_result & butterfly_mask_r[4]) << 1);
-
- invbutterfly_result = invbutterfly_result & butterfly_mask_not[3] |
- ((invbutterfly_result & butterfly_mask_l[3]) >> 2)|
- ((invbutterfly_result & butterfly_mask_r[3]) << 2);
-
- invbutterfly_result = invbutterfly_result & butterfly_mask_not[2] |
- ((invbutterfly_result & butterfly_mask_l[2]) >> 4)|
- ((invbutterfly_result & butterfly_mask_r[2]) << 4);
-
- invbutterfly_result = invbutterfly_result & butterfly_mask_not[1] |
- ((invbutterfly_result & butterfly_mask_l[1]) >> 8)|
- ((invbutterfly_result & butterfly_mask_r[1]) << 8);
-
- invbutterfly_result = invbutterfly_result & butterfly_mask_not[0] |
- ((invbutterfly_result & butterfly_mask_l[0]) >> 16)|
- ((invbutterfly_result & butterfly_mask_r[0]) << 16);
- end
-
- /////////////////////////
- // Shuffle / Unshuffle //
- /////////////////////////
-
- localparam logic [31:0] SHUFFLE_MASK_L [4] =
- '{32'h4444_4444, 32'h3030_3030, 32'h0f00_0f00, 32'h00ff_0000};
- localparam logic [31:0] SHUFFLE_MASK_R [4] =
- '{32'h2222_2222, 32'h0c0c_0c0c, 32'h00f0_00f0, 32'h0000_ff00};
-
- localparam logic [31:0] FLIP_MASK_L [4] =
- '{32'h1100_0000, 32'h4411_0000, 32'h0044_0000, 32'h2200_1100};
- localparam logic [31:0] FLIP_MASK_R [4] =
- '{32'h0000_0088, 32'h0000_8822, 32'h0000_2200, 32'h0088_0044};
-
- logic [31:0] SHUFFLE_MASK_NOT [4];
- for(genvar i = 0; i < 4; i++) begin : gen_shuffle_mask_not
- assign SHUFFLE_MASK_NOT[i] = ~(SHUFFLE_MASK_L[i] | SHUFFLE_MASK_R[i]);
- end
-
- logic shuffle_flip;
- assign shuffle_flip = operator_i == ALU_UNSHFL;
-
- logic [3:0] shuffle_mode;
-
- always_comb begin
- shuffle_result = operand_a_i;
-
- if (shuffle_flip) begin
- shuffle_mode[3] = shift_amt[0];
- shuffle_mode[2] = shift_amt[1];
- shuffle_mode[1] = shift_amt[2];
- shuffle_mode[0] = shift_amt[3];
- end else begin
- shuffle_mode = shift_amt[3:0];
- end
-
- if (shuffle_flip) begin
- shuffle_result = (shuffle_result & 32'h8822_4411) |
- ((shuffle_result << 6) & FLIP_MASK_L[0]) | ((shuffle_result >> 6) & FLIP_MASK_R[0]) |
- ((shuffle_result << 9) & FLIP_MASK_L[1]) | ((shuffle_result >> 9) & FLIP_MASK_R[1]) |
- ((shuffle_result << 15) & FLIP_MASK_L[2]) | ((shuffle_result >> 15) & FLIP_MASK_R[2]) |
- ((shuffle_result << 21) & FLIP_MASK_L[3]) | ((shuffle_result >> 21) & FLIP_MASK_R[3]);
- end
-
- if (shuffle_mode[3]) begin
- shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[0]) |
- (((shuffle_result << 8) & SHUFFLE_MASK_L[0]) |
- ((shuffle_result >> 8) & SHUFFLE_MASK_R[0]));
- end
- if (shuffle_mode[2]) begin
- shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[1]) |
- (((shuffle_result << 4) & SHUFFLE_MASK_L[1]) |
- ((shuffle_result >> 4) & SHUFFLE_MASK_R[1]));
- end
- if (shuffle_mode[1]) begin
- shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[2]) |
- (((shuffle_result << 2) & SHUFFLE_MASK_L[2]) |
- ((shuffle_result >> 2) & SHUFFLE_MASK_R[2]));
- end
- if (shuffle_mode[0]) begin
- shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[3]) |
- (((shuffle_result << 1) & SHUFFLE_MASK_L[3]) |
- ((shuffle_result >> 1) & SHUFFLE_MASK_R[3]));
- end
-
- if (shuffle_flip) begin
- shuffle_result = (shuffle_result & 32'h8822_4411) |
- ((shuffle_result << 6) & FLIP_MASK_L[0]) | ((shuffle_result >> 6) & FLIP_MASK_R[0]) |
- ((shuffle_result << 9) & FLIP_MASK_L[1]) | ((shuffle_result >> 9) & FLIP_MASK_R[1]) |
- ((shuffle_result << 15) & FLIP_MASK_L[2]) | ((shuffle_result >> 15) & FLIP_MASK_R[2]) |
- ((shuffle_result << 21) & FLIP_MASK_L[3]) | ((shuffle_result >> 21) & FLIP_MASK_R[3]);
- end
-
- end
- ///////////////////////////////////////////////////
- // Carry-less Multiply + Cyclic Redundancy Check //
- ///////////////////////////////////////////////////
-
- // Carry-less multiplication can be understood as multiplication based on
- // the addition interpreted as the bit-wise xor operation.
- //
- // Example: 1101 X 1011 = 1111111:
- //
- // 1011 X 1101
- // -----------
- // 1101
- // xor 1101
- // ---------
- // 10111
- // xor 0000
- // ----------
- // 010111
- // xor 1101
- // -----------
- // 1111111
- //
- // Architectural details:
- // A 32 x 32-bit array
- // [ operand_b[i] ? (operand_a << i) : '0 for i in 0 ... 31 ]
- // is generated. The entries of the array are pairwise 'xor-ed'
- // together in a 5-stage binary tree.
- //
- //
- // Cyclic Redundancy Check:
- //
- // CRC-32 (CRC-32/ISO-HDLC) and CRC-32C (CRC-32/ISCSI) are directly implemented. For
- // documentation of the crc configuration (crc-polynomials, initialization, reflection, etc.)
- // see http://reveng.sourceforge.net/crc-catalogue/all.htm
- // A useful guide to crc arithmetic and algorithms is given here:
- // http://www.piclist.com/techref/method/math/crcguide.html.
- //
- // The CRC operation solves the following equation using binary polynomial arithmetic:
- //
- // rev(rd)(x) = rev(rs1)(x) * x**n mod {1, P}(x)
- //
- // where P denotes lower 32 bits of the corresponding CRC polynomial, rev(a) the bit reversal
- // of a, n = 8,16, or 32 for .b, .h, .w -variants. {a, b} denotes bit concatenation.
- //
- // Using barret reduction, one can show that
- //
- // M(x) mod P(x) = R(x) =
- // (M(x) * x**n) & {deg(P(x)'{1'b1}}) ^ (M(x) x**-(deg(P(x) - n)) cx mu(x) cx P(x),
- //
- // Where mu(x) = polydiv(x**64, {1,P}) & 0xffffffff. Here, 'cx' refers to carry-less
- // multiplication. Substituting rev(rd)(x) for R(x) and rev(rs1)(x) for M(x) and solving for
- // rd(x) with P(x) a crc32 polynomial (deg(P(x)) = 32), we get
- //
- // rd = rev( (rev(rs1) << n) ^ ((rev(rs1) >> (32-n)) cx mu cx P)
- // = (rs1 >> n) ^ rev(rev( (rs1 << (32-n)) cx rev(mu)) cx P)
- // ^-- cycle 0--------------------^
- // ^- cycle 1 -------------------------------------------^
- //
- // In the last step we used the fact that carry-less multiplication is bit-order agnostic:
- // rev(a cx b) = rev(a) cx rev(b).
-
- logic clmul_rmode;
- logic clmul_hmode;
- logic [31:0] clmul_op_a;
- logic [31:0] clmul_op_b;
- logic [31:0] operand_b_rev;
- logic [31:0] clmul_and_stage[32];
- logic [31:0] clmul_xor_stage1[16];
- logic [31:0] clmul_xor_stage2[8];
- logic [31:0] clmul_xor_stage3[4];
- logic [31:0] clmul_xor_stage4[2];
-
- logic [31:0] clmul_result_raw;
- logic [31:0] clmul_result_rev;
-
- for (genvar i=0; i<32; i++) begin: gen_rev_operand_b
- assign operand_b_rev[i] = operand_b_i[31-i];
- end
-
- assign clmul_rmode = operator_i == ALU_CLMULR;
- assign clmul_hmode = operator_i == ALU_CLMULH;
-
- // CRC
- localparam logic [31:0] CRC32_POLYNOMIAL = 32'h04c1_1db7;
- localparam logic [31:0] CRC32_MU_REV = 32'hf701_1641;
-
- localparam logic [31:0] CRC32C_POLYNOMIAL = 32'h1edc_6f41;
- localparam logic [31:0] CRC32C_MU_REV = 32'hdea7_13f1;
-
- logic crc_op;
- logic crc_hmode;
- logic crc_bmode;
-
- logic crc_cpoly;
-
- logic [31:0] crc_operand;
- logic [31:0] crc_poly;
- logic [31:0] crc_mu_rev;
-
- assign crc_op = (operator_i == ALU_CRC32C_W) | (operator_i == ALU_CRC32_W) |
- (operator_i == ALU_CRC32C_H) | (operator_i == ALU_CRC32_H) |
- (operator_i == ALU_CRC32C_B) | (operator_i == ALU_CRC32_B);
-
- assign crc_cpoly = (operator_i == ALU_CRC32C_W) |
- (operator_i == ALU_CRC32C_H) |
- (operator_i == ALU_CRC32C_B);
-
- assign crc_hmode = (operator_i == ALU_CRC32_H) | (operator_i == ALU_CRC32C_H);
- assign crc_bmode = (operator_i == ALU_CRC32_B) | (operator_i == ALU_CRC32C_B);
-
- assign crc_poly = crc_cpoly ? CRC32C_POLYNOMIAL : CRC32_POLYNOMIAL;
- assign crc_mu_rev = crc_cpoly ? CRC32C_MU_REV : CRC32_MU_REV;
-
- always_comb begin
- unique case(1'b1)
- crc_bmode: crc_operand = {operand_a_i[7:0], 24'h0};
- crc_hmode: crc_operand = {operand_a_i[15:0], 16'h0};
- default: crc_operand = operand_a_i;
- endcase
- end
-
- // Select clmul input
- always_comb begin
- if (crc_op) begin
- clmul_op_a = instr_first_cycle_i ? crc_operand : imd_val_q_i;
- clmul_op_b = instr_first_cycle_i ? crc_mu_rev : crc_poly;
- end else begin
- clmul_op_a = clmul_rmode | clmul_hmode ? operand_a_rev : operand_a_i;
- clmul_op_b = clmul_rmode | clmul_hmode ? operand_b_rev : operand_b_i;
- end
- end
-
- for (genvar i=0; i<32; i++) begin : gen_clmul_and_op
- assign clmul_and_stage[i] = clmul_op_b[i] ? clmul_op_a << i : '0;
- end
-
- for (genvar i=0; i<16; i++) begin : gen_clmul_xor_op_l1
- assign clmul_xor_stage1[i] = clmul_and_stage[2*i] ^ clmul_and_stage[2*i+1];
- end
-
- for (genvar i=0; i<8; i++) begin : gen_clmul_xor_op_l2
- assign clmul_xor_stage2[i] = clmul_xor_stage1[2*i] ^ clmul_xor_stage1[2*i+1];
- end
-
- for (genvar i=0; i<4; i++) begin : gen_clmul_xor_op_l3
- assign clmul_xor_stage3[i] = clmul_xor_stage2[2*i] ^ clmul_xor_stage2[2*i+1];
- end
-
- for (genvar i=0; i<2; i++) begin : gen_clmul_xor_op_l4
- assign clmul_xor_stage4[i] = clmul_xor_stage3[2*i] ^ clmul_xor_stage3[2*i+1];
- end
-
- assign clmul_result_raw = clmul_xor_stage4[0] ^ clmul_xor_stage4[1];
-
- for (genvar i=0; i<32; i++) begin : gen_rev_clmul_result
- assign clmul_result_rev[i] = clmul_result_raw[31-i];
- end
-
- // clmulr_result = rev(clmul(rev(a), rev(b)))
- // clmulh_result = clmulr_result >> 1
- always_comb begin
- case(1'b1)
- clmul_rmode: clmul_result = clmul_result_rev;
- clmul_hmode: clmul_result = {1'b0, clmul_result_rev[31:1]};
- default: clmul_result = clmul_result_raw;
- endcase
- end
-
- //////////////////////////////////////
- // Multicycle Bitmanip Instructions //
- //////////////////////////////////////
- // Ternary instructions + Shift Rotations + CRC
- // For ternary instructions (zbt), operand_a_i is tied to rs1 in the first cycle and rs3 in the
- // second cycle. operand_b_i is always tied to rs2.
-
-
- always_comb begin
- unique case (operator_i)
- ALU_CMOV: begin
- imd_val_d_o = operand_a_i;
- multicycle_result = (operand_b_i == 32'h0) ? operand_a_i : imd_val_q_i;
- if (instr_first_cycle_i) begin
- imd_val_we_o = 1'b1;
- end else begin
- imd_val_we_o = 1'b0;
- end
- end
-
- ALU_CMIX: begin
- multicycle_result = imd_val_q_i | bwlogic_and_result;
- imd_val_d_o = bwlogic_and_result;
- if (instr_first_cycle_i) begin
- imd_val_we_o = 1'b1;
- end else begin
- imd_val_we_o = 1'b0;
- end
- end
-
- ALU_FSR, ALU_FSL,
- ALU_ROL, ALU_ROR: begin
- if (shift_amt[4:0] == 5'h0) begin
- multicycle_result = shift_amt[5] ? operand_a_i : imd_val_q_i;
- end else begin
- multicycle_result = imd_val_q_i | shift_result;
- end
- imd_val_d_o = shift_result;
- if (instr_first_cycle_i) begin
- imd_val_we_o = 1'b1;
- end else begin
- imd_val_we_o = 1'b0;
- end
- end
-
- ALU_CRC32_W, ALU_CRC32C_W,
- ALU_CRC32_H, ALU_CRC32C_H,
- ALU_CRC32_B, ALU_CRC32C_B: begin
- imd_val_d_o = clmul_result_rev;
- unique case(1'b1)
- crc_bmode: multicycle_result = clmul_result_rev ^ (operand_a_i >> 8);
- crc_hmode: multicycle_result = clmul_result_rev ^ (operand_a_i >> 16);
- default: multicycle_result = clmul_result_rev;
- endcase
- if (instr_first_cycle_i) begin
- imd_val_we_o = 1'b1;
- end else begin
- imd_val_we_o = 1'b0;
- end
- end
-
- default: begin
- imd_val_d_o = operand_a_i;
- imd_val_we_o = 1'b0;
- multicycle_result = operand_a_i;
- end
- endcase
- end
-
- /////////////////////////////
- // Single-bit Instructions //
- /////////////////////////////
-
- always_comb begin
- unique case (operator_i)
- ALU_SBSET: singlebit_result = operand_a_i | shift_result;
- ALU_SBCLR: singlebit_result = operand_a_i & ~shift_result;
- ALU_SBINV: singlebit_result = operand_a_i ^ shift_result;
- default: singlebit_result = {31'h0, shift_result[0]}; // ALU_SBEXT
- endcase
- end
-
- ///////////////
// Min / Max //
///////////////
assign minmax_result = cmp_result ? operand_a_i : operand_b_i;
-
//////////
// Pack //
//////////
@@ -1059,21 +550,623 @@
assign sext_result = (operator_i == ALU_SEXTB) ?
{ {24{operand_a_i[7]}}, operand_a_i[7:0]} : { {16{operand_a_i[15]}}, operand_a_i[15:0]};
+ /////////////////////////////
+ // Single-bit Instructions //
+ /////////////////////////////
+
+ always_comb begin
+ unique case (operator_i)
+ ALU_SBSET: singlebit_result = operand_a_i | shift_result;
+ ALU_SBCLR: singlebit_result = operand_a_i & ~shift_result;
+ ALU_SBINV: singlebit_result = operand_a_i ^ shift_result;
+ default: singlebit_result = {31'h0, shift_result[0]}; // ALU_SBEXT
+ endcase
+ end
+
+ ////////////////////////////////////
+ // General Reverse and Or-combine //
+ ////////////////////////////////////
+
+ // Only a subset of the general reverse and or-combine instructions is implemented in the
+ // balanced version of the B extension. Currently rev, rev8 and orc.b are supported in the
+ // base extension.
+
+ logic [4:0] zbp_shift_amt;
+ logic gorc_op;
+
+ assign gorc_op = (operator_i == ALU_GORC);
+ assign zbp_shift_amt[2:0] = (RV32B == RV32BFull) ? shift_amt[2:0] : {3{&shift_amt[2:0]}};
+ assign zbp_shift_amt[4:3] = (RV32B == RV32BFull) ? shift_amt[4:3] : {2{&shift_amt[4:3]}};
+
+ always_comb begin
+ rev_result = operand_a_i;
+
+ if (zbp_shift_amt[0]) begin
+ rev_result = (gorc_op ? rev_result : 32'h0) |
+ ((rev_result & 32'h5555_5555) << 1) |
+ ((rev_result & 32'haaaa_aaaa) >> 1);
+ end
+
+ if (zbp_shift_amt[1]) begin
+ rev_result = (gorc_op ? rev_result : 32'h0) |
+ ((rev_result & 32'h3333_3333) << 2) |
+ ((rev_result & 32'hcccc_cccc) >> 2);
+ end
+
+ if (zbp_shift_amt[2]) begin
+ rev_result = (gorc_op ? rev_result : 32'h0) |
+ ((rev_result & 32'h0f0f_0f0f) << 4) |
+ ((rev_result & 32'hf0f0_f0f0) >> 4);
+ end
+
+ if (zbp_shift_amt[3]) begin
+ rev_result = (gorc_op & (RV32B == RV32BFull) ? rev_result : 32'h0) |
+ ((rev_result & 32'h00ff_00ff) << 8) |
+ ((rev_result & 32'hff00_ff00) >> 8);
+ end
+
+ if (zbp_shift_amt[4]) begin
+ rev_result = (gorc_op & (RV32B == RV32BFull) ? rev_result : 32'h0) |
+ ((rev_result & 32'h0000_ffff) << 16) |
+ ((rev_result & 32'hffff_0000) >> 16);
+ end
+ end
+
+ logic crc_hmode;
+ logic crc_bmode;
+ logic [31:0] clmul_result_rev;
+
+ if (RV32B == RV32BFull) begin : gen_alu_rvb_full
+
+ /////////////////////////
+ // Shuffle / Unshuffle //
+ /////////////////////////
+
+ localparam logic [31:0] SHUFFLE_MASK_L [4] =
+ '{32'h00ff_0000, 32'h0f00_0f00, 32'h3030_3030, 32'h4444_4444};
+ localparam logic [31:0] SHUFFLE_MASK_R [4] =
+ '{32'h0000_ff00, 32'h00f0_00f0, 32'h0c0c_0c0c, 32'h2222_2222};
+
+ localparam logic [31:0] FLIP_MASK_L [4] =
+ '{32'h2200_1100, 32'h0044_0000, 32'h4411_0000, 32'h1100_0000};
+ localparam logic [31:0] FLIP_MASK_R [4] =
+ '{32'h0088_0044, 32'h0000_2200, 32'h0000_8822, 32'h0000_0088};
+
+ logic [31:0] SHUFFLE_MASK_NOT [4];
+ for(genvar i = 0; i < 4; i++) begin : gen_shuffle_mask_not
+ assign SHUFFLE_MASK_NOT[i] = ~(SHUFFLE_MASK_L[i] | SHUFFLE_MASK_R[i]);
+ end
+
+ logic shuffle_flip;
+ assign shuffle_flip = operator_i == ALU_UNSHFL;
+
+ logic [3:0] shuffle_mode;
+
+ always_comb begin
+ shuffle_result = operand_a_i;
+
+ if (shuffle_flip) begin
+ shuffle_mode[3] = shift_amt[0];
+ shuffle_mode[2] = shift_amt[1];
+ shuffle_mode[1] = shift_amt[2];
+ shuffle_mode[0] = shift_amt[3];
+ end else begin
+ shuffle_mode = shift_amt[3:0];
+ end
+
+ if (shuffle_flip) begin
+ shuffle_result = (shuffle_result & 32'h8822_4411) |
+ ((shuffle_result << 6) & FLIP_MASK_L[0]) | ((shuffle_result >> 6) & FLIP_MASK_R[0]) |
+ ((shuffle_result << 9) & FLIP_MASK_L[1]) | ((shuffle_result >> 9) & FLIP_MASK_R[1]) |
+ ((shuffle_result << 15) & FLIP_MASK_L[2]) | ((shuffle_result >> 15) & FLIP_MASK_R[2]) |
+ ((shuffle_result << 21) & FLIP_MASK_L[3]) | ((shuffle_result >> 21) & FLIP_MASK_R[3]);
+ end
+
+ if (shuffle_mode[3]) begin
+ shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[0]) |
+ (((shuffle_result << 8) & SHUFFLE_MASK_L[0]) |
+ ((shuffle_result >> 8) & SHUFFLE_MASK_R[0]));
+ end
+ if (shuffle_mode[2]) begin
+ shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[1]) |
+ (((shuffle_result << 4) & SHUFFLE_MASK_L[1]) |
+ ((shuffle_result >> 4) & SHUFFLE_MASK_R[1]));
+ end
+ if (shuffle_mode[1]) begin
+ shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[2]) |
+ (((shuffle_result << 2) & SHUFFLE_MASK_L[2]) |
+ ((shuffle_result >> 2) & SHUFFLE_MASK_R[2]));
+ end
+ if (shuffle_mode[0]) begin
+ shuffle_result = (shuffle_result & SHUFFLE_MASK_NOT[3]) |
+ (((shuffle_result << 1) & SHUFFLE_MASK_L[3]) |
+ ((shuffle_result >> 1) & SHUFFLE_MASK_R[3]));
+ end
+
+ if (shuffle_flip) begin
+ shuffle_result = (shuffle_result & 32'h8822_4411) |
+ ((shuffle_result << 6) & FLIP_MASK_L[0]) | ((shuffle_result >> 6) & FLIP_MASK_R[0]) |
+ ((shuffle_result << 9) & FLIP_MASK_L[1]) | ((shuffle_result >> 9) & FLIP_MASK_R[1]) |
+ ((shuffle_result << 15) & FLIP_MASK_L[2]) | ((shuffle_result >> 15) & FLIP_MASK_R[2]) |
+ ((shuffle_result << 21) & FLIP_MASK_L[3]) | ((shuffle_result >> 21) & FLIP_MASK_R[3]);
+ end
+ end
+
+ ///////////////
+ // Butterfly //
+ ///////////////
+
+ // The butterfly / inverse butterfly network executes the bext/bdep (zbe) instructions.
+ // For bdep, the control bitmask of a local left region is generated by the inverse of an
+ // n-bit left rotate and complement upon wrap (LROTC) operation by the number of ones in the
+ // deposit bitmask to the right of the segment, where n denotes the width of that segment.
+ // The bitmask for the corresponding local right region is equal to that of the local left
+ // region. Bext uses the analogous inverse process.
+ // Consider the 8-bit example below. For details, see Hilewitz et al. "Fast Bit Gather,
+ // Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors", (2008).
+ //
+ // The bext/bdep instructions are completed in 2 cycles. In the first cycle, the control
+ // bitmask is prepared by executing the parallel prefix bit count. In the second cycle,
+ // the bit swapping is executed according to the control masks.
+
+ // 8-bit example: (Hilewitz et al.)
+ // Consider the instruction bdep operand_a_i deposit_mask
+ // Let operand_a_i = 8'babcd_efgh
+ // deposit_mask = 8'b1010_1101
+ //
+ // control bitmask for stage 1:
+ // - number of ones in the right half of the deposit bitmask: 3
+ // - width of the segment: 4
+ // - control bitmask = ~LROTC(4'b0, 3)[3:0] = 4'b1000
+ //
+ // control bitmask: c3 c2 c1 c0 c3 c2 c1 c0
+ // 1 0 0 0 1 0 0 0
+ // <- L -----> <- R ----->
+ // operand_a_i a b c d e f g h
+ // :\ | | | /: | | |
+ // : +|---|--|-+ : | | |
+ // :/ | | | \: | | |
+ // stage 1 e b c d a f g h
+ // <L-> <R-> <L-> <R->
+ // control bitmask: c3 c2 c3 c2 c1 c0 c1 c0
+ // 1 1 1 1 1 0 1 0
+ // :\ :\ /: /: :\ | /: |
+ // : +:-+-:+ : : +|-+ : |
+ // :/ :/ \: \: :/ | \: |
+ // stage 2 c d e b g f a h
+ // L R L R L R L R
+ // control bitmask: c3 c3 c2 c2 c1 c1 c0 c0
+ // 1 1 0 0 1 1 0 0
+ // :\/: | | :\/: | |
+ // : : | | : : | |
+ // :/\: | | :/\: | |
+ // stage 3 d c e b f g a h
+ // & deposit bitmask: 1 0 1 0 1 1 0 1
+ // result: d 0 e 0 f g 0 h
+
+ logic [ 5:0] bitcnt_partial_q [32];
+
+ // first cycle
+ // Store partial bitcnts
+ for (genvar i=0; i<32; i++) begin : gen_bitcnt_reg_in_lsb
+ assign bitcnt_partial_lsb_d[i] = bitcnt_partial[i][0];
+ end
+
+ for (genvar i=0; i<16; i++) begin : gen_bitcnt_reg_in_b1
+ assign bitcnt_partial_msb_d[i] = bitcnt_partial[2*i+1][1];
+ end
+
+ for (genvar i=0; i<8; i++) begin : gen_bitcnt_reg_in_b2
+ assign bitcnt_partial_msb_d[16+i] = bitcnt_partial[4*i+3][2];
+ end
+
+ for (genvar i=0; i<4; i++) begin : gen_bitcnt_reg_in_b3
+ assign bitcnt_partial_msb_d[24+i] = bitcnt_partial[8*i+7][3];
+ end
+
+ for (genvar i=0; i<2; i++) begin : gen_bitcnt_reg_in_b4
+ assign bitcnt_partial_msb_d[28+i] = bitcnt_partial[16*i+15][4];
+ end
+
+ assign bitcnt_partial_msb_d[30] = bitcnt_partial[31][5];
+ assign bitcnt_partial_msb_d[31] = 1'b0; // unused
+
+ // Second cycle
+ // Load partial bitcnts
+ always_comb begin
+ bitcnt_partial_q = '{default: '0};
+
+ for (int unsigned i=0; i<32; i++) begin : gen_bitcnt_reg_out_lsb
+ bitcnt_partial_q[i][0] = imd_val_q_i[0][i];
+ end
+
+ for (int unsigned i=0; i<16; i++) begin : gen_bitcnt_reg_out_b1
+ bitcnt_partial_q[2*i+1][1] = imd_val_q_i[1][i];
+ end
+
+ for (int unsigned i=0; i<8; i++) begin : gen_bitcnt_reg_out_b2
+ bitcnt_partial_q[4*i+3][2] = imd_val_q_i[1][16+i];
+ end
+
+ for (int unsigned i=0; i<4; i++) begin : gen_bitcnt_reg_out_b3
+ bitcnt_partial_q[8*i+7][3] = imd_val_q_i[1][24+i];
+ end
+
+ for (int unsigned i=0; i<2; i++) begin : gen_bitcnt_reg_out_b4
+ bitcnt_partial_q[16*i+15][4] = imd_val_q_i[1][28+i];
+ end
+
+ bitcnt_partial_q[31][5] = imd_val_q_i[1][30];
+ end
+
+ logic [31:0] butterfly_mask_l[5];
+ logic [31:0] butterfly_mask_r[5];
+ logic [31:0] butterfly_mask_not[5];
+ logic [31:0] lrotc_stage [5]; // left rotate and complement upon wrap
+
+ // number of bits in local r = 32 / 2**(stage + 1) = 16/2**stage
+ `define _N(stg) (16 >> stg)
+
+ // bext / bdep control bit generation
+ for (genvar stg=0; stg<5; stg++) begin : gen_butterfly_ctrl_stage
+ // number of segs: 2** stg
+ for (genvar seg=0; seg<2**stg; seg++) begin : gen_butterfly_ctrl
+
+ assign lrotc_stage[stg][2*`_N(stg)*(seg+1)-1 : 2*`_N(stg)*seg] =
+ {{`_N(stg){1'b0}},{`_N(stg){1'b1}}} <<
+ bitcnt_partial_q[`_N(stg)*(2*seg+1)-1][$clog2(`_N(stg)):0];
+
+ assign butterfly_mask_l[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)]
+ = ~lrotc_stage[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)];
+
+ assign butterfly_mask_r[stg][`_N(stg)*(2*seg+1)-1 : `_N(stg)*(2*seg)]
+ = ~lrotc_stage[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)];
+
+ assign butterfly_mask_l[stg][`_N(stg)*(2*seg+1)-1 : `_N(stg)*(2*seg)] = '0;
+ assign butterfly_mask_r[stg][`_N(stg)*(2*seg+2)-1 : `_N(stg)*(2*seg+1)] = '0;
+ end
+ end
+ `undef _N
+
+ for (genvar stg=0; stg<5; stg++) begin : gen_butterfly_not
+ assign butterfly_mask_not[stg] =
+ ~(butterfly_mask_l[stg] | butterfly_mask_r[stg]);
+ end
+
+ always_comb begin
+ butterfly_result = operand_a_i;
+
+ butterfly_result = butterfly_result & butterfly_mask_not[0] |
+ ((butterfly_result & butterfly_mask_l[0]) >> 16)|
+ ((butterfly_result & butterfly_mask_r[0]) << 16);
+
+ butterfly_result = butterfly_result & butterfly_mask_not[1] |
+ ((butterfly_result & butterfly_mask_l[1]) >> 8)|
+ ((butterfly_result & butterfly_mask_r[1]) << 8);
+
+ butterfly_result = butterfly_result & butterfly_mask_not[2] |
+ ((butterfly_result & butterfly_mask_l[2]) >> 4)|
+ ((butterfly_result & butterfly_mask_r[2]) << 4);
+
+ butterfly_result = butterfly_result & butterfly_mask_not[3] |
+ ((butterfly_result & butterfly_mask_l[3]) >> 2)|
+ ((butterfly_result & butterfly_mask_r[3]) << 2);
+
+ butterfly_result = butterfly_result & butterfly_mask_not[4] |
+ ((butterfly_result & butterfly_mask_l[4]) >> 1)|
+ ((butterfly_result & butterfly_mask_r[4]) << 1);
+
+ butterfly_result = butterfly_result & operand_b_i;
+ end
+
+ always_comb begin
+ invbutterfly_result = operand_a_i & operand_b_i;
+
+ invbutterfly_result = invbutterfly_result & butterfly_mask_not[4] |
+ ((invbutterfly_result & butterfly_mask_l[4]) >> 1)|
+ ((invbutterfly_result & butterfly_mask_r[4]) << 1);
+
+ invbutterfly_result = invbutterfly_result & butterfly_mask_not[3] |
+ ((invbutterfly_result & butterfly_mask_l[3]) >> 2)|
+ ((invbutterfly_result & butterfly_mask_r[3]) << 2);
+
+ invbutterfly_result = invbutterfly_result & butterfly_mask_not[2] |
+ ((invbutterfly_result & butterfly_mask_l[2]) >> 4)|
+ ((invbutterfly_result & butterfly_mask_r[2]) << 4);
+
+ invbutterfly_result = invbutterfly_result & butterfly_mask_not[1] |
+ ((invbutterfly_result & butterfly_mask_l[1]) >> 8)|
+ ((invbutterfly_result & butterfly_mask_r[1]) << 8);
+
+ invbutterfly_result = invbutterfly_result & butterfly_mask_not[0] |
+ ((invbutterfly_result & butterfly_mask_l[0]) >> 16)|
+ ((invbutterfly_result & butterfly_mask_r[0]) << 16);
+ end
+
+ ///////////////////////////////////////////////////
+ // Carry-less Multiply + Cyclic Redundancy Check //
+ ///////////////////////////////////////////////////
+
+ // Carry-less multiplication is ordinary binary multiplication in which the partial
+ // products are combined with bit-wise xor instead of addition.
+ //
+ // Example: 1101 X 1011 = 1111111:
+ //
+ //     1101 X 1011
+ //   --------------
+ //             1101
+ //   xor      1101
+ //   --------------
+ //            10111
+ //   xor     0000
+ //   --------------
+ //           010111
+ //   xor    1101
+ //   --------------
+ //          1111111
+ //
+ // Architectural details:
+ // A 32 x 32-bit array
+ // [ operand_b[i] ? (operand_a << i) : '0 for i in 0 ... 31 ]
+ // is generated. The entries of the array are pairwise 'xor-ed'
+ // together in a 5-stage binary tree.
+ //
+ //
+ // Cyclic Redundancy Check:
+ //
+ // CRC-32 (CRC-32/ISO-HDLC) and CRC-32C (CRC-32/ISCSI) are directly implemented. For
+ // documentation of the crc configuration (crc-polynomials, initialization, reflection, etc.)
+ // see http://reveng.sourceforge.net/crc-catalogue/all.htm
+ // A useful guide to crc arithmetic and algorithms is given here:
+ // http://www.piclist.com/techref/method/math/crcguide.html.
+ //
+ // The CRC operation solves the following equation using binary polynomial arithmetic:
+ //
+ // rev(rd)(x) = rev(rs1)(x) * x**n mod {1, P}(x)
+ //
+ // where P denotes the lower 32 bits of the corresponding CRC polynomial, rev(a) the bit
+ // reversal of a, and n = 8, 16, or 32 for the .b, .h, and .w variants. {a, b} denotes bit
+ // concatenation.
+ //
+ // Using Barrett reduction, one can show that
+ //
+ // M(x) * x**n mod P(x) = R(x) =
+ // ((M(x) * x**n) & {deg(P(x)){1'b1}}) ^ ((M(x) * x**-(deg(P(x)) - n)) cx mu(x) cx P(x)),
+ //
+ // where mu(x) = polydiv(x**64, {1,P}) & 0xffffffff. Here, 'cx' refers to carry-less
+ // multiplication. Substituting rev(rd)(x) for R(x) and rev(rs1)(x) for M(x) and solving for
+ // rd(x) with P(x) a crc32 polynomial (deg(P(x)) = 32), we get
+ //
+ // rd = rev( (rev(rs1) << n) ^ ((rev(rs1) >> (32-n)) cx mu cx P) )
+ // = (rs1 >> n) ^ rev(rev( (rs1 << (32-n)) cx rev(mu)) cx P)
+ // ^-- cycle 0--------------------^
+ // ^- cycle 1 -------------------------------------------^
+ //
+ // In the last step we used the fact that carry-less multiplication is bit-order agnostic:
+ // rev(a cx b) = rev(a) cx rev(b).
+
+ logic clmul_rmode;
+ logic clmul_hmode;
+ logic [31:0] clmul_op_a;
+ logic [31:0] clmul_op_b;
+ logic [31:0] operand_b_rev;
+ logic [31:0] clmul_and_stage[32];
+ logic [31:0] clmul_xor_stage1[16];
+ logic [31:0] clmul_xor_stage2[8];
+ logic [31:0] clmul_xor_stage3[4];
+ logic [31:0] clmul_xor_stage4[2];
+
+ logic [31:0] clmul_result_raw;
+
+ for (genvar i=0; i<32; i++) begin: gen_rev_operand_b
+ assign operand_b_rev[i] = operand_b_i[31-i];
+ end
+
+ assign clmul_rmode = operator_i == ALU_CLMULR;
+ assign clmul_hmode = operator_i == ALU_CLMULH;
+
+ // CRC
+ localparam logic [31:0] CRC32_POLYNOMIAL = 32'h04c1_1db7;
+ localparam logic [31:0] CRC32_MU_REV = 32'hf701_1641;
+
+ localparam logic [31:0] CRC32C_POLYNOMIAL = 32'h1edc_6f41;
+ localparam logic [31:0] CRC32C_MU_REV = 32'hdea7_13f1;
+
+ logic crc_op;
+
+ logic crc_cpoly;
+
+ logic [31:0] crc_operand;
+ logic [31:0] crc_poly;
+ logic [31:0] crc_mu_rev;
+
+ assign crc_op = (operator_i == ALU_CRC32C_W) | (operator_i == ALU_CRC32_W) |
+ (operator_i == ALU_CRC32C_H) | (operator_i == ALU_CRC32_H) |
+ (operator_i == ALU_CRC32C_B) | (operator_i == ALU_CRC32_B);
+
+ assign crc_cpoly = (operator_i == ALU_CRC32C_W) |
+ (operator_i == ALU_CRC32C_H) |
+ (operator_i == ALU_CRC32C_B);
+
+ assign crc_hmode = (operator_i == ALU_CRC32_H) | (operator_i == ALU_CRC32C_H);
+ assign crc_bmode = (operator_i == ALU_CRC32_B) | (operator_i == ALU_CRC32C_B);
+
+ assign crc_poly = crc_cpoly ? CRC32C_POLYNOMIAL : CRC32_POLYNOMIAL;
+ assign crc_mu_rev = crc_cpoly ? CRC32C_MU_REV : CRC32_MU_REV;
+
+ always_comb begin
+ unique case(1'b1)
+ crc_bmode: crc_operand = {operand_a_i[7:0], 24'h0};
+ crc_hmode: crc_operand = {operand_a_i[15:0], 16'h0};
+ default: crc_operand = operand_a_i;
+ endcase
+ end
+
+ // Select clmul input
+ always_comb begin
+ if (crc_op) begin
+ clmul_op_a = instr_first_cycle_i ? crc_operand : imd_val_q_i[0];
+ clmul_op_b = instr_first_cycle_i ? crc_mu_rev : crc_poly;
+ end else begin
+ clmul_op_a = clmul_rmode | clmul_hmode ? operand_a_rev : operand_a_i;
+ clmul_op_b = clmul_rmode | clmul_hmode ? operand_b_rev : operand_b_i;
+ end
+ end
+
+ for (genvar i=0; i<32; i++) begin : gen_clmul_and_op
+ assign clmul_and_stage[i] = clmul_op_b[i] ? clmul_op_a << i : '0;
+ end
+
+ for (genvar i=0; i<16; i++) begin : gen_clmul_xor_op_l1
+ assign clmul_xor_stage1[i] = clmul_and_stage[2*i] ^ clmul_and_stage[2*i+1];
+ end
+
+ for (genvar i=0; i<8; i++) begin : gen_clmul_xor_op_l2
+ assign clmul_xor_stage2[i] = clmul_xor_stage1[2*i] ^ clmul_xor_stage1[2*i+1];
+ end
+
+ for (genvar i=0; i<4; i++) begin : gen_clmul_xor_op_l3
+ assign clmul_xor_stage3[i] = clmul_xor_stage2[2*i] ^ clmul_xor_stage2[2*i+1];
+ end
+
+ for (genvar i=0; i<2; i++) begin : gen_clmul_xor_op_l4
+ assign clmul_xor_stage4[i] = clmul_xor_stage3[2*i] ^ clmul_xor_stage3[2*i+1];
+ end
+
+ assign clmul_result_raw = clmul_xor_stage4[0] ^ clmul_xor_stage4[1];
+
+ for (genvar i=0; i<32; i++) begin : gen_rev_clmul_result
+ assign clmul_result_rev[i] = clmul_result_raw[31-i];
+ end
+
+ // clmulr_result = rev(clmul(rev(a), rev(b)))
+ // clmulh_result = clmulr_result >> 1
+ always_comb begin
+ case(1'b1)
+ clmul_rmode: clmul_result = clmul_result_rev;
+ clmul_hmode: clmul_result = {1'b0, clmul_result_rev[31:1]};
+ default: clmul_result = clmul_result_raw;
+ endcase
+ end
+ end else begin : gen_alu_rvb_notfull
+ assign shuffle_result = '0;
+ assign butterfly_result = '0;
+ assign invbutterfly_result = '0;
+ assign clmul_result = '0;
+ // support signals
+ assign bitcnt_partial_lsb_d = '0;
+ assign bitcnt_partial_msb_d = '0;
+ assign clmul_result_rev = '0;
+ assign crc_bmode = '0;
+ assign crc_hmode = '0;
+ end
+
+ //////////////////////////////////////
+ // Multicycle Bitmanip Instructions //
+ //////////////////////////////////////
+ // Ternary instructions + Shift Rotations + Bit extract/deposit + CRC
+ // For ternary instructions (zbt), operand_a_i is tied to rs1 in the first cycle and rs3 in the
+ // second cycle. operand_b_i is always tied to rs2.
+
+ always_comb begin
+ unique case (operator_i)
+ ALU_CMOV: begin
+ multicycle_result = (operand_b_i == 32'h0) ? operand_a_i : imd_val_q_i[0];
+ imd_val_d_o = '{operand_a_i, 32'h0};
+ if (instr_first_cycle_i) begin
+ imd_val_we_o = 2'b01;
+ end else begin
+ imd_val_we_o = 2'b00;
+ end
+ end
+
+ ALU_CMIX: begin
+ multicycle_result = imd_val_q_i[0] | bwlogic_and_result;
+ imd_val_d_o = '{bwlogic_and_result, 32'h0};
+ if (instr_first_cycle_i) begin
+ imd_val_we_o = 2'b01;
+ end else begin
+ imd_val_we_o = 2'b00;
+ end
+ end
+
+ ALU_FSR, ALU_FSL,
+ ALU_ROL, ALU_ROR: begin
+ if (shift_amt[4:0] == 5'h0) begin
+ multicycle_result = shift_amt[5] ? operand_a_i : imd_val_q_i[0];
+ end else begin
+ multicycle_result = imd_val_q_i[0] | shift_result;
+ end
+ imd_val_d_o = '{shift_result, 32'h0};
+ if (instr_first_cycle_i) begin
+ imd_val_we_o = 2'b01;
+ end else begin
+ imd_val_we_o = 2'b00;
+ end
+ end
+
+ ALU_CRC32_W, ALU_CRC32C_W,
+ ALU_CRC32_H, ALU_CRC32C_H,
+ ALU_CRC32_B, ALU_CRC32C_B: begin
+ if (RV32B == RV32BFull) begin
+ unique case(1'b1)
+ crc_bmode: multicycle_result = clmul_result_rev ^ (operand_a_i >> 8);
+ crc_hmode: multicycle_result = clmul_result_rev ^ (operand_a_i >> 16);
+ default: multicycle_result = clmul_result_rev;
+ endcase
+ imd_val_d_o = '{clmul_result_rev, 32'h0};
+ if (instr_first_cycle_i) begin
+ imd_val_we_o = 2'b01;
+ end else begin
+ imd_val_we_o = 2'b00;
+ end
+ end else begin
+ imd_val_d_o = '{operand_a_i, 32'h0};
+ imd_val_we_o = 2'b00;
+ multicycle_result = '0;
+ end
+ end
+
+ ALU_BEXT, ALU_BDEP: begin
+ if (RV32B == RV32BFull) begin
+ multicycle_result = (operator_i == ALU_BDEP) ? butterfly_result : invbutterfly_result;
+ imd_val_d_o = '{bitcnt_partial_lsb_d, bitcnt_partial_msb_d};
+ if (instr_first_cycle_i) begin
+ imd_val_we_o = 2'b11;
+ end else begin
+ imd_val_we_o = 2'b00;
+ end
+ end else begin
+ imd_val_d_o = '{operand_a_i, 32'h0};
+ imd_val_we_o = 2'b00;
+ multicycle_result = '0;
+ end
+ end
+
+ default: begin
+ imd_val_d_o = '{operand_a_i, 32'h0};
+ imd_val_we_o = 2'b00;
+ multicycle_result = '0;
+ end
+ endcase
+ end
+
+
end else begin : g_no_alu_rvb
// RV32B result signals
- assign minmax_result = '0;
assign bitcnt_result = '0;
+ assign minmax_result = '0;
assign pack_result = '0;
assign sext_result = '0;
- assign multicycle_result = '0;
assign singlebit_result = '0;
+ assign rev_result = '0;
assign shuffle_result = '0;
assign butterfly_result = '0;
assign invbutterfly_result = '0;
assign clmul_result = '0;
+ assign multicycle_result = '0;
// RV32B support signals
- assign imd_val_d_o = '0;
- assign imd_val_we_o = '0;
+ assign imd_val_d_o = '{default: '0};
+ assign imd_val_we_o = '{default: '0};
end
////////////////
@@ -1130,18 +1223,16 @@
// Cyclic Redundancy Checks (RV32B)
ALU_CRC32_W, ALU_CRC32C_W,
ALU_CRC32_H, ALU_CRC32C_H,
- ALU_CRC32_B, ALU_CRC32C_B: result_o = multicycle_result;
+ ALU_CRC32_B, ALU_CRC32C_B,
+ // Bit Extract / Deposit (RV32B)
+ ALU_BEXT, ALU_BDEP: result_o = multicycle_result;
// Single-Bit Bitmanip Operations (RV32B)
ALU_SBSET, ALU_SBCLR,
ALU_SBINV, ALU_SBEXT: result_o = singlebit_result;
- // Bit Extract / Deposit (RV32B)
- ALU_BDEP: result_o = butterfly_result;
- ALU_BEXT: result_o = invbutterfly_result;
-
// General Reverse / Or-combine (RV32B)
- ALU_GREV, ALU_GORC: result_o = butterfly_result;
+ ALU_GREV, ALU_GORC: result_o = rev_result;
// Bit Field Place (RV32B)
ALU_BFP: result_o = bfp_result;
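
Note (not part of the upstream patch): the carry-less multiply and bit extract/deposit comments in the ibex_alu.sv hunks can be cross-checked against a small software model. The following is a minimal Python sketch under stated assumptions: all function names are hypothetical, the loops follow the usual bit-serial definitions of these operations rather than the RTL structure (butterfly network, XOR tree), and nothing here is taken from the Ibex verification environment.

```python
MASK32 = 0xFFFFFFFF


def rev32(x):
    """Reverse the bit order of a 32-bit value."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def clmul_full(a, b):
    """Full carry-less product of two 32-bit values (at most 63 bits wide)."""
    acc = 0
    for i in range(32):
        if (b >> i) & 1:
            acc ^= (a & MASK32) << i  # xor replaces the usual addition
    return acc


def clmul(a, b):
    """Lower 32 bits of the carry-less product."""
    return clmul_full(a, b) & MASK32


def clmulr(a, b):
    """Bits [62:31] of the carry-less product."""
    return (clmul_full(a, b) >> 31) & MASK32


def clmulh(a, b):
    """Bits [63:32] of the carry-less product (bit 63 is always zero)."""
    return (clmul_full(a, b) >> 32) & MASK32


def bext(rs1, rs2):
    """Gather the bits of rs1 selected by mask rs2 into the low bits."""
    r, j = 0, 0
    for i in range(32):
        if (rs2 >> i) & 1:
            if (rs1 >> i) & 1:
                r |= 1 << j
            j += 1
    return r


def bdep(rs1, rs2):
    """Scatter the low bits of rs1 to the positions set in mask rs2."""
    r, j = 0, 0
    for i in range(32):
        if (rs2 >> i) & 1:
            if (rs1 >> j) & 1:
                r |= 1 << i
            j += 1
    return r
```

With this model, `clmul(0b1101, 0b1011)` reproduces the worked example from the comment, and `clmulr(a, b) == rev32(clmul(rev32(a), rev32(b)))` mirrors the identity the ALU relies on to reuse one multiplier array for all three variants. For the 8-bit bdep example, depositing with mask `0b1010_1101` places the five low source bits at mask positions 0, 2, 3, 5 and 7.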
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_core.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_core.sv
index 6fd1b40..6146993 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_core.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_core.sv
@@ -9,27 +9,31 @@
`include "prim_assert.sv"
+`ifndef RV32B
+ `define RV32B ibex_pkg::RV32BNone
+`endif
+
/**
* Top level module of the ibex RISC-V core
*/
module ibex_core #(
- parameter bit PMPEnable = 1'b0,
- parameter int unsigned PMPGranularity = 0,
- parameter int unsigned PMPNumRegions = 4,
- parameter int unsigned MHPMCounterNum = 0,
- parameter int unsigned MHPMCounterWidth = 40,
- parameter bit RV32E = 1'b0,
- parameter bit RV32M = 1'b1,
- parameter bit RV32B = 1'b0,
- parameter bit BranchTargetALU = 1'b0,
- parameter bit WritebackStage = 1'b0,
- parameter MultiplierImplementation = "fast",
- parameter bit ICache = 1'b0,
- parameter bit ICacheECC = 1'b0,
- parameter bit DbgTriggerEn = 1'b0,
- parameter bit SecureIbex = 1'b0,
- parameter int unsigned DmHaltAddr = 32'h1A110800,
- parameter int unsigned DmExceptionAddr = 32'h1A110808
+ parameter bit PMPEnable = 1'b0,
+ parameter int unsigned PMPGranularity = 0,
+ parameter int unsigned PMPNumRegions = 4,
+ parameter int unsigned MHPMCounterNum = 0,
+ parameter int unsigned MHPMCounterWidth = 40,
+ parameter bit RV32E = 1'b0,
+ parameter bit RV32M = 1'b1,
+ parameter ibex_pkg::rv32b_e RV32B = `RV32B,
+ parameter bit BranchTargetALU = 1'b0,
+ parameter bit WritebackStage = 1'b0,
+ parameter MultiplierImplementation = "fast",
+ parameter bit ICache = 1'b0,
+ parameter bit ICacheECC = 1'b0,
+ parameter bit DbgTriggerEn = 1'b0,
+ parameter bit SecureIbex = 1'b0,
+ parameter int unsigned DmHaltAddr = 32'h1A110800,
+ parameter int unsigned DmExceptionAddr = 32'h1A110808
) (
// Clock and Reset
input logic clk_i,
@@ -110,7 +114,7 @@
localparam bit DummyInstructions = SecureIbex;
// Speculative branch option, trades-off performance against timing.
// Setting this to 1 eases branch target critical paths significantly but reduces performance
- // by ~3% (based on Coremark/MHz score).
+ // by ~3% (based on CoreMark/MHz score).
// Set by default in the max PMP config which has the tightest budget for branch target timing.
localparam bit SpecBranch = PMPEnable & (PMPNumRegions == 16);
@@ -119,8 +123,8 @@
logic instr_valid_id;
logic instr_new_id;
logic [31:0] instr_rdata_id; // Instruction sampled inside IF stage
- logic [31:0] instr_rdata_alu_id; // Instruction sampled inside IF stage (replicated to ease
- // fan-out)
+ logic [31:0] instr_rdata_alu_id; // Instruction sampled inside IF stage (replicated to
+ // ease fan-out)
logic [15:0] instr_rdata_c_id; // Compressed instruction sampled inside IF stage
logic instr_is_compressed_id;
logic instr_fetch_err; // Bus error on instr fetch
@@ -129,9 +133,9 @@
logic [31:0] pc_if; // Program counter in IF stage
logic [31:0] pc_id; // Program counter in ID stage
logic [31:0] pc_wb; // Program counter in WB stage
- logic [33:0] imd_val_d_ex; // Intermediate register for multicycle Ops
- logic [33:0] imd_val_q_ex; // Intermediate register for multicycle Ops
- logic imd_val_we_ex;
+ logic [33:0] imd_val_d_ex[2]; // Intermediate register for multicycle Ops
+ logic [33:0] imd_val_q_ex[2]; // Intermediate register for multicycle Ops
+ logic [1:0] imd_val_we_ex;
logic data_ind_timing;
logic dummy_instr_en;
@@ -775,8 +779,10 @@
logic outstanding_load_id;
logic outstanding_store_id;
- assign outstanding_load_id = id_stage_i.instr_executing & id_stage_i.lsu_req_dec & ~id_stage_i.lsu_we;
- assign outstanding_store_id = id_stage_i.instr_executing & id_stage_i.lsu_req_dec & id_stage_i.lsu_we;
+ assign outstanding_load_id = id_stage_i.instr_executing & id_stage_i.lsu_req_dec &
+ ~id_stage_i.lsu_we;
+ assign outstanding_store_id = id_stage_i.instr_executing & id_stage_i.lsu_req_dec &
+ id_stage_i.lsu_we;
if (WritebackStage) begin : gen_wb_stage
// When the writeback stage is present a load/store could be in ID or WB. A Load/store in ID can
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_core_tracing.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_core_tracing.sv
index 1c019d5..e290c35 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_core_tracing.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_core_tracing.sv
@@ -2,28 +2,32 @@
// Licensed under the Apache License, Version 2.0, see LICENSE for details.
// SPDX-License-Identifier: Apache-2.0
+`ifndef RV32B
+ `define RV32B ibex_pkg::RV32BNone
+`endif
/**
* Top level module of the ibex RISC-V core with tracing enabled
*/
+
module ibex_core_tracing #(
- parameter bit PMPEnable = 1'b0,
- parameter int unsigned PMPGranularity = 0,
- parameter int unsigned PMPNumRegions = 4,
- parameter int unsigned MHPMCounterNum = 0,
- parameter int unsigned MHPMCounterWidth = 40,
- parameter bit RV32E = 1'b0,
- parameter bit RV32M = 1'b1,
- parameter bit RV32B = 1'b0,
- parameter bit BranchTargetALU = 1'b0,
- parameter bit WritebackStage = 1'b0,
- parameter MultiplierImplementation = "fast",
- parameter bit ICache = 1'b0,
- parameter bit ICacheECC = 1'b0,
- parameter bit DbgTriggerEn = 1'b0,
- parameter bit SecureIbex = 1'b0,
- parameter int unsigned DmHaltAddr = 32'h1A110800,
- parameter int unsigned DmExceptionAddr = 32'h1A110808
+ parameter bit PMPEnable = 1'b0,
+ parameter int unsigned PMPGranularity = 0,
+ parameter int unsigned PMPNumRegions = 4,
+ parameter int unsigned MHPMCounterNum = 0,
+ parameter int unsigned MHPMCounterWidth = 40,
+ parameter bit RV32E = 1'b0,
+ parameter bit RV32M = 1'b1,
+ parameter ibex_pkg::rv32b_e RV32B = `RV32B,
+ parameter bit BranchTargetALU = 1'b0,
+ parameter bit WritebackStage = 1'b0,
+ parameter MultiplierImplementation = "fast",
+ parameter bit ICache = 1'b0,
+ parameter bit ICacheECC = 1'b0,
+ parameter bit DbgTriggerEn = 1'b0,
+ parameter bit SecureIbex = 1'b0,
+ parameter int unsigned DmHaltAddr = 32'h1A110800,
+ parameter int unsigned DmExceptionAddr = 32'h1A110808
) (
// Clock and Reset
input logic clk_i,
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_counter.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_counter.sv
index 465b931..0091d5a 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_counter.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_counter.sv
@@ -56,7 +56,7 @@
`endif
// Counter flop
- always @(`COUNTER_FLOP_RST) begin
+ always_ff @(`COUNTER_FLOP_RST) begin
if (!rst_ni) begin
counter_q <= '0;
end else begin
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_cs_registers.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_cs_registers.sv
index 0b6f934..6e5eaf4 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_cs_registers.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_cs_registers.sv
@@ -862,7 +862,8 @@
// update enable signals
always_comb begin : mcountinhibit_update
if (mcountinhibit_we == 1'b1) begin
- mcountinhibit_d = {csr_wdata_int[MHPMCounterNum+2:2], 1'b0, csr_wdata_int[0]}; // bit 1 must always be 0
+ // bit 1 must always be 0
+ mcountinhibit_d = {csr_wdata_int[MHPMCounterNum+2:2], 1'b0, csr_wdata_int[0]};
end else begin
mcountinhibit_d = mcountinhibit_q;
end
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_decoder.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_decoder.sv
index b895275..3b2807e 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_decoder.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_decoder.sv
@@ -14,10 +14,10 @@
`include "prim_assert.sv"
module ibex_decoder #(
- parameter bit RV32E = 0,
- parameter bit RV32M = 1,
- parameter bit RV32B = 0,
- parameter bit BranchTargetALU = 0
+ parameter bit RV32E = 0,
+ parameter bit RV32M = 1,
+ parameter bit BranchTargetALU = 0,
+ parameter ibex_pkg::rv32b_e RV32B = ibex_pkg::RV32BNone
) (
input logic clk_i,
input logic rst_ni,
@@ -112,7 +112,8 @@
logic [4:0] instr_rs3;
logic [4:0] instr_rd;
- logic use_rs3;
+ logic use_rs3_d;
+ logic use_rs3_q;
csr_op_e csr_op;
@@ -139,11 +140,20 @@
// immediate for CSR manipulation (zero extended)
assign zimm_rs1_type_o = { 27'b0, instr_rs1 }; // rs1
+ // the use of rs3 is known one cycle ahead.
+ always_ff @(posedge clk_i or negedge rst_ni) begin
+ if (!rst_ni) begin
+ use_rs3_q <= 1'b0;
+ end else begin
+ use_rs3_q <= use_rs3_d;
+ end
+ end
+
// source registers
assign instr_rs1 = instr[19:15];
assign instr_rs2 = instr[24:20];
assign instr_rs3 = instr[31:27];
- assign rf_raddr_a_o = use_rs3 ? instr_rs3 : instr_rs1; // rs3 / rs1
+ assign rf_raddr_a_o = (use_rs3_q & ~instr_first_cycle_i) ? instr_rs3 : instr_rs1; // rs3 / rs1
assign rf_raddr_b_o = instr_rs2; // rs2
// destination register
@@ -338,29 +348,29 @@
3'b001: begin
unique case (instr[31:27])
- 5'b0_0000: illegal_insn = 1'b0; // slli
- 5'b0_0100, // sloi
- 5'b0_1001, // sbclri
- 5'b0_0101, // sbseti
- 5'b0_1101: illegal_insn = RV32B ? 1'b0 : 1'b1; // sbinvi
+ 5'b0_0000: illegal_insn = 1'b0; // slli
+ 5'b0_0100, // sloi
+ 5'b0_1001, // sbclri
+ 5'b0_0101, // sbseti
+ 5'b0_1101: illegal_insn = (RV32B != RV32BNone) ? 1'b0 : 1'b1; // sbinvi
5'b0_0001: if (instr[26] == 1'b0) begin
- illegal_insn = RV32B ? 1'b0 : 1'b1; // shfl
+ illegal_insn = (RV32B == RV32BFull) ? 1'b0 : 1'b1; // shfl
end else begin
illegal_insn = 1'b1;
end
5'b0_1100: begin
unique case(instr[26:20])
- 7'b000_0000, // clz
- 7'b000_0001, // ctz
- 7'b000_0010, // pcnt
- 7'b000_0100, // sext.b
- 7'b000_0101, // sext.h
- 7'b001_0000, // crc32.b
- 7'b001_0001, // crc32.h
- 7'b001_0010, // crc32.w
- 7'b001_1000, // crc32c.b
- 7'b001_1001, // crc32c.h
- 7'b001_1010: illegal_insn = RV32B ? 1'b0 : 1'b1; // crc32c.w
+ 7'b000_0000, // clz
+ 7'b000_0001, // ctz
+ 7'b000_0010, // pcnt
+ 7'b000_0100, // sext.b
+ 7'b000_0101: illegal_insn = (RV32B != RV32BNone) ? 1'b0 : 1'b1; // sext.h
+ 7'b001_0000, // crc32.b
+ 7'b001_0001, // crc32.h
+ 7'b001_0010, // crc32.w
+ 7'b001_1000, // crc32c.b
+ 7'b001_1001, // crc32c.h
+ 7'b001_1010: illegal_insn = (RV32B == RV32BFull) ? 1'b0 : 1'b1; // crc32c.w
default: illegal_insn = 1'b1;
endcase
@@ -371,22 +381,41 @@
3'b101: begin
if (instr[26]) begin
- illegal_insn = RV32B ? 1'b0 : 1'b1; // fsri
+ illegal_insn = (RV32B != RV32BNone) ? 1'b0 : 1'b1; // fsri
end else begin
unique case (instr[31:27])
- 5'b0_0000, // srli
- 5'b0_1000: illegal_insn = 1'b0; // srai
+ 5'b0_0000, // srli
+ 5'b0_1000: illegal_insn = 1'b0; // srai
- 5'b0_0100, // sroi
- 5'b0_1100, // rori
- 5'b0_1001: illegal_insn = RV32B ? 1'b0 : 1'b1; // sbexti
+ 5'b0_0100, // sroi
+ 5'b0_1100, // rori
+ 5'b0_1001: illegal_insn = (RV32B != RV32BNone) ? 1'b0 : 1'b1; // sbexti
- 5'b0_1101, // grevi
- 5'b0_0101: illegal_insn = RV32B ? 1'b0 : 1'b1; // gorci
- 5'b0_0001: if (instr[26] == 1'b0) begin
- illegal_insn = RV32B ? 1'b0 : 1'b1; // unshfl
- end else begin
- illegal_insn = 1'b1;
+ 5'b0_1101: begin
+ if ((RV32B == RV32BFull)) begin
+ illegal_insn = 1'b0; // grevi
+ end else begin
+ unique case (instr[24:20])
+ 5'b11111, // rev
+ 5'b11000: illegal_insn = (RV32B == RV32BBalanced) ? 1'b0 : 1'b1; // rev8
+
+ default: illegal_insn = 1'b1;
+ endcase
+ end
+ end
+ 5'b0_0101: begin
+ if ((RV32B == RV32BFull)) begin
+ illegal_insn = 1'b0; // gorci
+ end else if (instr[24:20] == 5'b00111) begin
+ illegal_insn = (RV32B == RV32BBalanced) ? 1'b0 : 1'b1; // orc.b
+ end else begin
+ illegal_insn = 1'b1;
+ end
+ end
+ 5'b0_0001: begin
+ if (instr[26] == 1'b0) begin
+ illegal_insn = (RV32B == RV32BFull) ? 1'b0 : 1'b1; // unshfl
+ end else begin
+ illegal_insn = 1'b1;
+ end
end
default: illegal_insn = 1'b1;
@@ -403,7 +432,7 @@
rf_ren_b_o = 1'b1;
rf_we = 1'b1;
if ({instr[26], instr[13:12]} == {1'b1, 2'b01}) begin
- illegal_insn = RV32B ? 1'b0 : 1'b1; // cmix / cmov / fsl / fsr
+ illegal_insn = (RV32B != RV32BNone) ? 1'b0 : 1'b1; // cmix / cmov / fsl / fsr
end else begin
unique case ({instr[31:25], instr[14:12]})
// RV32I ALU operations
@@ -438,6 +467,8 @@
{7'b001_0100, 3'b001}, // sbset
{7'b011_0100, 3'b001}, // sbinv
{7'b010_0100, 3'b101}, // sbext
+ // RV32B zbf
+ {7'b010_0100, 3'b111}: illegal_insn = (RV32B != RV32BNone) ? 1'b0 : 1'b1; // bfp
// RV32B zbe
{7'b010_0100, 3'b110}, // bdep
{7'b000_0100, 3'b110}, // bext
@@ -446,12 +477,10 @@
{7'b001_0100, 3'b101}, // gorc
{7'b000_0100, 3'b001}, // shfl
{7'b000_0100, 3'b101}, // unshfl
- // RV32B zbf
- {7'b010_0100, 3'b111}, // bfp
// RV32B zbc
{7'b000_0101, 3'b001}, // clmul
{7'b000_0101, 3'b010}, // clmulr
- {7'b000_0101, 3'b011}: illegal_insn = RV32B ? 1'b0 : 1'b1; // clmulh
+ {7'b000_0101, 3'b011}: illegal_insn = (RV32B == RV32BFull) ? 1'b0 : 1'b1; // clmulh
// RV32M instructions
{7'b000_0001, 3'b000}: begin // mul
@@ -627,7 +656,7 @@
opcode_alu = opcode_e'(instr_alu[6:0]);
- use_rs3 = 1'b0;
+ use_rs3_d = 1'b0;
alu_multicycle_o = 1'b0;
mult_sel_o = 1'b0;
div_sel_o = 1'b0;
@@ -774,7 +803,7 @@
3'b111: alu_operator_o = ALU_AND; // And with Immediate
3'b001: begin
- if (RV32B) begin
+ if (RV32B != RV32BNone) begin
unique case (instr_alu[31:27])
5'b0_0000: alu_operator_o = ALU_SLL; // Shift Left Logical by Immediate
5'b0_0100: alu_operator_o = ALU_SLO; // Shift Left Ones by Immediate
@@ -785,34 +814,46 @@
5'b0_0001: if (instr_alu[26] == 0) alu_operator_o = ALU_SHFL;
5'b0_1100: begin
unique case (instr_alu[26:20])
- 7'b000_0000: alu_operator_o = ALU_CLZ; // clz
- 7'b000_0001: alu_operator_o = ALU_CTZ; // ctz
- 7'b000_0010: alu_operator_o = ALU_PCNT; // pcnt
- 7'b000_0100: alu_operator_o = ALU_SEXTB; // sext.b
- 7'b000_0101: alu_operator_o = ALU_SEXTH; // sext.h
+ 7'b000_0000: alu_operator_o = ALU_CLZ; // clz
+ 7'b000_0001: alu_operator_o = ALU_CTZ; // ctz
+ 7'b000_0010: alu_operator_o = ALU_PCNT; // pcnt
+ 7'b000_0100: alu_operator_o = ALU_SEXTB; // sext.b
+ 7'b000_0101: alu_operator_o = ALU_SEXTH; // sext.h
7'b001_0000: begin
- alu_operator_o = ALU_CRC32_B; // crc32.b
- alu_multicycle_o = 1'b1;
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_CRC32_B; // crc32.b
+ alu_multicycle_o = 1'b1;
+ end
end
7'b001_0001: begin
- alu_operator_o = ALU_CRC32_H; // crc32.h
- alu_multicycle_o = 1'b1;
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_CRC32_H; // crc32.h
+ alu_multicycle_o = 1'b1;
+ end
end
7'b001_0010: begin
- alu_operator_o = ALU_CRC32_W; // crc32.w
- alu_multicycle_o = 1'b1;
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_CRC32_W; // crc32.w
+ alu_multicycle_o = 1'b1;
+ end
end
7'b001_1000: begin
- alu_operator_o = ALU_CRC32C_B; // crc32c.b
- alu_multicycle_o = 1'b1;
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_CRC32C_B; // crc32c.b
+ alu_multicycle_o = 1'b1;
+ end
end
7'b001_1001: begin
- alu_operator_o = ALU_CRC32C_H; // crc32c.h
- alu_multicycle_o = 1'b1;
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_CRC32C_H; // crc32c.h
+ alu_multicycle_o = 1'b1;
+ end
end
7'b001_1010: begin
- alu_operator_o = ALU_CRC32C_W; // crc32c.w
- alu_multicycle_o = 1'b1;
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_CRC32C_W; // crc32c.w
+ alu_multicycle_o = 1'b1;
+ end
end
default: ;
endcase
@@ -821,19 +862,19 @@
default: ;
endcase
end else begin
- alu_operator_o = ALU_SLL; // Shift Left Logical by Immediate
+ alu_operator_o = ALU_SLL; // Shift Left Logical by Immediate
end
end
3'b101: begin
- if (RV32B) begin
+ if (RV32B != RV32BNone) begin
if (instr_alu[26] == 1'b1) begin
alu_operator_o = ALU_FSR;
alu_multicycle_o = 1'b1;
if (instr_first_cycle_i) begin
- use_rs3 = 1'b0;
+ use_rs3_d = 1'b1;
end else begin
- use_rs3 = 1'b1;
+ use_rs3_d = 1'b0;
end
end else begin
unique case (instr_alu[31:27])
@@ -842,22 +883,26 @@
5'b0_0100: alu_operator_o = ALU_SRO; // Shift Right Ones by Immediate
5'b0_1001: alu_operator_o = ALU_SBEXT; // Extract bit specified by immediate.
5'b0_1100: begin
- alu_operator_o = ALU_ROR; // Rotate Right by Immediate
+ alu_operator_o = ALU_ROR; // Rotate Right by Immediate
alu_multicycle_o = 1'b1;
end
- 5'b0_1101: alu_operator_o = ALU_GREV; // General Reverse with Imm Control Val
- 5'b0_0101: alu_operator_o = ALU_GORC; // General Or-combine with Imm Control Val
+ 5'b0_1101: alu_operator_o = ALU_GREV; // General Reverse with Imm Control Val
+ 5'b0_0101: alu_operator_o = ALU_GORC; // General Or-combine with Imm Control Val
// Unshuffle with Immediate Control Value
- 5'b0_0001: if (instr_alu[26] == 1'b0) alu_operator_o = ALU_UNSHFL;
+ 5'b0_0001: begin
+ if (RV32B == RV32BFull) begin
+ if (instr_alu[26] == 1'b0) alu_operator_o = ALU_UNSHFL;
+ end
+ end
default: ;
endcase
end
end else begin
if (instr_alu[31:27] == 5'b0_0000) begin
- alu_operator_o = ALU_SRL; // Shift Right Logical by Immediate
+ alu_operator_o = ALU_SRL; // Shift Right Logical by Immediate
end else if (instr_alu[31:27] == 5'b0_1000) begin
- alu_operator_o = ALU_SRA; // Shift Right Arithmetically by Immediate
+ alu_operator_o = ALU_SRA; // Shift Right Arithmetically by Immediate
end
end
end
@@ -871,42 +916,42 @@
alu_op_b_mux_sel_o = OP_B_REG_B;
if (instr_alu[26]) begin
- if (RV32B) begin
+ if (RV32B != RV32BNone) begin
unique case ({instr_alu[26:25], instr_alu[14:12]})
{2'b11, 3'b001}: begin
alu_operator_o = ALU_CMIX; // cmix
alu_multicycle_o = 1'b1;
if (instr_first_cycle_i) begin
- use_rs3 = 1'b0;
+ use_rs3_d = 1'b1;
end else begin
- use_rs3 = 1'b1;
+ use_rs3_d = 1'b0;
end
end
{2'b11, 3'b101}: begin
alu_operator_o = ALU_CMOV; // cmov
alu_multicycle_o = 1'b1;
if (instr_first_cycle_i) begin
- use_rs3 = 1'b0;
+ use_rs3_d = 1'b1;
end else begin
- use_rs3 = 1'b1;
+ use_rs3_d = 1'b0;
end
end
{2'b10, 3'b001}: begin
alu_operator_o = ALU_FSL; // fsl
alu_multicycle_o = 1'b1;
if (instr_first_cycle_i) begin
- use_rs3 = 1'b0;
+ use_rs3_d = 1'b1;
end else begin
- use_rs3 = 1'b1;
+ use_rs3_d = 1'b0;
end
end
{2'b10, 3'b101}: begin
alu_operator_o = ALU_FSR; // fsr
alu_multicycle_o = 1'b1;
if (instr_first_cycle_i) begin
- use_rs3 = 1'b0;
+ use_rs3_d = 1'b1;
end else begin
- use_rs3 = 1'b1;
+ use_rs3_d = 1'b0;
end
end
default: ;
@@ -927,56 +972,67 @@
{7'b010_0000, 3'b101}: alu_operator_o = ALU_SRA; // Shift Right Arithmetic
// RV32B ALU Operations
- {7'b001_0000, 3'b001}: if (RV32B) alu_operator_o = ALU_SLO; // slo
- {7'b001_0000, 3'b101}: if (RV32B) alu_operator_o = ALU_SRO; // sro
+ {7'b001_0000, 3'b001}: if (RV32B != RV32BNone) alu_operator_o = ALU_SLO; // slo
+ {7'b001_0000, 3'b101}: if (RV32B != RV32BNone) alu_operator_o = ALU_SRO; // sro
{7'b011_0000, 3'b001}: begin
- if (RV32B) begin
+ if (RV32B != RV32BNone) begin
alu_operator_o = ALU_ROL; // rol
alu_multicycle_o = 1'b1;
end
end
{7'b011_0000, 3'b101}: begin
- if (RV32B) begin
+ if (RV32B != RV32BNone) begin
alu_operator_o = ALU_ROR; // ror
alu_multicycle_o = 1'b1;
end
end
- {7'b000_0101, 3'b100}: if (RV32B) alu_operator_o = ALU_MIN; // min
- {7'b000_0101, 3'b101}: if (RV32B) alu_operator_o = ALU_MAX; // max
- {7'b000_0101, 3'b110}: if (RV32B) alu_operator_o = ALU_MINU; // minu
- {7'b000_0101, 3'b111}: if (RV32B) alu_operator_o = ALU_MAXU; // maxu
+ {7'b000_0101, 3'b100}: if (RV32B != RV32BNone) alu_operator_o = ALU_MIN; // min
+ {7'b000_0101, 3'b101}: if (RV32B != RV32BNone) alu_operator_o = ALU_MAX; // max
+ {7'b000_0101, 3'b110}: if (RV32B != RV32BNone) alu_operator_o = ALU_MINU; // minu
+ {7'b000_0101, 3'b111}: if (RV32B != RV32BNone) alu_operator_o = ALU_MAXU; // maxu
- {7'b000_0100, 3'b100}: if (RV32B) alu_operator_o = ALU_PACK; // pack
- {7'b010_0100, 3'b100}: if (RV32B) alu_operator_o = ALU_PACKU; // packu
- {7'b000_0100, 3'b111}: if (RV32B) alu_operator_o = ALU_PACKH; // packh
+ {7'b000_0100, 3'b100}: if (RV32B != RV32BNone) alu_operator_o = ALU_PACK; // pack
+ {7'b010_0100, 3'b100}: if (RV32B != RV32BNone) alu_operator_o = ALU_PACKU; // packu
+ {7'b000_0100, 3'b111}: if (RV32B != RV32BNone) alu_operator_o = ALU_PACKH; // packh
- {7'b010_0000, 3'b100}: if (RV32B) alu_operator_o = ALU_XNOR; // xnor
- {7'b010_0000, 3'b110}: if (RV32B) alu_operator_o = ALU_ORN; // orn
- {7'b010_0000, 3'b111}: if (RV32B) alu_operator_o = ALU_ANDN; // andn
-
- // RV32B zbp
- {7'b011_0100, 3'b101}: if (RV32B) alu_operator_o = ALU_GREV; // grev
- {7'b001_0100, 3'b101}: if (RV32B) alu_operator_o = ALU_GORC; // grev
- {7'b000_0100, 3'b001}: if (RV32B) alu_operator_o = ALU_SHFL; // shfl
- {7'b000_0100, 3'b101}: if (RV32B) alu_operator_o = ALU_UNSHFL; // unshfl
+ {7'b010_0000, 3'b100}: if (RV32B != RV32BNone) alu_operator_o = ALU_XNOR; // xnor
+ {7'b010_0000, 3'b110}: if (RV32B != RV32BNone) alu_operator_o = ALU_ORN; // orn
+ {7'b010_0000, 3'b111}: if (RV32B != RV32BNone) alu_operator_o = ALU_ANDN; // andn
// RV32B zbs
- {7'b010_0100, 3'b001}: if (RV32B) alu_operator_o = ALU_SBCLR; // sbclr
- {7'b001_0100, 3'b001}: if (RV32B) alu_operator_o = ALU_SBSET; // sbset
- {7'b011_0100, 3'b001}: if (RV32B) alu_operator_o = ALU_SBINV; // sbinv
- {7'b010_0100, 3'b101}: if (RV32B) alu_operator_o = ALU_SBEXT; // sbext
+ {7'b010_0100, 3'b001}: if (RV32B != RV32BNone) alu_operator_o = ALU_SBCLR; // sbclr
+ {7'b001_0100, 3'b001}: if (RV32B != RV32BNone) alu_operator_o = ALU_SBSET; // sbset
+ {7'b011_0100, 3'b001}: if (RV32B != RV32BNone) alu_operator_o = ALU_SBINV; // sbinv
+ {7'b010_0100, 3'b101}: if (RV32B != RV32BNone) alu_operator_o = ALU_SBEXT; // sbext
+
+ // RV32B zbf
+ {7'b010_0100, 3'b111}: if (RV32B != RV32BNone) alu_operator_o = ALU_BFP; // bfp
+
+ // RV32B zbp
+ {7'b011_0100, 3'b101}: if (RV32B != RV32BNone) alu_operator_o = ALU_GREV; // grev
+ {7'b001_0100, 3'b101}: if (RV32B != RV32BNone) alu_operator_o = ALU_GORC; // gorc
+ {7'b000_0100, 3'b001}: if (RV32B == RV32BFull) alu_operator_o = ALU_SHFL; // shfl
+ {7'b000_0100, 3'b101}: if (RV32B == RV32BFull) alu_operator_o = ALU_UNSHFL; // unshfl
// RV32B zbc
- {7'b000_0101, 3'b001}: if (RV32B) alu_operator_o = ALU_CLMUL; // clmul
- {7'b000_0101, 3'b010}: if (RV32B) alu_operator_o = ALU_CLMULR; // clmulr
- {7'b000_0101, 3'b011}: if (RV32B) alu_operator_o = ALU_CLMULH; // clmulh
+ {7'b000_0101, 3'b001}: if (RV32B == RV32BFull) alu_operator_o = ALU_CLMUL; // clmul
+ {7'b000_0101, 3'b010}: if (RV32B == RV32BFull) alu_operator_o = ALU_CLMULR; // clmulr
+ {7'b000_0101, 3'b011}: if (RV32B == RV32BFull) alu_operator_o = ALU_CLMULH; // clmulh
// RV32B zbe
- {7'b010_0100, 3'b110}: if (RV32B) alu_operator_o = ALU_BDEP; // bdep
- {7'b000_0100, 3'b110}: if (RV32B) alu_operator_o = ALU_BEXT; // bext
- // RV32B zbf
- {7'b010_0100, 3'b111}: if (RV32B) alu_operator_o = ALU_BFP; // bfp
+ {7'b010_0100, 3'b110}: begin
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_BDEP; // bdep
+ alu_multicycle_o = 1'b1;
+ end
+ end
+ {7'b000_0100, 3'b110}: begin
+ if (RV32B == RV32BFull) begin
+ alu_operator_o = ALU_BEXT; // bext
+ alu_multicycle_o = 1'b1;
+ end
+ end
// RV32M instructions, all use the same ALU operation
{7'b000_0001, 3'b000}: begin // mul
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_dummy_instr.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_dummy_instr.sv
index 5d606a3..99b75b6 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_dummy_instr.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_dummy_instr.sv
@@ -54,10 +54,21 @@
logic [6:0] dummy_set;
logic [2:0] dummy_opcode;
logic [31:0] dummy_instr;
+ logic [31:0] dummy_instr_seed_q, dummy_instr_seed_d;
// Shift the LFSR every time we insert an instruction
assign lfsr_en = insert_dummy_instr & id_in_ready_i;
+ assign dummy_instr_seed_d = dummy_instr_seed_q ^ dummy_instr_seed_i;
+
+ always_ff @(posedge clk_i or negedge rst_ni) begin
+ if (!rst_ni) begin
+ dummy_instr_seed_q <= '0;
+ end else if (dummy_instr_seed_en_i) begin
+ dummy_instr_seed_q <= dummy_instr_seed_d;
+ end
+ end
+
prim_lfsr #(
.LfsrDw ( 32 ),
.StateOutDw ( LFSR_OUT_W )
@@ -65,7 +76,7 @@
.clk_i ( clk_i ),
.rst_ni ( rst_ni ),
.seed_en_i ( dummy_instr_seed_en_i ),
- .seed_i ( dummy_instr_seed_i ),
+ .seed_i ( dummy_instr_seed_d ),
.lfsr_en_i ( lfsr_en ),
.entropy_i ( '0 ),
.state_o ( lfsr_state )
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_ex_block.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_ex_block.sv
index 73ffc88..eccc68e 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_ex_block.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_ex_block.sv
@@ -9,10 +9,10 @@
* Execution block: Hosts ALU and MUL/DIV unit
*/
module ibex_ex_block #(
- parameter bit RV32M = 1,
- parameter bit RV32B = 0,
- parameter bit BranchTargetALU = 0,
- parameter MultiplierImplementation = "fast"
+ parameter bit RV32M = 1,
+ parameter ibex_pkg::rv32b_e RV32B = ibex_pkg::RV32BNone,
+ parameter bit BranchTargetALU = 0,
+ parameter MultiplierImplementation = "fast"
) (
input logic clk_i,
input logic rst_ni,
@@ -41,9 +41,9 @@
input logic data_ind_timing_i,
// intermediate val reg
- output logic imd_val_we_o,
- output logic [33:0] imd_val_d_o,
- input logic [33:0] imd_val_q_i,
+ output logic [1:0] imd_val_we_o,
+ output logic [33:0] imd_val_d_o[2],
+ input logic [33:0] imd_val_q_i[2],
// Outputs
output logic [31:0] alu_adder_result_ex_o, // to LSU
@@ -63,10 +63,11 @@
logic alu_cmp_result, alu_is_equal_result;
logic multdiv_valid;
logic multdiv_sel;
- logic [31:0] alu_imd_val_d;
- logic alu_imd_val_we;
- logic [33:0] multdiv_imd_val_d;
- logic multdiv_imd_val_we;
+ logic [31:0] alu_imd_val_q[2];
+ logic [31:0] alu_imd_val_d[2];
+ logic [ 1:0] alu_imd_val_we;
+ logic [33:0] multdiv_imd_val_d[2];
+ logic [ 1:0] multdiv_imd_val_we;
/*
The multdiv_i output is never selected if RV32M=0
@@ -80,8 +81,11 @@
end
// Intermediate Value Register Mux
- assign imd_val_d_o = multdiv_sel ? multdiv_imd_val_d : {2'b0, alu_imd_val_d};
- assign imd_val_we_o = multdiv_sel ? multdiv_imd_val_we : alu_imd_val_we;
+ assign imd_val_d_o[0] = multdiv_sel ? multdiv_imd_val_d[0] : {2'b0, alu_imd_val_d[0]};
+ assign imd_val_d_o[1] = multdiv_sel ? multdiv_imd_val_d[1] : {2'b0, alu_imd_val_d[1]};
+ assign imd_val_we_o = multdiv_sel ? multdiv_imd_val_we : alu_imd_val_we;
+
+ assign alu_imd_val_q = '{imd_val_q_i[0][31:0], imd_val_q_i[1][31:0]};
assign result_ex_o = multdiv_sel ? multdiv_result : alu_result;
@@ -117,7 +121,7 @@
.operand_a_i ( alu_operand_a_i ),
.operand_b_i ( alu_operand_b_i ),
.instr_first_cycle_i ( alu_instr_first_cycle_i ),
- .imd_val_q_i ( imd_val_q_i[31:0] ),
+ .imd_val_q_i ( alu_imd_val_q ),
.imd_val_we_o ( alu_imd_val_we ),
.imd_val_d_o ( alu_imd_val_d ),
.multdiv_operand_a_i ( multdiv_alu_operand_a ),
@@ -218,6 +222,6 @@
// Multiplier/divider may require multiple cycles. The ALU output is valid in the same cycle
// unless the intermediate result register is being written (which indicates this isn't the
// final cycle of ALU operation).
- assign ex_valid_o = multdiv_sel ? multdiv_valid : !alu_imd_val_we;
+ assign ex_valid_o = multdiv_sel ? multdiv_valid : ~(|alu_imd_val_we);
endmodule
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_icache.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_icache.sv
index e606a41..3a28636 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_icache.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_icache.sv
@@ -56,9 +56,6 @@
input logic icache_inval_i,
output logic busy_o
);
-
- // NOTE RTL IS DRAFT
-
// Local constants
localparam int unsigned ADDR_W = 32;
// Number of fill buffers (must be >= 2)
@@ -427,8 +424,11 @@
assign ecc_err_ic1 = lookup_valid_ic1 & ((|data_err_ic1) | (|tag_err_ic1));
// Error correction
- // The way(s) producing the error will be invalidated in the next cycle.
- assign ecc_correction_ways_d = tag_err_ic1 | (tag_match_ic1 & {NumWays{|data_err_ic1}});
+ // All ways will be invalidated on a tag error to prevent X-propagation from data_err_ic1 on
+ // spurious hits. Also prevents the same line being allocated twice when there was a true
+ // hit and a spurious hit.
+ assign ecc_correction_ways_d = {NumWays{|tag_err_ic1}} |
+ (tag_match_ic1 & {NumWays{|data_err_ic1}});
assign ecc_correction_write_d = ecc_err_ic1;
always_ff @(posedge clk_i or negedge rst_ni) begin
@@ -1019,4 +1019,22 @@
`ASSERT_KNOWN(TagHitKnown, lookup_valid_ic1 & tag_hit_ic1)
`ASSERT_KNOWN(TagInvalidKnown, lookup_valid_ic1 & tag_invalid_ic1)
+ // This is only used for the Yosys-based formal flow. Once we have working bind support, we can
+ // get rid of it.
+`ifdef FORMAL
+ `ifdef YOSYS
+ // Unfortunately, Yosys doesn't support passing unpacked arrays as ports. Explicitly pack up the
+ // signals we need.
+ logic [NUM_FB-1:0][ADDR_W-1:0] packed_fill_addr_q;
+ always_comb begin
+ for (int i = 0; i < NUM_FB; i++) begin
+ packed_fill_addr_q[i][ADDR_W-1:0] = fill_addr_q[i];
+ end
+ end
+
+ `include "formal_tb_frag.svh"
+ `endif
+`endif
+
+
endmodule
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_id_stage.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_id_stage.sv
index ee63142..bba4c2a 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_id_stage.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_id_stage.sv
@@ -17,13 +17,13 @@
`include "prim_assert.sv"
module ibex_id_stage #(
- parameter bit RV32E = 0,
- parameter bit RV32M = 1,
- parameter bit RV32B = 0,
- parameter bit DataIndTiming = 1'b0,
- parameter bit BranchTargetALU = 0,
- parameter bit SpecBranch = 0,
- parameter bit WritebackStage = 0
+ parameter bit RV32E = 0,
+ parameter bit RV32M = 1,
+ parameter ibex_pkg::rv32b_e RV32B = ibex_pkg::RV32BNone,
+ parameter bit DataIndTiming = 1'b0,
+ parameter bit BranchTargetALU = 0,
+ parameter bit SpecBranch = 0,
+ parameter bit WritebackStage = 0
) (
input logic clk_i,
input logic rst_ni,
@@ -68,9 +68,9 @@
output logic [31:0] alu_operand_b_ex_o,
// Multicycle Operation Stage Register
- input logic imd_val_we_ex_i,
- input logic [33:0] imd_val_d_ex_i,
- output logic [33:0] imd_val_q_ex_o,
+ input logic [1:0] imd_val_we_ex_i,
+ input logic [33:0] imd_val_d_ex_i[2],
+ output logic [33:0] imd_val_q_ex_o[2],
// Branch target ALU
output logic [31:0] bt_a_operand_o,
@@ -247,7 +247,7 @@
logic alu_multicycle_dec;
logic stall_alu;
- logic [33:0] imd_val_q;
+ logic [33:0] imd_val_q[2];
op_a_sel_e bt_a_mux_sel;
imm_b_sel_e bt_b_mux_sel;
@@ -379,11 +379,13 @@
// Multicycle Operation Stage Register //
/////////////////////////////////////////
- always_ff @(posedge clk_i or negedge rst_ni) begin : intermediate_val_reg
- if (!rst_ni) begin
- imd_val_q <= '0;
- end else if (imd_val_we_ex_i) begin
- imd_val_q <= imd_val_d_ex_i;
+ for (genvar i=0; i<2; i++) begin : gen_intermediate_val_reg
+ always_ff @(posedge clk_i or negedge rst_ni) begin : intermediate_val_reg
+ if (!rst_ni) begin
+ imd_val_q[i] <= '0;
+ end else if (imd_val_we_ex_i[i]) begin
+ imd_val_q[i] <= imd_val_d_ex_i[i];
+ end
end
end
@@ -875,7 +877,8 @@
// precise exceptions)
// * There is a load/store request not being granted or which is unaligned and waiting to issue
// a second request (needs to stay in ID for the address calculation)
- assign stall_mem = instr_valid_i & (outstanding_memory_access | (lsu_req_dec & ~lsu_req_done_i));
+ assign stall_mem = instr_valid_i &
+ (outstanding_memory_access | (lsu_req_dec & ~lsu_req_done_i));
// If we stall a load in ID for any reason, it must not make an LSU request
// (otherwise we might issue two requests for the same instruction)
@@ -913,7 +916,8 @@
// Stall ID/EX as instruction in ID/EX cannot proceed to writeback yet
assign stall_wb = en_wb_o & ~ready_wb_i;
- assign perf_dside_wait_o = instr_valid_i & ~instr_kill & (outstanding_memory_access | stall_ld_hz);
+ assign perf_dside_wait_o = instr_valid_i & ~instr_kill &
+ (outstanding_memory_access | stall_ld_hz);
end else begin : gen_no_stall_mem
assign multicycle_done = lsu_req_dec ? lsu_resp_valid_i : ex_valid_i;
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_fast.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_fast.sv
index 53fd691..617bb51 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_fast.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_fast.sv
@@ -35,9 +35,9 @@
output logic [32:0] alu_operand_a_o,
output logic [32:0] alu_operand_b_o,
- input logic [33:0] imd_val_q_i,
- output logic [33:0] imd_val_d_o,
- output logic imd_val_we_o,
+ input logic [33:0] imd_val_q_i[2],
+ output logic [33:0] imd_val_d_o[2],
+ output logic [1:0] imd_val_we_o,
input logic multdiv_ready_id_i,
@@ -99,13 +99,11 @@
if (!rst_ni) begin
div_counter_q <= '0;
md_state_q <= MD_IDLE;
- op_denominator_q <= '0;
op_numerator_q <= '0;
op_quotient_q <= '0;
div_by_zero_q <= '0;
end else if (div_en_internal) begin
div_counter_q <= div_counter_d;
- op_denominator_q <= op_denominator_d;
op_numerator_q <= op_numerator_d;
op_quotient_q <= op_quotient_d;
md_state_q <= md_state_d;
@@ -113,18 +111,24 @@
end
end
-
`ASSERT_KNOWN(DivEnKnown, div_en_internal);
`ASSERT_KNOWN(MultEnKnown, mult_en_internal);
`ASSERT_KNOWN(MultDivEnKnown, multdiv_en);
assign multdiv_en = mult_en_internal | div_en_internal;
- assign imd_val_d_o = div_sel_i ? op_remainder_d : mac_res_d;
- assign imd_val_we_o = multdiv_en;
+ // Intermediate value register shared with ALU
+ assign imd_val_d_o[0] = div_sel_i ? op_remainder_d : mac_res_d;
+ assign imd_val_we_o[0] = multdiv_en;
+
+ assign imd_val_d_o[1] = {2'b0, op_denominator_d};
+ assign imd_val_we_o[1] = div_en_internal;
+ assign op_denominator_q = imd_val_q_i[1][31:0];
+ logic [1:0] unused_imd_val;
+ assign unused_imd_val = imd_val_q_i[1][33:32];
assign signed_mult = (signed_mode_i != 2'b00);
- assign multdiv_result_o = div_sel_i ? imd_val_q_i[31:0] : mac_res_d[31:0];
+ assign multdiv_result_o = div_sel_i ? imd_val_q_i[0][31:0] : mac_res_d[31:0];
// The single cycle multiplier uses three 17 bit multipliers to compute MUL instructions in a
// single cycle and MULH instructions in two cycles.
@@ -170,8 +174,8 @@
assign mult2_op_b = op_b_i[`OP_H];
// used in MULH
- assign accum[17:0] = imd_val_q_i[33:16];
- assign accum[33:18] = {16{signed_mult & imd_val_q_i[33]}};
+ assign accum[17:0] = imd_val_q_i[0][33:16];
+ assign accum[33:18] = {16{signed_mult & imd_val_q_i[0][33]}};
always_comb begin
// Default values == MULL
@@ -268,7 +272,7 @@
mult_op_b = op_b_i[`OP_L];
sign_a = 1'b0;
sign_b = 1'b0;
- accum = imd_val_q_i;
+ accum = imd_val_q_i[0];
mac_res_d = mac_res;
mult_state_d = mult_state_q;
mult_valid = 1'b0;
@@ -293,10 +297,10 @@
mult_op_b = op_b_i[`OP_H];
sign_a = 1'b0;
sign_b = signed_mode_i[1] & op_b_i[31];
- // result of AL*BL (in imd_val_q_i) always unsigned with no carry, so carries_q always 00
- accum = {18'b0, imd_val_q_i[31:16]};
+ // result of AL*BL (in imd_val_q_i[0]) always unsigned with no carry, so carries_q always 00
+ accum = {18'b0, imd_val_q_i[0][31:16]};
if (operator_i == MD_OP_MULL) begin
- mac_res_d = {2'b0, mac_res[`OP_L], imd_val_q_i[`OP_L]};
+ mac_res_d = {2'b0, mac_res[`OP_L], imd_val_q_i[0][`OP_L]};
end else begin
// MD_OP_MULH
mac_res_d = mac_res;
@@ -311,15 +315,15 @@
sign_a = signed_mode_i[0] & op_a_i[31];
sign_b = 1'b0;
if (operator_i == MD_OP_MULL) begin
- accum = {18'b0, imd_val_q_i[31:16]};
- mac_res_d = {2'b0, mac_res[15:0], imd_val_q_i[15:0]};
+ accum = {18'b0, imd_val_q_i[0][31:16]};
+ mac_res_d = {2'b0, mac_res[15:0], imd_val_q_i[0][15:0]};
mult_valid = 1'b1;
// Note no state transition will occur if mult_hold is set
mult_state_d = ALBL;
mult_hold = ~multdiv_ready_id_i;
end else begin
- accum = imd_val_q_i;
+ accum = imd_val_q_i[0];
mac_res_d = mac_res;
mult_state_d = AHBH;
end
@@ -332,8 +336,8 @@
mult_op_b = op_b_i[`OP_H];
sign_a = signed_mode_i[0] & op_a_i[31];
sign_b = signed_mode_i[1] & op_b_i[31];
- accum[17: 0] = imd_val_q_i[33:16];
- accum[33:18] = {16{signed_mult & imd_val_q_i[33]}};
+ accum[17: 0] = imd_val_q_i[0][33:16];
+ accum[33:18] = {16{signed_mult & imd_val_q_i[0][33]}};
// result of AH*BL is not signed only if signed_mode_i == 2'b00
mac_res_d = mac_res;
mult_valid = 1'b1;
@@ -366,7 +370,7 @@
// Divider
assign res_adder_h = alu_adder_ext_i[33:1];
- assign next_remainder = is_greater_equal ? res_adder_h[31:0] : imd_val_q_i[31:0];
+ assign next_remainder = is_greater_equal ? res_adder_h[31:0] : imd_val_q_i[0][31:0];
assign next_quotient = is_greater_equal ? {1'b0, op_quotient_q} | {1'b0, one_shift} :
{1'b0, op_quotient_q};
@@ -376,10 +380,10 @@
// Remainder - Divisor. If Remainder - Divisor >= 0, is_greater_equal is equal to 1,
// the next Remainder is Remainder - Divisor contained in res_adder_h and the
always_comb begin
- if ((imd_val_q_i[31] ^ op_denominator_q[31]) == 1'b0) begin
+ if ((imd_val_q_i[0][31] ^ op_denominator_q[31]) == 1'b0) begin
is_greater_equal = (res_adder_h[31] == 1'b0);
end else begin
- is_greater_equal = imd_val_q_i[31];
+ is_greater_equal = imd_val_q_i[0][31];
end
end
@@ -391,7 +395,7 @@
always_comb begin
div_counter_d = div_counter_q - 5'h1;
- op_remainder_d = imd_val_q_i;
+ op_remainder_d = imd_val_q_i[0];
op_quotient_d = op_quotient_q;
md_state_d = md_state_q;
op_numerator_d = op_numerator_q;
@@ -457,13 +461,13 @@
op_quotient_d = next_quotient[31:0];
md_state_d = (div_counter_q == 5'd1) ? MD_LAST : MD_COMP;
// Division
- alu_operand_a_o = {imd_val_q_i[31:0], 1'b1}; // it contains the remainder
+ alu_operand_a_o = {imd_val_q_i[0][31:0], 1'b1}; // it contains the remainder
alu_operand_b_o = {~op_denominator_q[31:0], 1'b1}; // -denominator two's complement
end
MD_LAST: begin
if (operator_i == MD_OP_DIV) begin
- // this time we save the quotient in op_remainder_d (i.e. imd_val_q_i) since
+ // this time we save the quotient in op_remainder_d (i.e. imd_val_q_i[0]) since
// we no longer need the remainder
op_remainder_d = {1'b0, next_quotient};
end else begin
@@ -471,7 +475,7 @@
op_remainder_d = {2'b0, next_remainder[31:0]};
end
// Division
- alu_operand_a_o = {imd_val_q_i[31:0], 1'b1}; // it contains the remainder
+ alu_operand_a_o = {imd_val_q_i[0][31:0], 1'b1}; // it contains the remainder
alu_operand_b_o = {~op_denominator_q[31:0], 1'b1}; // -denominator two's complement
md_state_d = MD_CHANGE_SIGN;
@@ -480,13 +484,13 @@
MD_CHANGE_SIGN: begin
md_state_d = MD_FINISH;
if (operator_i == MD_OP_DIV) begin
- op_remainder_d = (div_change_sign) ? {2'h0, alu_adder_i} : imd_val_q_i;
+ op_remainder_d = (div_change_sign) ? {2'h0, alu_adder_i} : imd_val_q_i[0];
end else begin
- op_remainder_d = (rem_change_sign) ? {2'h0, alu_adder_i} : imd_val_q_i;
+ op_remainder_d = (rem_change_sign) ? {2'h0, alu_adder_i} : imd_val_q_i[0];
end
// ABS(Quotient) = 0 - Quotient (or Remainder)
alu_operand_a_o = {32'h0 , 1'b1};
- alu_operand_b_o = {~imd_val_q_i[31:0], 1'b1};
+ alu_operand_b_o = {~imd_val_q_i[0][31:0], 1'b1};
end
MD_FINISH: begin
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_slow.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_slow.sv
index b3038cb..bcd04b0 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_slow.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_multdiv_slow.sv
@@ -31,9 +31,9 @@
output logic [32:0] alu_operand_a_o,
output logic [32:0] alu_operand_b_o,
- input logic [33:0] imd_val_q_i,
- output logic [33:0] imd_val_d_o,
- output logic imd_val_we_o,
+ input logic [33:0] imd_val_q_i[2],
+ output logic [33:0] imd_val_d_o[2],
+ output logic [1:0] imd_val_we_o,
input logic multdiv_ready_id_i,
@@ -50,7 +50,8 @@
md_fsm_e md_state_q, md_state_d;
logic [32:0] accum_window_q, accum_window_d;
- logic unused_imd_val;
+ logic unused_imd_val0;
+ logic [ 1:0] unused_imd_val1;
logic [32:0] res_adder_l;
logic [32:0] res_adder_h;
@@ -81,11 +82,16 @@
// ALU Operand MUX //
/////////////////////
- // Use shared intermediate value register in id_stage for accum_window
- assign imd_val_d_o = {1'b0,accum_window_d};
- assign imd_val_we_o = ~multdiv_hold;
- assign accum_window_q = imd_val_q_i[32:0];
- assign unused_imd_val = imd_val_q_i[33];
+ // Intermediate value register shared with ALU
+ assign imd_val_d_o[0] = {1'b0,accum_window_d};
+ assign imd_val_we_o[0] = ~multdiv_hold;
+ assign accum_window_q = imd_val_q_i[0][32:0];
+ assign unused_imd_val0 = imd_val_q_i[0][33];
+
+ assign imd_val_d_o[1] = {2'b00, op_numerator_d};
+ assign imd_val_we_o[1] = multdiv_en;
+ assign op_numerator_q = imd_val_q_i[1][31:0];
+ assign unused_imd_val1 = imd_val_q_i[1][33:32];
always_comb begin
alu_operand_a_o = accum_window_q;
@@ -328,14 +334,12 @@
multdiv_count_q <= 5'h0;
op_b_shift_q <= 33'h0;
op_a_shift_q <= 33'h0;
- op_numerator_q <= 32'h0;
md_state_q <= MD_IDLE;
div_by_zero_q <= 1'b0;
end else if (multdiv_en) begin
multdiv_count_q <= multdiv_count_d;
op_b_shift_q <= op_b_shift_d;
op_a_shift_q <= op_a_shift_d;
- op_numerator_q <= op_numerator_d;
md_state_q <= md_state_d;
div_by_zero_q <= div_by_zero_d;
end
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_pkg.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_pkg.sv
index 3ecd401..bb086ec 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_pkg.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_pkg.sv
@@ -8,6 +8,15 @@
*/
package ibex_pkg;
+//////////////////////////
+// RV32B Parameter Enum //
+//////////////////////////
+
+typedef enum integer {
+ RV32BNone,
+ RV32BBalanced,
+ RV32BFull
+} rv32b_e;
/////////////
// Opcodes //
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_tracer.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_tracer.sv
index 8a9aaf6..0e3e91c 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_tracer.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_tracer.sv
@@ -85,11 +85,11 @@
logic insn_is_compressed;
// Data items accessed during this instruction
- localparam RS1 = (1 << 0);
- localparam RS2 = (1 << 1);
- localparam RS3 = (1 << 2);
- localparam RD = (1 << 3);
- localparam MEM = (1 << 4);
+ localparam logic [4:0] RS1 = (1 << 0);
+ localparam logic [4:0] RS2 = (1 << 1);
+ localparam logic [4:0] RS3 = (1 << 2);
+ localparam logic [4:0] RD = (1 << 3);
+ localparam logic [4:0] MEM = (1 << 4);
logic [4:0] data_accessed;
function automatic void printbuffer_dumpline();
@@ -130,10 +130,10 @@
if ((data_accessed & MEM) != 0) begin
$fwrite(file_handle, " PA:0x%08x", rvfi_mem_addr);
- if (rvfi_mem_rmask != 4'b000) begin
+ if (rvfi_mem_rmask != 4'b0000) begin
$fwrite(file_handle, " store:0x%08x", rvfi_mem_wdata);
end
- if (rvfi_mem_wmask != 4'b000) begin
+ if (rvfi_mem_wmask != 4'b0000) begin
$fwrite(file_handle, " load:0x%08x", rvfi_mem_rdata);
end
end
diff --git a/hw/vendor/lowrisc_ibex/rtl/ibex_tracer_pkg.sv b/hw/vendor/lowrisc_ibex/rtl/ibex_tracer_pkg.sv
index 9d11c88..d42d4c9 100644
--- a/hw/vendor/lowrisc_ibex/rtl/ibex_tracer_pkg.sv
+++ b/hw/vendor/lowrisc_ibex/rtl/ibex_tracer_pkg.sv
@@ -11,50 +11,50 @@
parameter logic [1:0] OPCODE_C2 = 2'b10;
// instruction masks (for tracer)
-parameter logic [31:0] INSN_LUI = { 25'b?, {OPCODE_LUI } };
-parameter logic [31:0] INSN_AUIPC = { 25'b?, {OPCODE_AUIPC} };
-parameter logic [31:0] INSN_JAL = { 25'b?, {OPCODE_JAL } };
-parameter logic [31:0] INSN_JALR = { 17'b?, 3'b000, 5'b?, {OPCODE_JALR } };
+parameter logic [31:0] INSN_LUI = { 25'h?, {OPCODE_LUI } };
+parameter logic [31:0] INSN_AUIPC = { 25'h?, {OPCODE_AUIPC} };
+parameter logic [31:0] INSN_JAL = { 25'h?, {OPCODE_JAL } };
+parameter logic [31:0] INSN_JALR = { 17'h?, 3'b000, 5'h?, {OPCODE_JALR } };
// BRANCH
-parameter logic [31:0] INSN_BEQ = { 17'b?, 3'b000, 5'b?, {OPCODE_BRANCH} };
-parameter logic [31:0] INSN_BNE = { 17'b?, 3'b001, 5'b?, {OPCODE_BRANCH} };
-parameter logic [31:0] INSN_BLT = { 17'b?, 3'b100, 5'b?, {OPCODE_BRANCH} };
-parameter logic [31:0] INSN_BGE = { 17'b?, 3'b101, 5'b?, {OPCODE_BRANCH} };
-parameter logic [31:0] INSN_BLTU = { 17'b?, 3'b110, 5'b?, {OPCODE_BRANCH} };
-parameter logic [31:0] INSN_BGEU = { 17'b?, 3'b111, 5'b?, {OPCODE_BRANCH} };
-parameter logic [31:0] INSN_BALL = { 17'b?, 3'b010, 5'b?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BEQ = { 17'h?, 3'b000, 5'h?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BNE = { 17'h?, 3'b001, 5'h?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BLT = { 17'h?, 3'b100, 5'h?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BGE = { 17'h?, 3'b101, 5'h?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BLTU = { 17'h?, 3'b110, 5'h?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BGEU = { 17'h?, 3'b111, 5'h?, {OPCODE_BRANCH} };
+parameter logic [31:0] INSN_BALL = { 17'h?, 3'b010, 5'h?, {OPCODE_BRANCH} };
// OPIMM
-parameter logic [31:0] INSN_ADDI = { 17'b?, 3'b000, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SLTI = { 17'b?, 3'b010, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SLTIU = { 17'b?, 3'b011, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_XORI = { 17'b?, 3'b100, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_ORI = { 17'b?, 3'b110, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_ANDI = { 17'b?, 3'b111, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SLLI = { 7'b0000000, 10'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SRLI = { 7'b0000000, 10'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SRAI = { 7'b0100000, 10'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_ADDI = { 17'h?, 3'b000, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SLTI = { 17'h?, 3'b010, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SLTIU = { 17'h?, 3'b011, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_XORI = { 17'h?, 3'b100, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_ORI = { 17'h?, 3'b110, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_ANDI = { 17'h?, 3'b111, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SLLI = { 7'b0000000, 10'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SRLI = { 7'b0000000, 10'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SRAI = { 7'b0100000, 10'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
// OP
-parameter logic [31:0] INSN_ADD = { 7'b0000000, 10'b?, 3'b000, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SUB = { 7'b0100000, 10'b?, 3'b000, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SLL = { 7'b0000000, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SLT = { 7'b0000000, 10'b?, 3'b010, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SLTU = { 7'b0000000, 10'b?, 3'b011, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_XOR = { 7'b0000000, 10'b?, 3'b100, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SRL = { 7'b0000000, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SRA = { 7'b0100000, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_OR = { 7'b0000000, 10'b?, 3'b110, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_AND = { 7'b0000000, 10'b?, 3'b111, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_ADD = { 7'b0000000, 10'h?, 3'b000, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SUB = { 7'b0100000, 10'h?, 3'b000, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SLL = { 7'b0000000, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SLT = { 7'b0000000, 10'h?, 3'b010, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SLTU = { 7'b0000000, 10'h?, 3'b011, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_XOR = { 7'b0000000, 10'h?, 3'b100, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SRL = { 7'b0000000, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SRA = { 7'b0100000, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_OR = { 7'b0000000, 10'h?, 3'b110, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_AND = { 7'b0000000, 10'h?, 3'b111, 5'h?, {OPCODE_OP} };
// SYSTEM
-parameter logic [31:0] INSN_CSRRW = { 17'b?, 3'b001, 5'b?, {OPCODE_SYSTEM} };
-parameter logic [31:0] INSN_CSRRS = { 17'b?, 3'b010, 5'b?, {OPCODE_SYSTEM} };
-parameter logic [31:0] INSN_CSRRC = { 17'b?, 3'b011, 5'b?, {OPCODE_SYSTEM} };
-parameter logic [31:0] INSN_CSRRWI = { 17'b?, 3'b101, 5'b?, {OPCODE_SYSTEM} };
-parameter logic [31:0] INSN_CSRRSI = { 17'b?, 3'b110, 5'b?, {OPCODE_SYSTEM} };
-parameter logic [31:0] INSN_CSRRCI = { 17'b?, 3'b111, 5'b?, {OPCODE_SYSTEM} };
+parameter logic [31:0] INSN_CSRRW = { 17'h?, 3'b001, 5'h?, {OPCODE_SYSTEM} };
+parameter logic [31:0] INSN_CSRRS = { 17'h?, 3'b010, 5'h?, {OPCODE_SYSTEM} };
+parameter logic [31:0] INSN_CSRRC = { 17'h?, 3'b011, 5'h?, {OPCODE_SYSTEM} };
+parameter logic [31:0] INSN_CSRRWI = { 17'h?, 3'b101, 5'h?, {OPCODE_SYSTEM} };
+parameter logic [31:0] INSN_CSRRSI = { 17'h?, 3'b110, 5'h?, {OPCODE_SYSTEM} };
+parameter logic [31:0] INSN_CSRRCI = { 17'h?, 3'b111, 5'h?, {OPCODE_SYSTEM} };
parameter logic [31:0] INSN_ECALL = { 12'b000000000000, 13'b0, {OPCODE_SYSTEM} };
parameter logic [31:0] INSN_EBREAK = { 12'b000000000001, 13'b0, {OPCODE_SYSTEM} };
parameter logic [31:0] INSN_MRET = { 12'b001100000010, 13'b0, {OPCODE_SYSTEM} };
@@ -62,241 +62,241 @@
parameter logic [31:0] INSN_WFI = { 12'b000100000101, 13'b0, {OPCODE_SYSTEM} };
// RV32M
-parameter logic [31:0] INSN_DIV = { 7'b0000001, 10'b?, 3'b100, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_DIVU = { 7'b0000001, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_REM = { 7'b0000001, 10'b?, 3'b110, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_REMU = { 7'b0000001, 10'b?, 3'b111, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PMUL = { 7'b0000001, 10'b?, 3'b000, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PMUH = { 7'b0000001, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PMULHSU = { 7'b0000001, 10'b?, 3'b010, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PMULHU = { 7'b0000001, 10'b?, 3'b011, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_DIV = { 7'b0000001, 10'h?, 3'b100, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_DIVU = { 7'b0000001, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_REM = { 7'b0000001, 10'h?, 3'b110, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_REMU = { 7'b0000001, 10'h?, 3'b111, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PMUL = { 7'b0000001, 10'h?, 3'b000, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PMUH = { 7'b0000001, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PMULHSU = { 7'b0000001, 10'h?, 3'b010, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PMULHU = { 7'b0000001, 10'h?, 3'b011, 5'h?, {OPCODE_OP} };
// RV32B
// ZBB
-parameter logic [31:0] INSN_SLOI = { 5'b00100 , 12'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SROI = { 5'b00100 , 12'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_RORI = { 5'b01100 , 12'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CLZ = { 12'b011000000000, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CTZ = { 12'b011000000001, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_PCNT = { 12'b011000000010, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SEXTB = { 12'b011000000100, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SEXTH = { 12'b011000000101, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SLOI = { 5'b00100 , 12'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SROI = { 5'b00100 , 12'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_RORI = { 5'b01100 , 12'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CLZ = { 12'b011000000000, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CTZ = { 12'b011000000001, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_PCNT = { 12'b011000000010, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SEXTB = { 12'b011000000100, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SEXTH = { 12'b011000000101, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
// sext -- pseudoinstruction: andi rd, rs 255
-parameter logic [31:0] INSN_ZEXTB = { 4'b0000, 8'b11111111, 5'b?, 3'b111, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_ZEXTB = { 4'b0000, 8'b11111111, 5'h?, 3'b111, 5'h?, {OPCODE_OP_IMM} };
// sext -- pseudoinstruction: pack rd, rs zero
-parameter logic [31:0] INSN_ZEXTH = { 7'b0000100, 5'b00000, 5'b?, 3'b100, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_ZEXTH = { 7'b0000100, 5'b00000, 5'h?, 3'b100, 5'h?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SLO = { 7'b0010000, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SRO = { 7'b0010000, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_ROL = { 7'b0110000, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_ROR = { 7'b0110000, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_MIN = { 7'b0000101, 10'b?, 3'b100, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_MAX = { 7'b0000101, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_MINU = { 7'b0000101, 10'b?, 3'b110, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_MAXU = { 7'b0000101, 10'b?, 3'b111, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_XNOR = { 7'b0100000, 10'b?, 3'b100, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_ORN = { 7'b0100000, 10'b?, 3'b110, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_ANDN = { 7'b0100000, 10'b?, 3'b111, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PACK = { 7'b0000100, 10'b?, 3'b100, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PACKU = { 7'b0100100, 10'b?, 3'b100, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_PACKH = { 7'b0000100, 10'b?, 3'b111, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SLO = { 7'b0010000, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SRO = { 7'b0010000, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_ROL = { 7'b0110000, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_ROR = { 7'b0110000, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_MIN = { 7'b0000101, 10'h?, 3'b100, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_MAX = { 7'b0000101, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_MINU = { 7'b0000101, 10'h?, 3'b110, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_MAXU = { 7'b0000101, 10'h?, 3'b111, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_XNOR = { 7'b0100000, 10'h?, 3'b100, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_ORN = { 7'b0100000, 10'h?, 3'b110, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_ANDN = { 7'b0100000, 10'h?, 3'b111, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PACK = { 7'b0000100, 10'h?, 3'b100, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PACKU = { 7'b0100100, 10'h?, 3'b100, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_PACKH = { 7'b0000100, 10'h?, 3'b111, 5'h?, {OPCODE_OP} };
// ZBS
-parameter logic [31:0] INSN_SBCLRI = { 5'b01001, 12'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SBSETI = { 5'b00101, 12'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SBINVI = { 5'b01101, 12'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SBEXTI = { 5'b01001, 12'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SBCLRI = { 5'b01001, 12'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SBSETI = { 5'b00101, 12'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SBINVI = { 5'b01101, 12'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SBEXTI = { 5'b01001, 12'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_SBCLR = { 7'b0100100, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SBSET = { 7'b0010100, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SBINV = { 7'b0110100, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SBEXT = { 7'b0100100, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SBCLR = { 7'b0100100, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SBSET = { 7'b0010100, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SBINV = { 7'b0110100, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SBEXT = { 7'b0100100, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
// ZBP
// grevi
-parameter logic [31:0] INSN_GREVI = { 5'b01101, 12'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_GREVI = { 5'b01101, 12'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
// grevi -- pseudo-instructions
parameter logic [31:0] INSN_REV_P =
- { 5'b01101, 2'b?, 5'b00001, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b00001, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV2_N =
- { 5'b01101, 2'b?, 5'b00010, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b00010, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV_N =
- { 5'b01101, 2'b?, 5'b00011, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b00011, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV4_B =
- { 5'b01101, 2'b?, 5'b00100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b00100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV2_B =
- { 5'b01101, 2'b?, 5'b00110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b00110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV_B =
- { 5'b01101, 2'b?, 5'b00111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b00111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV8_H =
- { 5'b01101, 2'b?, 5'b01000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b01000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV4_H =
- { 5'b01101, 2'b?, 5'b01100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b01100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV2_H =
- { 5'b01101, 2'b?, 5'b01110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b01110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV_H =
- { 5'b01101, 2'b?, 5'b01111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b01111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV16 =
- { 5'b01101, 2'b?, 5'b01000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b01000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV8 =
- { 5'b01101, 2'b?, 5'b11000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b11000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV4 =
- { 5'b01101, 2'b?, 5'b11100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b11100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV2 =
- { 5'b01101, 2'b?, 5'b11110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b11110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_REV =
- { 5'b01101, 2'b?, 5'b11111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b01101, 2'h?, 5'b11111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
// gorci
-parameter logic [31:0] INSN_GORCI = { 5'b00101, 12'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_GORCI = { 5'b00101, 12'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
// gorci -- pseudo-instructions
parameter logic [31:0] INSN_ORC_P =
- { 5'b00101, 2'b?, 5'b00001, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b00001, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC2_N =
- { 5'b00101, 2'b?, 5'b00010, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b00010, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC_N =
- { 5'b00101, 2'b?, 5'b00011, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b00011, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC4_B =
- { 5'b00101, 2'b?, 5'b00100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b00100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC2_B =
- { 5'b00101, 2'b?, 5'b00110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b00110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC_B =
- { 5'b00101, 2'b?, 5'b00111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b00111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC8_H =
- { 5'b00101, 2'b?, 5'b01000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b01000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC4_H =
- { 5'b00101, 2'b?, 5'b01100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b01100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC2_H =
- { 5'b00101, 2'b?, 5'b01110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b01110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC_H =
- { 5'b00101, 2'b?, 5'b01111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b01111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC16 =
- { 5'b00101, 2'b?, 5'b01000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b01000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC8 =
- { 5'b00101, 2'b?, 5'b11000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b11000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC4 =
- { 5'b00101, 2'b?, 5'b11100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b11100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC2 =
- { 5'b00101, 2'b?, 5'b11110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b11110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ORC =
- { 5'b00101, 2'b?, 5'b11111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00101, 2'h?, 5'b11111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
// shfli
-parameter logic [31:0] INSN_SHFLI = { 6'b000010, 11'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_SHFLI = { 6'b000010, 11'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
// shfli -- pseudo-instructions
parameter logic [31:0] INSN_ZIP_N =
- { 5'b00010, 3'b?, 4'b0001, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0001, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP2_B =
- { 5'b00010, 3'b?, 4'b0010, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0010, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP_B =
- { 5'b00010, 3'b?, 4'b0011, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0011, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP4_H =
- { 5'b00010, 3'b?, 4'b0100, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0100, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP2_H =
- { 5'b00010, 3'b?, 4'b0110, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0110, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP_H =
- { 5'b00010, 3'b?, 4'b0111, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0111, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP8 =
- { 5'b00010, 3'b?, 4'b1000, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1000, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP4 =
- { 5'b00010, 3'b?, 4'b1100, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1100, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP2 =
- { 5'b00010, 3'b?, 4'b1110, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1110, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_ZIP =
- { 5'b00010, 3'b?, 4'b1111, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1111, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
// unshfli
-parameter logic [31:0] INSN_UNSHFLI = { 6'b000010, 11'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_UNSHFLI = { 6'b000010, 11'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
// unshfli -- pseudo-instructions
parameter logic [31:0] INSN_UNZIP_N =
- { 5'b00010, 3'b?, 4'b0001, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0001, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP2_B =
- { 5'b00010, 3'b?, 4'b0010, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0010, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP_B =
- { 5'b00010, 3'b?, 4'b0011, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0011, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP4_H =
- { 5'b00010, 3'b?, 4'b0100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP2_H =
- { 5'b00010, 3'b?, 4'b0110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP_H =
- { 5'b00010, 3'b?, 4'b0111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b0111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP8 =
- { 5'b00010, 3'b?, 4'b1000, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1000, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP4 =
- { 5'b00010, 3'b?, 4'b1100, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1100, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP2 =
- { 5'b00010, 3'b?, 4'b1110, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1110, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
parameter logic [31:0] INSN_UNZIP =
- { 5'b00010, 3'b?, 4'b1111, 5'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+ { 5'b00010, 3'h?, 4'b1111, 5'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_GREV = { 7'b0110100, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_GORC = { 7'b0010100, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_SHFL = { 7'b0000100, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_UNSHFL = { 7'b0000100, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_GREV = { 7'b0110100, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_GORC = { 7'b0010100, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_SHFL = { 7'b0000100, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_UNSHFL = { 7'b0000100, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
// ZBE
-parameter logic [31:0] INSN_BDEP = {7'b0100100, 10'b?, 3'b110, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_BEXT = {7'b0000100, 10'b?, 3'b110, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_BDEP = {7'b0100100, 10'h?, 3'b110, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_BEXT = {7'b0000100, 10'h?, 3'b110, 5'h?, {OPCODE_OP} };
// ZBT
-parameter logic [31:0] INSN_FSRI = { 5'b?, 1'b1, 11'b?, 3'b101, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_FSRI = { 5'h?, 1'b1, 11'h?, 3'b101, 5'h?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CMIX = {5'b?, 2'b11, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_CMOV = {5'b?, 2'b11, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_FSL = {5'b?, 2'b10, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_FSR = {5'b?, 2'b10, 10'b?, 3'b101, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_CMIX = {5'h?, 2'b11, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_CMOV = {5'h?, 2'b11, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_FSL = {5'h?, 2'b10, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_FSR = {5'h?, 2'b10, 10'h?, 3'b101, 5'h?, {OPCODE_OP} };
// ZBF
-parameter logic [31:0] INSN_BFP = {7'b0100100, 10'b?, 3'b111, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_BFP = {7'b0100100, 10'h?, 3'b111, 5'h?, {OPCODE_OP} };
// ZBC
-parameter logic [31:0] INSN_CLMUL = {7'b0000101, 10'b?, 3'b001, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_CLMULR = {7'b0000101, 10'b?, 3'b010, 5'b?, {OPCODE_OP} };
-parameter logic [31:0] INSN_CLMULH = {7'b0000101, 10'b?, 3'b011, 5'b?, {OPCODE_OP} };
+parameter logic [31:0] INSN_CLMUL = {7'b0000101, 10'h?, 3'b001, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_CLMULR = {7'b0000101, 10'h?, 3'b010, 5'h?, {OPCODE_OP} };
+parameter logic [31:0] INSN_CLMULH = {7'b0000101, 10'h?, 3'b011, 5'h?, {OPCODE_OP} };
// ZBR
-parameter logic [31:0] INSN_CRC32_B = {7'b0110000, 5'b10000, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CRC32_H = {7'b0110000, 5'b10001, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CRC32_W = {7'b0110000, 5'b10010, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CRC32C_B = {7'b0110000, 5'b11000, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CRC32C_H = {7'b0110000, 5'b11001, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
-parameter logic [31:0] INSN_CRC32C_W = {7'b0110000, 5'b11010, 5'b?, 3'b001, 5'b?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CRC32_B = {7'b0110000, 5'b10000, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CRC32_H = {7'b0110000, 5'b10001, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CRC32_W = {7'b0110000, 5'b10010, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CRC32C_B = {7'b0110000, 5'b11000, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CRC32C_H = {7'b0110000, 5'b11001, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
+parameter logic [31:0] INSN_CRC32C_W = {7'b0110000, 5'b11010, 5'h?, 3'b001, 5'h?, {OPCODE_OP_IMM} };
// LOAD & STORE
-parameter logic [31:0] INSN_LOAD = {25'b?, {OPCODE_LOAD } };
-parameter logic [31:0] INSN_STORE = {25'b?, {OPCODE_STORE} };
+parameter logic [31:0] INSN_LOAD = {25'h?, {OPCODE_LOAD } };
+parameter logic [31:0] INSN_STORE = {25'h?, {OPCODE_STORE} };
// MISC-MEM
-parameter logic [31:0] INSN_FENCE = { 17'b?, 3'b000, 5'b?, {OPCODE_MISC_MEM} };
-parameter logic [31:0] INSN_FENCEI = { 17'b0, 3'b001, 5'b0, {OPCODE_MISC_MEM} };
+parameter logic [31:0] INSN_FENCE = { 17'h?, 3'b000, 5'h?, {OPCODE_MISC_MEM} };
+parameter logic [31:0] INSN_FENCEI = { 17'h0, 3'b001, 5'h0, {OPCODE_MISC_MEM} };
// Compressed Instructions
// C0
-parameter logic [15:0] INSN_CADDI4SPN = { 3'b000, 11'b?, {OPCODE_C0} };
-parameter logic [15:0] INSN_CLW = { 3'b010, 11'b?, {OPCODE_C0} };
-parameter logic [15:0] INSN_CSW = { 3'b110, 11'b?, {OPCODE_C0} };
+parameter logic [15:0] INSN_CADDI4SPN = { 3'b000, 11'h?, {OPCODE_C0} };
+parameter logic [15:0] INSN_CLW = { 3'b010, 11'h?, {OPCODE_C0} };
+parameter logic [15:0] INSN_CSW = { 3'b110, 11'h?, {OPCODE_C0} };
// C1
-parameter logic [15:0] INSN_CADDI = { 3'b000, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CJAL = { 3'b001, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CJ = { 3'b101, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CLI = { 3'b010, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CLUI = { 3'b011, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CBEQZ = { 3'b110, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CBNEZ = { 3'b111, 11'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CSRLI = { 3'b100, 1'b?, 2'b00, 8'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CSRAI = { 3'b100, 1'b?, 2'b01, 8'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CANDI = { 3'b100, 1'b?, 2'b10, 8'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CSUB = { 3'b100, 1'b0, 2'b11, 3'b?, 2'b00, 3'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CXOR = { 3'b100, 1'b0, 2'b11, 3'b?, 2'b01, 3'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_COR = { 3'b100, 1'b0, 2'b11, 3'b?, 2'b10, 3'b?, {OPCODE_C1} };
-parameter logic [15:0] INSN_CAND = { 3'b100, 1'b0, 2'b11, 3'b?, 2'b11, 3'b?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CADDI = { 3'b000, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CJAL = { 3'b001, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CJ = { 3'b101, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CLI = { 3'b010, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CLUI = { 3'b011, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CBEQZ = { 3'b110, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CBNEZ = { 3'b111, 11'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CSRLI = { 3'b100, 1'h?, 2'b00, 8'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CSRAI = { 3'b100, 1'h?, 2'b01, 8'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CANDI = { 3'b100, 1'h?, 2'b10, 8'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CSUB = { 3'b100, 1'b0, 2'b11, 3'h?, 2'b00, 3'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CXOR = { 3'b100, 1'b0, 2'b11, 3'h?, 2'b01, 3'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_COR = { 3'b100, 1'b0, 2'b11, 3'h?, 2'b10, 3'h?, {OPCODE_C1} };
+parameter logic [15:0] INSN_CAND = { 3'b100, 1'b0, 2'b11, 3'h?, 2'b11, 3'h?, {OPCODE_C1} };
// C2
-parameter logic [15:0] INSN_CSLLI = { 3'b000, 11'b?, {OPCODE_C2} };
-parameter logic [15:0] INSN_CLWSP = { 3'b010, 11'b?, {OPCODE_C2} };
-parameter logic [15:0] INSN_SWSP = { 3'b110, 11'b?, {OPCODE_C2} };
-parameter logic [15:0] INSN_CMV = { 3'b100, 1'b0, 10'b?, {OPCODE_C2} };
-parameter logic [15:0] INSN_CADD = { 3'b100, 1'b1, 10'b?, {OPCODE_C2} };
-parameter logic [15:0] INSN_CEBREAK = { 3'b100, 1'b1, 5'b0, 5'b0, {OPCODE_C2} };
-parameter logic [15:0] INSN_CJR = { 3'b100, 1'b0, 5'b?, 5'b0, {OPCODE_C2} };
-parameter logic [15:0] INSN_CJALR = { 3'b100, 1'b1, 5'b?, 5'b0, {OPCODE_C2} };
+parameter logic [15:0] INSN_CSLLI = { 3'b000, 11'h?, {OPCODE_C2} };
+parameter logic [15:0] INSN_CLWSP = { 3'b010, 11'h?, {OPCODE_C2} };
+parameter logic [15:0] INSN_SWSP = { 3'b110, 11'h?, {OPCODE_C2} };
+parameter logic [15:0] INSN_CMV = { 3'b100, 1'b0, 10'h?, {OPCODE_C2} };
+parameter logic [15:0] INSN_CADD = { 3'b100, 1'b1, 10'h?, {OPCODE_C2} };
+parameter logic [15:0] INSN_CEBREAK = { 3'b100, 1'b1, 5'h0, 5'h0, {OPCODE_C2} };
+parameter logic [15:0] INSN_CJR = { 3'b100, 1'b0, 5'h?, 5'h0, {OPCODE_C2} };
+parameter logic [15:0] INSN_CJALR = { 3'b100, 1'b1, 5'h?, 5'h0, {OPCODE_C2} };
endpackage