[otbn] Add OTBN encoding information to insns.yml

This patch defines a way to write "encoding schemes" in the YAML. This
is slightly inspired by the LLVM approach [1] but massively simplified.

I initially had something even simpler, without allowing any hierarchy
in the encoding schemes, but it was quite ugly to fill in the
instruction encodings and I was worried that it would make it very
difficult to change the encodings if we needed to later.

Instructions now have encodings defined, and we have code in the Python
parser that resolves the (hierarchical) scheme used by an instruction
into a flat one, then matches up the named operands with the fields in
the encoding scheme. We finally check that the encoding is not
ambiguous: that no bit pattern matches more than one instruction.

Once this is all done, the documentation generator now spits out an
encoding table next to each instruction.

Final encoding work by Stefan.

[1] See e.g. llvm/lib/Target/RISCV/RISCVInstrFormatsC.td in the LLVM
    source tree.

Signed-off-by: Rupert Swarbrick <rswarbrick@lowrisc.org>
Co-authored-by: Stefan Wallentowitz <stefan.wallentowitz@gi-de.com>
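
The ambiguity check mentioned in the message can be sketched as follows. This is a minimal illustration and not the code added by this patch: it assumes each flattened encoding has been reduced to a hypothetical `(mask, match)` pair over a 32-bit word, and the names `Encoding`, `encodings_overlap` and `find_ambiguities` are invented for the example. Two encodings can share a bit pattern exactly when they agree on every bit that both of them fix.

```python
from typing import Dict, List, Tuple

# Hypothetical representation (not taken from insn_yaml.py): a set bit in
# mask means the encoding fixes that bit, and match holds the fixed values.
# Operand bits and don't-care ('x') bits are zero in mask.
Encoding = Tuple[int, int]


def encodings_overlap(a: Encoding, b: Encoding) -> bool:
    '''Return True if some 32-bit word matches both encodings.'''
    mask_a, match_a = a
    mask_b, match_b = b
    # The encodings are distinguishable only if there is a bit that both
    # fix, and that they fix to different values.
    return ((match_a ^ match_b) & mask_a & mask_b) == 0


def find_ambiguities(insns: Dict[str, Encoding]) -> List[Tuple[str, str]]:
    '''Return every pair of mnemonics whose encodings overlap.'''
    names = sorted(insns)
    return [(x, y)
            for i, x in enumerate(names)
            for y in names[i + 1:]
            if encodings_overlap(insns[x], insns[y])]


# Masks and match values for the R and I schemes with the fixed fields
# given in insns.yml (funct7, funct3, opcode and the 2'b11 suffix).
insns = {
    'add': (0xfe00707f, 0x00000033),
    'sub': (0xfe00707f, 0x40000033),
    'addi': (0x0000707f, 0x00000013),
}
print(find_ambiguities(insns))
```

A pairwise check like this is quadratic in the number of instructions, which is unlikely to matter for an instruction set of this size.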

commit: 56a2ac54ccaa5b679f10c555925558114e05ca78
author: Rupert Swarbrick <rswarbrick@lowrisc.org> Tue Jul 14 09:57:04 2020 +0100
committer: Rupert Swarbrick <rswarbrick@gmail.com> Thu Jul 16 16:05:30 2020 +0100
tree: 6262730f3fcf6e79990b417518db4531bd369800
parent: 6302a55d979f296314309561dc490c56b961f529
diff --git a/hw/ip/otbn/data/insns.yml b/hw/ip/otbn/data/insns.yml
index fd5c658..7c6b50a 100644
--- a/hw/ip/otbn/data/insns.yml
+++ b/hw/ip/otbn/data/insns.yml

@@ -29,6 +29,353 @@
     doc: |
       All Big Number (BN) instructions operate on the wide register file WREG.
 
+# Instruction encoding schemes
+#
+# These define the mapping between instruction operands and bits in the
+# encoding. A scheme defines zero or more named fields. It can also inherit from
+# zero or more other schemes.
+#
+# The direct fields of a scheme are defined as a dictionary, mapping a field
+# name (which will be matched up with instruction operands) to a value. In
+# general, this value is itself a dictionary with the following keys:
+#
+#  bits: A list of ranges of bits. A range is written <msb>-<lsb>, where both
+#        are integers (and msb >= lsb). Multiple ranges can be separated by
+#        commas. A degenerate range (with msb == lsb) can be written as a bare
+#        integer. Required.
+#
+#  value: Optional. If specified, this should be a binary string for a fixed
+#         value for this field, prefixed with a "b" (to avoid the YAML parser
+#         reading it as a decimal number). Underscores in the string are
+#         ignored (to make it easier to show grouping) and 'x' means don't
+#         care.
+#
+#  shift: Optional. If specified, this is the number of bits to shift the
+#         encoded value left to get the logical value.
+#
+# For brevity, if value and shift have their default values, the bits string
+# can be used as the value for the field.
+#
+# A scheme can inherit from other schemes by listing their names in a 'parents'
+# attribute. If the child scheme needs to set the value of a parent's field to
+# something fixed, it can do so with the following syntax:
+#
+#     parent_name(field_name=b11101, field_name2=b111)
+#
+# The fields of a scheme are recursively defined to be its direct fields plus
+# the fields of all its ancestors.
+#
+# A scheme is called complete if its fields cover the entire range of bits
+# (0-31) and partial otherwise.
+
+encoding-schemes:
+  # A partial scheme that sets the bottom two bits to 2'b11 (as for all RISC-V
+  # uncompressed instructions) and defines an 'opcode' field for bits 6-2
+  # (standard for RV32I instructions)
+  rv:
+    fields:
+      opcode: 6-2
+      uncomp:
+        bits: 1-0
+        value: b11
+
+  # A partial scheme defining a funct3 field in bits 14-12 (used in most RV32I
+  # instructions, and most BN.* custom instructions)
+  funct3:
+    fields:
+      funct3: 14-12
+
+  # RISC-V "R-type" encoding (reg <- fun(reg, reg))
+  R:
+    parents:
+      - rv
+      - funct3
+    fields:
+      funct7: 31-25
+      rs2: 24-20
+      rs1: 19-15
+      rd: 11-7
+
+  # RISC-V "I-type" encoding (reg <- fun(imm, reg))
+  I:
+    parents:
+      - rv
+      - funct3
+    fields:
+      imm: 31-20
+      rs1: 19-15
+      rd: 11-7
+
+  # RISC-V "S-type" encoding (_ <- fun(reg, imm))
+  S:
+    parents:
+      - rv
+      - funct3
+    fields:
+      imm: 31-25,11-7
+      rs2: 24-20
+      rs1: 19-15
+
+  # RISC-V "B-type" encoding (like S, but different immediate layout; used for
+  # branches)
+  B:
+    parents:
+      - rv
+      - funct3
+    fields:
+      imm:
+        bits: 31,7,30-25,11-8
+        shift: 1
+      rs2: 24-20
+      rs1: 19-15
+
+  # RISC-V "U-type" encoding (reg <- fun(imm))
+  U:
+    parents:
+      - rv
+    fields:
+      imm:
+        bits: 31-12
+        shift: 12
+      rd: 11-7
+
+  # RISC-V "J-type" encoding (like U, but different immediate layout; used for
+  # jumps)
+  J:
+    parents:
+      - rv
+    fields:
+      imm:
+        bits: 31,19-12,20,30-21
+        shift: 1
+      rd: 11-7
+
+  # A partial scheme for custom instructions with opcode b00010
+  custom0:
+    parents:
+      - rv(opcode=b00010)
+
+  # A partial scheme for custom instructions with opcode b01010
+  custom1:
+    parents:
+      - rv(opcode=b01010)
+
+  # A partial scheme for custom instructions with opcode b01110
+  custom2:
+    parents:
+      - rv(opcode=b01110)
+
+  # A partial scheme for custom instructions with opcode b11110
+  custom3:
+    parents:
+      - rv(opcode=b11110)
+
+  # A partial scheme for instructions that produce a dest WDR.
+  wrd:
+    fields:
+      wrd: 11-7
+
+  # A partial scheme for instructions that take two source WDRs and produce a
+  # dest WDR.
+  wdr3:
+    parents:
+      - wrd
+    fields:
+      wrs2: 24-20
+      wrs1: 19-15
+
+  # A partial scheme that defines the 'fg' field (for <flag_group> operands)
+  fg:
+    fields:
+      fg: 31
+
+  # A partial scheme that defines the shift fields (type and bytes)
+  shift:
+    fields:
+      shift_type: 30
+      shift_bytes: 29-25
+
+  # A partial scheme that defines a function field at bit 31 for OTBN logical
+  # operations
+  funct31:
+    fields:
+      funct31: 31
+
+  # A partial scheme for a specialized 2-bit function field. We need a reduced
+  # size (the lower two bits of funct3) because RSHI spills 1 bit of its
+  # immediate into bit 14.
+  funct2:
+    fields:
+      funct2: 13-12
+
+  # A specialised encoding for the loop instruction (only one source, no
+  # destination)
+  loop:
+    parents:
+      - custom3
+      - funct2(funct2=b00)
+    fields:
+      bodysize: 31-20
+      grs: 19-15
+      fixed:
+        bits: 14,11-7
+        value: bxxxxxx
+
+  # A specialised encoding for the loopi instruction (which, unusually, has 2
+  # immediates)
+  loopi:
+    parents:
+      - custom3
+      - funct2(funct2=b01)
+    fields:
+      bodysize: 31-20
+      iterations: 19-15,11-7
+      fixed:
+        bits: 14
+        value: bx
+
+  # Used for wide logical operations (bn.and, bn.or, bn.xor).
+  bna:
+    parents:
+      - custom1
+      - wdr3
+      - funct3
+      - shift
+      - funct31
+
+  # Used for bn.not (no second source reg).
+  bnan:
+    parents:
+      - custom1
+      - shift
+      - funct31
+      - wrd
+    fields:
+      wrs1: 24-20
+      fixed:
+        bits: 19-15
+        value: bxxxxx
+
+  # Used for the wide reg/reg ALU instructions.
+  bnaf:
+    parents:
+      - custom1
+      - wdr3
+      - funct3
+      - shift
+      - fg
+
+  # Used for the wide bn.addi and bn.subi instructions.
+  bnai:
+    parents:
+      - custom1
+      - wrd
+      - funct3
+      - fg
+    fields:
+      sub: 30
+      imm: 29-20
+      wrs: 19-15
+
+  # Used for bn.addm, bn.subm
+  bnam:
+    parents:
+      - custom1
+      - wdr3
+      - funct3
+    fields:
+      sub: 30
+      fixed:
+        bits: 31,29-25
+        value: bxxxxxx
+
+  # Used for bn.mulqacc
+  bnaq:
+    parents:
+      - custom2
+      - wdr3
+    fields:
+      wb: 31-30
+      dh: 29
+      qs2: 28-27
+      qs1: 26-25
+      acc: 14-13
+      z: 12
+
+  # Unusual scheme used for bn.rshi (the immediate bleeds into the usual funct3
+  # field)
+  bnr:
+    parents:
+      - custom3
+      - wdr3
+    fields:
+      imm: 31-25,14
+      funct2: 13-12
+
+  # Used by bn.sel.
+  bns:
+    parents:
+      - custom0
+      - wdr3
+      - funct3(funct3=b000)
+      - fg
+    fields:
+      fixed:
+        bits: 30-27
+        value: bxxxx
+      flag: 26-25
+
+  # Used by bn.cmp and bn.cmpb
+  bnc:
+    parents:
+      - custom0
+      - wdr3(wrd=bxxxxx)
+      - funct3
+      - shift
+      - fg
+
+  # Used by bn.lid and bn.sid
+  bnxid:
+    parents:
+      - custom0
+      - funct3
+    fields:
+      imm:
+        bits: 24-22,31-25
+        shift: 4
+      spp: 21
+      dpp: 20
+      rs: 19-15
+      rd: 11-7
+
+  # Used by bn.mov and bn.movr
+  bnmov:
+    parents:
+      - custom0
+      - funct3(funct3=b110)
+    fields:
+      indirect: 31
+      fixed_top:
+        bits: 30-22
+        value: bxxxxxxxxx
+      spp: 21
+      dpp: 20
+      src: 19-15
+      dst: 11-7
+
+  # Used by bn.wsrrs and bn.wsrrw
+  wcsr:
+    parents:
+      - custom0
+      - funct3(funct3=b111)
+    fields:
+      write: 31
+      wcsr: 27-20
+      wrs: 19-15
+      wrd: 11-7
+      fixed:
+        bits: 30-28
+        value: bxxx
 
 # The instructions. Instructions are listed in the given order within
 # each instruction group. There are the following fields:
@@ -123,73 +470,260 @@
     rv32i: true
     synopsis: Add
     operands: [grd, grs1, grs2]
+    encoding:
+      scheme: R
+      mapping:
+        funct7: b0000000
+        rs2: grs2
+        rs1: grs1
+        funct3: b000
+        rd: grd
+        opcode: b01100
 
   - mnemonic: addi
     rv32i: true
     synopsis: Add Immediate
     operands: [grd, grs1, imm]
+    encoding:
+      scheme: I
+      mapping:
+        imm: imm
+        rs1: grs1
+        funct3: b000
+        rd: grd
+        opcode: b00100
 
   - mnemonic: lui
     rv32i: true
     synopsis: Load Upper Immediate
     operands: [grd, imm]
+    encoding:
+      scheme: U
+      mapping:
+        imm: imm
+        rd: grd
+        opcode: b01101
 
   - mnemonic: sub
     rv32i: true
     synopsis: Subtract
     operands: [grd, grs1, grs2]
+    encoding:
+      scheme: R
+      mapping:
+        funct7: b0100000
+        rs2: grs2
+        rs1: grs1
+        funct3: b000
+        rd: grd
+        opcode: b01100
 
   - mnemonic: and
     rv32i: true
     synopsis: Bitwise AND
     operands: [grd, grs1, grs2]
+    encoding:
+      scheme: R
+      mapping:
+        funct7: b0000000
+        rs2: grs2
+        rs1: grs1
+        funct3: b111
+        rd: grd
+        opcode: b01100
 
   - mnemonic: andi
     rv32i: true
     synopsis: Bitwise AND with Immediate
     operands: [grd, grs1, imm]
+    encoding:
+      scheme: I
+      mapping:
+        imm: imm
+        rs1: grs1
+        funct3: b111
+        rd: grd
+        opcode: b00100
 
   - mnemonic: or
     rv32i: true
     synopsis: Bitwise OR
     operands: [grd, grs1, grs2]
+    encoding:
+      scheme: R
+      mapping:
+        funct7: b0000000
+        rs2: grs2
+        rs1: grs1
+        funct3: b110
+        rd: grd
+        opcode: b01100
 
   - mnemonic: ori
     rv32i: true
     synopsis: Bitwise OR with Immediate
     operands: [grd, grs1, imm]
+    encoding:
+      scheme: I
+      mapping:
+        imm: imm
+        rs1: grs1
+        funct3: b110
+        rd: grd
+        opcode: b00100
 
   - mnemonic: xor
     rv32i: true
     synopsis: Bitwise XOR
     operands: [grd, grs1, grs2]
+    encoding:
+      scheme: R
+      mapping:
+        funct7: b0000000
+        rs2: grs2
+        rs1: grs1
+        funct3: b100
+        rd: grd
+        opcode: b01100
 
   - mnemonic: xori
     rv32i: true
     synopsis: Bitwise XOR with Immediate
-    operands: [grd, grs1, grs2]
+    operands: [grd, grs, imm]
+    encoding:
+      scheme: I
+      mapping:
+        imm: imm
+        rs1: grs
+        funct3: b100
+        rd: grd
+        opcode: b00100
 
   - mnemonic: lw
     rv32i: true
     synopsis: Load Word
     operands: [grd, offset, grs1]
     syntax: <grd>, <offset>(<grs1>)
+    encoding:
+      scheme: I
+      mapping:
+        imm: offset
+        rs1: grs1
+        funct3: b010
+        rd: grd
+        opcode: b00000
 
   - mnemonic: sw
     rv32i: true
     synopsis: Store Word
     operands: [grs2, offset, grs1]
     syntax: <grs2>, <offset>(<grs1>)
+    encoding:
+      scheme: S
+      mapping:
+        imm: offset
+        rs2: grs2
+        rs1: grs1
+        funct3: b010
+        opcode: b01000
 
   - mnemonic: beq
     rv32i: true
     synopsis: Branch Equal
     operands: [grs1, grs2, offset]
+    encoding:
+      scheme: B
+      mapping:
+        imm: offset
+        rs2: grs2
+        rs1: grs1
+        funct3: b000
+        opcode: b11000
 
   - mnemonic: bne
     rv32i: true
     synopsis: Branch Not Equal
     operands: [grs1, grs2, offset]
+    encoding:
+      scheme: B
+      mapping:
+        imm: offset
+        rs2: grs2
+        rs1: grs1
+        funct3: b001
+        opcode: b11000
+
+  - mnemonic: jal
+    rv32i: true
+    synopsis: Jump And Link
+    operands: [grd, offset]
+    trailing-doc: |
+      Unlike in RV32I, the `x1` (return address) GPR is hard-wired to the call
+      stack. To call a subroutine use `jal x1, <offset>`.
+    encoding:
+      scheme: J
+      mapping:
+        imm: offset
+        rd: grd
+        opcode: b11011
+
+  - mnemonic: jalr
+    rv32i: true
+    synopsis: Jump And Link Register
+    operands: [grd, grs1, offset]
+    trailing-doc: |
+      Unlike in RV32I, the `x1` (return address) GPR is hard-wired to the call
+      stack. To return from a subroutine, use `jalr x0, x1, 0`.
+    encoding:
+      scheme: I
+      mapping:
+        imm: offset
+        rs1: grs1
+        funct3: b000
+        rd: grd
+        opcode: b11001
+
+  - mnemonic: csrrs
+    rv32i: true
+    synopsis: Atomic Read and Set bits in CSR
+    operands: [grd, csr, grs]
+    encoding:
+      scheme: I
+      mapping:
+        imm: csr
+        rs1: grs
+        funct3: b010
+        rd: grd
+        opcode: b11100
+
+  - mnemonic: csrrw
+    rv32i: true
+    synopsis: Atomic Read/Write CSR
+    operands: [grd, csr, grs]
+    encoding:
+      scheme: I
+      mapping:
+        imm: csr
+        rs1: grs
+        funct3: b001
+        rd: grd
+        opcode: b11100
+
+  - mnemonic: ecall
+    rv32i: true
+    synopsis: Environment Call
+    operands: []
+    doc: |
+      Triggers the `done` interrupt to indicate the completion of the
+      operation.
+    encoding:
+      scheme: I
+      mapping:
+        imm: b000000000000
+        rs1: b00000
+        funct3: b000
+        rd: b00000
+        opcode: b11100
 
   - mnemonic: loop
     synopsis: Loop (indirect)
@@ -217,6 +751,11 @@
         # loop body
       )
       ```
+    encoding:
+      scheme: loop
+      mapping:
+        bodysize: bodysize
+        grs: grs
 
   - mnemonic: loopi
     synopsis: Loop Immediate
@@ -239,40 +778,11 @@
         # loop body
       )
       ```
-
-  - mnemonic: jal
-    rv32i: true
-    synopsis: Jump And Link
-    operands: [grd, offset]
-    trailing-doc: |
-      Unlike in RV32I, the `x1` (return address) GPR is hard-wired to the call
-      stack. To call a subroutine use `jal x1, <offset>`.
-
-  - mnemonic: jalr
-    rv32i: true
-    synopsis: Jump And Link Register
-    operands: [grd, grs1, offset]
-    trailing-doc: |
-      Unlike in RV32I, the `x1` (return address) GPR is hard-wired to the call
-      stack. To return from a subroutine, use `jalr x0, x1, 0`.
-
-  - mnemonic: csrrs
-    rv32i: true
-    synopsis: Atomic Read and Set bits in CSR
-    operands: [grd, csr, grs]
-
-  - mnemonic: csrrw
-    rv32i: true
-    synopsis: Atomic Read/Write CSR
-    operands: [grd, csr, grs]
-
-  - mnemonic: ecall
-    rv32i: true
-    synopsis: Environment Call
-    operands: []
-    doc: |
-      Triggers the `done` interrupt to indicate the completion of the
-      operation.
+    encoding:
+      scheme: loopi
+      mapping:
+        bodysize: bodysize
+        iterations: iterations
 
   - mnemonic: bn.add
     group: bignum
@@ -318,6 +828,16 @@
 
       WDR[d] = result
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnaf
+      mapping:
+        fg: flag_group
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b000
+        wrd: wrd
 
   - mnemonic: bn.addc
     group: bignum
@@ -343,6 +863,16 @@
 
       WDR[d] = result
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnaf
+      mapping:
+        fg: flag_group
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b010
+        wrd: wrd
 
   - mnemonic: bn.addi
     group: bignum
@@ -371,6 +901,15 @@
 
       WDR[d] = result
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnai
+      mapping:
+        fg: flag_group
+        sub: b0
+        imm: imm
+        wrs: wrs
+        funct3: b100
+        wrd: wrd
 
   - mnemonic: bn.addm
     group: bignum
@@ -393,6 +932,14 @@
         result = result - MOD
 
       WDR[d] = result
+    encoding:
+      scheme: bnam
+      mapping:
+        sub: b0
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b101
+        wrd: wrd
 
   - mnemonic: bn.mulqacc
     group: bignum
@@ -472,6 +1019,18 @@
 
       elif writeback_variant == 'writeout':
         WDR[d] = ACC
+    encoding:
+      scheme: bnaq
+      mapping:
+        wb: b00
+        dh: bx
+        qs2: wrs2_qwsel
+        qs1: wrs1_qwsel
+        wrs2: wrs2
+        wrs1: wrs1
+        acc: acc_shift_imm
+        z: zero_acc
+        wrd: bxxxxx
 
   - mnemonic: bn.mulqacc.wo
     group: bignum
@@ -503,6 +1062,18 @@
       a_qwsel = DecodeQuarterWordSelect(wrs1_qwsel)
       b_qwsel = DecodeQuarterWordSelect(wrs2_qwsel)
     operation: *mulqacc-operation
+    encoding:
+      scheme: bnaq
+      mapping:
+        wb: b01
+        dh: bx
+        qs2: wrs2_qwsel
+        qs1: wrs1_qwsel
+        wrs2: wrs2
+        wrs1: wrs1
+        acc: acc_shift_imm
+        z: zero_acc
+        wrd: wrd
 
   - mnemonic: bn.mulqacc.so
     group: bignum
@@ -540,6 +1111,18 @@
       a_qwsel = DecodeQuarterWordSelect(wrs1_qwsel)
       b_qwsel = DecodeQuarterWordSelect(wrs2_qwsel)
     operation: *mulqacc-operation
+    encoding:
+      scheme: bnaq
+      mapping:
+        wb: b1x
+        dh: wrd_hwsel
+        qs2: wrs2_qwsel
+        qs1: wrs1_qwsel
+        wrs2: wrs2
+        wrs1: wrs1
+        acc: acc_shift_imm
+        z: zero_acc
+        wrd: wrd
 
   - mnemonic: bn.sub
     group: bignum
@@ -573,6 +1156,16 @@
 
       WDR[d] = result
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnaf
+      mapping:
+        fg: flag_group
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b001
+        wrd: wrd
 
   - mnemonic: bn.subb
     group: bignum
@@ -589,6 +1182,16 @@
 
       WDR[d] = result
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnaf
+      mapping:
+        fg: flag_group
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b011
+        wrd: wrd
 
   - mnemonic: bn.subi
     group: bignum
@@ -615,6 +1218,15 @@
 
       WDR[d] = result
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnai
+      mapping:
+        fg: flag_group
+        sub: b1
+        imm: imm
+        wrs: wrs
+        funct3: b100
+        wrd: wrd
 
   - mnemonic: bn.subm
     group: bignum
@@ -636,6 +1248,14 @@
         result = result - MOD
 
       WDR[d] = result
+    encoding:
+      scheme: bnam
+      mapping:
+        sub: b1
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b101
+        wrd: wrd
 
   - mnemonic: bn.and
     group: bignum
@@ -667,6 +1287,16 @@
       result = a & b_shifted
 
       WDR[d] = result
+    encoding:
+      scheme: bna
+      mapping:
+        funct31: b0
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b110
+        wrd: wrd
 
   - mnemonic: bn.or
     group: bignum
@@ -683,6 +1313,16 @@
       result = a | b_shifted
 
       WDR[d] = result
+    encoding:
+      scheme: bna
+      mapping:
+        funct31: b1
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b110
+        wrd: wrd
 
   - mnemonic: bn.not
     group: bignum
@@ -710,6 +1350,16 @@
       result = ~a_shifted
 
       WDR[d] = result
+    encoding:
+      scheme: bna
+      mapping:
+        funct31: b0
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs
+        wrs1: bxxxxx
+        funct3: b111
+        wrd: wrd
 
   - mnemonic: bn.xor
     group: bignum
@@ -726,6 +1376,16 @@
       result = a ^ b_shifted
 
       WDR[d] = result
+    encoding:
+      scheme: bnaf
+      mapping:
+        fg: b1
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b111
+        wrd: wrd
 
   - mnemonic: bn.rshi
     group: bignum
@@ -752,6 +1412,14 @@
       imm = Uint(imm)
     operation: |
       WDR[d] = ((rs1 | rs2) >> im)[WLEN-1:0]
+    encoding:
+      scheme: bnr
+      mapping:
+        imm: imm
+        wrs2: wrs2
+        wrs1: wrs1
+        funct2: b11
+        wrd: wrd
 
   - mnemonic: bn.sel
     group: bignum
@@ -786,6 +1454,14 @@
       flag_is_set = FLAGS[fg].get(flag)
 
       WDR[d] = wrs1 if flag_is_set else wrs2
+    encoding:
+      scheme: bns
+      mapping:
+        fg: flag_group
+        flag: flag
+        wrs2: wrs2
+        wrs1: wrs1
+        wrd: wrd
 
   - mnemonic: bn.cmp
     group: bignum
@@ -815,6 +1491,15 @@
       (, flags_out) = AddWithCarry(a, -b_shifted, "0")
 
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnc
+      mapping:
+        fg: flag_group
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b001
 
   - mnemonic: bn.cmpb
     group: bignum
@@ -829,6 +1514,15 @@
       (, flags_out) = AddWithCarry(a, -b, ~FLAGS[flag_group].C)
 
       FLAGS[flag_group] = flags_out
+    encoding:
+      scheme: bnc
+      mapping:
+        fg: flag_group
+        shift_type: shift_type
+        shift_bytes: shift_bytes
+        wrs2: wrs2
+        wrs1: wrs1
+        funct3: b011
 
   - mnemonic: bn.lid
     group: bignum
@@ -882,6 +1576,15 @@
           GPR[rs1] = GPR[rs1] + (WLEN / 8)
       if grd_inc:
           GPR[rd] = GPR[rd] + 1
+    encoding:
+      scheme: bnxid
+      mapping:
+        imm: offset
+        spp: grs1_inc
+        dpp: grd_inc
+        rs: grs1
+        funct3: b100
+        rd: grd
 
   - mnemonic: bn.sid
     group: bignum
@@ -933,6 +1636,15 @@
           GPR[rs1] = GPR[rs1] + (WLEN / 8)
       if grs2_inc:
           GPR[rs2] = GPR[rs2] + 1
+    encoding:
+      scheme: bnxid
+      mapping:
+        imm: offset
+        spp: grs1_inc
+        dpp: grs2_inc
+        rs: grs1
+        funct3: b101
+        rd: grs2
 
   - mnemonic: bn.mov
     group: bignum
@@ -942,6 +1654,14 @@
       s = UInt(wrs)
       d = UInt(wrd)
     operation: WDR[d] = WDR[s]
+    encoding:
+      scheme: bnmov
+      mapping:
+        indirect: b0
+        spp: bx
+        dpp: bx
+        src: wrs
+        dst: wrd
 
   - mnemonic: bn.movr
     group: bignum
@@ -976,13 +1696,35 @@
         GPR[s] = GPR[s] + 1
       if grd_inc:
         GPR[d] = GPR[d] + 1
+    encoding:
+      scheme: bnmov
+      mapping:
+        indirect: b1
+        spp: grs_inc
+        dpp: grd_inc
+        src: grs
+        dst: grd
 
   - mnemonic: bn.wsrrs
     group: bignum
     synopsis: Atomic Read and Set Bits in WSR
     operands: [wrd, wsr, wrs]
+    encoding:
+      scheme: wcsr
+      mapping:
+        write: b0
+        wcsr: wsr
+        wrs: wrs
+        wrd: wrd
 
   - mnemonic: bn.wsrrw
     group: bignum
     synopsis: Atomic Read/Write WSR
     operands: [wrd, wsr, wrs]
+    encoding:
+      scheme: wcsr
+      mapping:
+        write: b1
+        wcsr: wsr
+        wrs: wrs
+        wrd: wrd
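
The `bits` and `shift` keys documented in the encoding-schemes comment of insns.yml can be illustrated with a short sketch. It is not part of this patch, and the helper names `parse_bits` and `encode_field` are hypothetical. It assumes that the first range listed in `bits` holds the most significant bits of the encoded value, which is consistent with the RISC-V B-type immediate layout used by the `B` scheme.

```python
from typing import List, Tuple


def parse_bits(bits: str) -> List[Tuple[int, int]]:
    '''Parse a string like "31,7,30-25,11-8" into (msb, lsb) pairs.'''
    ranges = []
    for part in bits.split(','):
        msb, _, lsb = part.strip().partition('-')
        # A bare integer is a degenerate range with msb == lsb.
        ranges.append((int(msb), int(lsb) if lsb else int(msb)))
    return ranges


def encode_field(bits: str, logical: int, shift: int = 0) -> int:
    '''Place a logical operand value into its bits of a 32-bit word.'''
    ranges = parse_bits(bits)
    width = sum(msb - lsb + 1 for msb, lsb in ranges)

    # The encoded value is shifted left by `shift` to give the logical
    # value, so the low `shift` bits of the logical value must be zero.
    if logical & ((1 << shift) - 1):
        raise ValueError('logical value is not a multiple of 2**shift')

    # Truncate to the field width (two's complement for negative values).
    encoded = (logical >> shift) & ((1 << width) - 1)

    word = 0
    remaining = width
    for msb, lsb in ranges:
        rng_width = msb - lsb + 1
        remaining -= rng_width
        # Assumption: earlier ranges take the higher bits of the value.
        chunk = (encoded >> remaining) & ((1 << rng_width) - 1)
        word |= chunk << lsb
    return word


# A branch offset of 8 bytes in the B scheme (bits 31,7,30-25,11-8; shift 1).
print(hex(encode_field('31,7,30-25,11-8', 8, shift=1)))
```

ORing the result with the fixed fields of `beq` (funct3 `b000`, opcode `b11000`, and the `b11` suffix) and zero register fields gives the word for `beq x0, x0, 8`.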

diff --git a/hw/ip/otbn/util/insn_yaml.py b/hw/ip/otbn/util/insn_yaml.py
index 54eed7a..6b90cf0 100644
--- a/hw/ip/otbn/util/insn_yaml.py
+++ b/hw/ip/otbn/util/insn_yaml.py

@@ -4,9 +4,10 @@
 
 '''Support code for reading the instruction database in insns.yml'''
 
+import itertools
 import re
 from typing import (Callable, Dict, List, Optional,
-                    Sequence, Set, Tuple, TypeVar)
+                    Sequence, Set, Tuple, TypeVar, Union)
 
 import yaml
 
@@ -142,10 +143,455 @@
         return self.groups[0].key
 
 
+class BitRanges:
+    '''Represents the bit ranges used for a field in an encoding scheme'''
+    def __init__(self, as_string: str, what: str) -> None:
+        #   ranges ::= range
+        #            | range ',' ranges
+        #
+        #   range ::= num
+        #           | num '-' num
+        #
+        # Ranges are assumed to be msb-lsb (with msb >= lsb). Bit indices are
+        # at most 31 and ranges are disjoint.
+
+        if not as_string:
+            raise ValueError('Empty string as bits for {}'.format(what))
+
+        overlaps = 0
+
+        self.mask = 0
+        self.ranges = []
+        self.width = 0
+
+        for rng in as_string.split(','):
+            match = re.match(r'([0-9]+)(?:-([0-9]+))?$', rng)
+            if match is None:
+                raise ValueError('Range {!r} in bits for {} is malformed.'
+                                 .format(rng, what))
+
+            msb = int(match.group(1))
+            maybe_lsb = match.group(2)
+            lsb = msb if maybe_lsb is None else int(maybe_lsb)
+
+            if msb < lsb:
+                raise ValueError('Range {!r} in bits for {} has msb < lsb.'
+                                 .format(rng, what))
+
+            if msb >= 32:
+                raise ValueError('Range {!r} in bits for {} has msb >= 32.'
+                                 .format(rng, what))
+
+            rng_mask = (1 << (msb + 1)) - (1 << lsb)
+            overlaps |= rng_mask & self.mask
+            self.mask |= rng_mask
+
+            self.ranges.append((msb, lsb))
+            self.width += msb - lsb + 1
+
+        if overlaps:
+            raise ValueError('Bits for {} have overlapping ranges '
+                             '(mask: {:#010x})'
+                             .format(what, overlaps))
+
+
+class BoolLiteral:
+    '''Represents a boolean literal, with possible 'x' characters'''
+    def __init__(self, as_string: str, what: str) -> None:
+        # We represent this as 2 masks: "ones" and "x". The ones mask is the
+        # bits that are marked 1. The x mask is the bits that are marked x.
+        # Then you can test whether a particular value matches the literal by
+        # zeroing all bits in the x mask and then comparing with the ones mask.
+        self.ones = 0
+        self.xs = 0
+        self.width = 0
+
+        # The literal should always start with a 'b'
+        if not as_string.startswith('b'):
+            raise ValueError("Boolean literal for {} doesn't start with a 'b'."
+                             .format(what))
+
+        for char in as_string[1:]:
+            if char == '_':
+                continue
+
+            self.ones <<= 1
+            self.xs <<= 1
+            self.width += 1
+
+            if char == '0':
+                continue
+            elif char == '1':
+                self.ones |= 1
+            elif char == 'x':
+                self.xs |= 1
+            else:
+                raise ValueError('Boolean literal for {} has '
+                                 'unsupported character: {!r}.'
+                                 .format(what, char))
+
+        if not self.width:
+            raise ValueError('Empty boolean literal for {}.'.format(what))
+
+    def char_for_bit(self, bit: int) -> str:
+        '''Return 0, 1 or x for the bit at the given position'''
+        assert bit < self.width
+        if (self.ones >> bit) & 1:
+            return '1'
+        if (self.xs >> bit) & 1:
+            return 'x'
+        return '0'
+
+
+class EncSchemeField:
+    '''Represents a single field in an encoding scheme'''
+    def __init__(self,
+                 bits: BitRanges,
+                 value: Optional[BoolLiteral],
+                 shift: int) -> None:
+        self.bits = bits
+        self.value = value
+        self.shift = shift
+
+    @staticmethod
+    def from_yaml(yml: object, what: str) -> 'EncSchemeField':
+        # This is either represented as a dict in the YAML or as a bare string.
+        bits_what = 'bits for {}'.format(what)
+        value_what = 'value for {}'.format(what)
+        shift_what = 'shift for {}'.format(what)
+
+        shift = 0
+
+        if isinstance(yml, dict):
+            yd = check_keys(yml, what, ['bits'], ['value', 'shift'])
+
+            bits_yml = yd['bits']
+            if not (isinstance(bits_yml, str) or isinstance(bits_yml, int)):
+                raise ValueError('{} is of type {}, not a string or int.'
+                                 .format(bits_what, type(bits_yml).__name__))
+
+            # We require value to be given as a string because it's supposed to
+            # be in base 2, and PyYAML will parse 111 as one-hundred and
+            # eleven, 011 as 9 and 0x11 as 17. Aargh!
+            raw_value = None
+            val_yml = yd.get('value')
+            if val_yml is not None:
+                if not isinstance(val_yml, str):
+                    raise ValueError("{} is of type {}, but must be a string "
+                                     "(we don't allow automatic conversion "
+                                     "because YAML's int conversion assumes "
+                                     "base 10 and value should be in base 2)."
+                                     .format(value_what,
+                                             type(val_yml).__name__))
+                raw_value = val_yml
+
+            # shift, on the other hand, is written in base 10. Allow an
+            # integer.
+            shift_yml = yd.get('shift')
+            if shift_yml is None:
+                pass
+            elif isinstance(shift_yml, str):
+                if not re.match(r'[0-9]+$', shift_yml):
+                    raise ValueError('{} is {!r} but should be a '
+                                     'non-negative integer.'
+                                     .format(shift_what, shift_yml))
+                shift = int(shift_yml)
+            elif isinstance(shift_yml, int):
+                if shift_yml < 0:
+                    raise ValueError('{} is {!r} but should be a '
+                                     'non-negative integer.'
+                                     .format(shift_what, shift_yml))
+                shift = shift_yml
+            else:
+                raise ValueError("{} is of type {}, but must be a string "
+                                 "or non-negative integer."
+                                 .format(shift_what, type(shift_yml).__name__))
+        elif isinstance(yml, str) or isinstance(yml, int):
+            bits_yml = yml
+            raw_value = None
+        else:
+            raise ValueError('{} is a {}, but should be a '
+                             'dict, string or integer.'
+                             .format(what, type(yml).__name__))
+
+        # The bits field is usually parsed as a string ("10-4", or similar).
+        # But if it's a bare integer then YAML will parse it as an int. That's
+        # fine, but we turn it back into a string to be re-parsed by BitRanges.
+        assert isinstance(bits_yml, str) or isinstance(bits_yml, int)
+
+        bits = BitRanges(str(bits_yml), bits_what)
+        value = None
+        if raw_value is not None:
+            value = BoolLiteral(raw_value, value_what)
+            if bits.width != value.width:
+                raise ValueError('{} has bits that imply a width of {}, but '
+                                 'a value with width {}.'
+                                 .format(what, bits.width, value.width))
+
+        return EncSchemeField(bits, value, shift)
+
+
+class EncSchemeImport:
+    '''An object representing inheritance of a parent scheme
+
+    When importing a parent scheme, we can set some of its fields with
+    immediate values. These are stored in the settings field.
+
+    '''
+    def __init__(self, yml: object, importer_name: str) -> None:
+        as_str = check_str(yml,
+                           'value for import in encoding scheme {!r}'
+                           .format(importer_name))
+
+        # The supported syntax is
+        #
+        #    - parent0(field0=b111, field1=b10)
+        #    - parent1()
+        #    - parent2
+
+        match = re.match(r'([^ (]+)[ ]*(?:\(([^)]*)\))?$', as_str)
+        if not match:
+            raise ValueError('Malformed encoding scheme '
+                             'inheritance by scheme {!r}: {!r}.'
+                             .format(importer_name, as_str))
+
+        self.parent = match.group(1)
+        self.settings = {}  # type: Dict[str, BoolLiteral]
+
+        when = ('When inheriting from {!r} in encoding scheme {!r}'
+                .format(self.parent, importer_name))
+
+        if match.group(2) is not None and match.group(2).strip():
+            args = match.group(2).split(',')
+            for arg in args:
+                arg = arg.strip()
+                arg_parts = arg.split('=')
+                if len(arg_parts) != 2:
+                    raise ValueError('{}, found an argument with {} '
+                                     'equals signs (should have exactly one).'
+                                     .format(when, len(arg_parts) - 1))
+
+                field_name = arg_parts[0]
+                field_what = ('literal value for field {!r} when inheriting '
+                              'from {!r} in encoding scheme {!r}'
+                              .format(arg_parts[0], self.parent, importer_name))
+                field_value = BoolLiteral(arg_parts[1], field_what)
+
+                if field_name in self.settings:
+                    raise ValueError('{}, found multiple arguments assigning '
+                                     'values to the field {!r}.'
+                                     .format(when, field_name))
+
+                self.settings[field_name] = field_value
+
+    def apply_settings(self,
+                       esf: 'EncSchemeFields', what: str) -> 'EncSchemeFields':
+        # Copy and set values in anything that has a setting
+        fields = {}
+        for name, literal in self.settings.items():
+            old_field = esf.fields.get(name)
+            if old_field is None:
+                raise ValueError('{} sets unknown field {!r} from {!r}.'
+                                 .format(what, name, self.parent))
+
+            if old_field.bits.width != literal.width:
+                raise ValueError('{} sets field {!r} from {!r} with a literal '
+                                 'of width {}, but the field has width {}.'
+                                 .format(what, name, self.parent,
+                                         literal.width, old_field.bits.width))
+
+            fields[name] = EncSchemeField(old_field.bits,
+                                          literal,
+                                          old_field.shift)
+
+        # Copy anything else
+        op_fields = set()
+        for name, old_field in esf.fields.items():
+            if name in fields:
+                continue
+            op_fields.add(name)
+            fields[name] = old_field
+
+        return EncSchemeFields(fields, op_fields, esf.mask)
+
+
+class EncSchemeFields:
+    '''An object representing some fields in an encoding scheme'''
+    def __init__(self,
+                 fields: Dict[str, EncSchemeField],
+                 op_fields: Set[str],
+                 mask: int) -> None:
+        self.fields = fields
+        self.op_fields = op_fields
+        self.mask = mask
+
+    @staticmethod
+    def empty() -> 'EncSchemeFields':
+        return EncSchemeFields({}, set(), 0)
+
+    @staticmethod
+    def from_yaml(yml: object, name: str) -> 'EncSchemeFields':
+        if not isinstance(yml, dict):
+            raise ValueError('fields for encoding scheme {!r} should be a '
+                             'dict, but we saw a {}.'
+                             .format(name, type(yml).__name__))
+
+        fields = {}
+        op_fields = set()  # type: Set[str]
+        mask = 0
+
+        overlaps = 0
+
+        for key, val in yml.items():
+            if not isinstance(key, str):
+                raise ValueError('{!r} is a bad key for a field name of '
+                                 'encoding scheme {} (should be str, not {}).'
+                                 .format(key, name, type(key).__name__))
+
+            fld_what = 'field {!r} of encoding scheme {}'.format(key, name)
+            field = EncSchemeField.from_yaml(val, fld_what)
+
+            overlaps |= mask & field.bits.mask
+            mask |= field.bits.mask
+
+            fields[key] = field
+            if field.value is None:
+                op_fields.add(key)
+
+        if overlaps:
+            raise ValueError('Direct fields for encoding scheme {} have '
+                             'overlapping ranges (mask: {:#010x})'
+                             .format(name, overlaps))
+
+        return EncSchemeFields(fields, op_fields, mask)
+
+    def merge_in(self, right: 'EncSchemeFields', when: str) -> None:
+        for name, field in right.fields.items():
+            if name in self.fields:
+                raise ValueError('Duplicate field name: {!r} {}.'
+                                 .format(name, when))
+
+            overlap = self.mask & field.bits.mask
+            if overlap:
+                raise ValueError('Overlapping bit ranges '
+                                 '(masks: {:08x} and {:08x} have '
+                                 'intersection {:08x}) {}.'
+                                 .format(self.mask,
+                                         field.bits.mask, overlap, when))
+
+            self.fields[name] = field
+            self.mask |= field.bits.mask
+            if field.value is None:
+                assert name not in self.op_fields
+                self.op_fields.add(name)
+
+
+class EncScheme:
+    def __init__(self, yml: object, name: str) -> None:
+        what = 'encoding scheme {!r}'.format(name)
+        yd = check_keys(yml, what, [], ['parents', 'fields'])
+
+        if not yd:
+            raise ValueError('{} has no parents or fields.'.format(what))
+
+        fields_yml = yd.get('fields')
+        self.direct_fields = (EncSchemeFields.from_yaml(fields_yml, name)
+                              if fields_yml is not None
+                              else EncSchemeFields.empty())
+
+        parents_yml = yd.get('parents')
+        parents_what = 'parents of {}'.format(what)
+        parents = ([EncSchemeImport(y, name)
+                    for y in check_list(parents_yml, parents_what)]
+                   if parents_yml is not None
+                   else [])
+        self.parents = index_list(parents_what,
+                                  parents,
+                                  lambda imp: imp.parent)
+
+
+class EncSchemes:
+    def __init__(self, yml: object) -> None:
+        if not isinstance(yml, dict):
+            raise ValueError("value for encoding-schemes is expected to be "
+                             "a dict, but was actually a {}."
+                             .format(type(yml).__name__))
+
+        self.schemes = {}  # type: Dict[str, EncScheme]
+        self.resolved = {}  # type: Dict[str, EncSchemeFields]
+
+        for key, val in yml.items():
+            if not isinstance(key, str):
+                raise ValueError('{!r} is a bad key for an encoding scheme '
+                                 'name (should be str, not {}).'
+                                 .format(key, type(key).__name__))
+            self.schemes[key] = EncScheme(val, key)
+
+    def _resolve(self,
+                 name: str,
+                 user: str,
+                 stack: List[str]) -> EncSchemeFields:
+        # Have we resolved this before?
+        resolved = self.resolved.get(name)
+        if resolved is not None:
+            return resolved
+
+        # Spot any circular inheritance
+        if name in stack:
+            raise RuntimeError('Circular inheritance of encoding '
+                               'schemes: {}'
+                               .format(' -> '.join(stack + [name])))
+
+        # Does the scheme actually exist?
+        scheme = self.schemes.get(name)
+        if scheme is None:
+            raise ValueError('{} requires undefined encoding scheme {!r}.'
+                             .format(user, name))
+
+        # Recursively try to resolve each parent scheme, applying any import
+        # settings
+        resolved_parents = {}
+        new_stack = stack + [name]
+        what = 'Import list of encoding scheme {!r}'.format(name)
+        for pname, pimport in scheme.parents.items():
+            resolved = self._resolve(pimport.parent, what, new_stack)
+            resolved_parents[pname] = pimport.apply_settings(resolved, what)
+
+        # Now try to merge the resolved imports
+        merged = EncSchemeFields.empty()
+        parent_names_so_far = []  # type: List[str]
+        for pname, pfields in resolved_parents.items():
+            when = ('merging fields of scheme {} into '
+                    'already merged fields of {}'
+                    .format(pname, ', '.join(parent_names_so_far)))
+            merged.merge_in(pfields, when)
+            parent_names_so_far.append(repr(pname))
+
+        # Now try to merge in any direct fields
+        when = ('merging direct fields of scheme {} into fields from parents'
+                .format(name))
+        merged.merge_in(scheme.direct_fields, when)
+
+        return merged
+
+    def resolve(self, name: str, mnemonic: str) -> EncSchemeFields:
+        fields = self._resolve(name, 'Instruction {!r}'.format(mnemonic), [])
+
+        # Check completeness
+        missing = ((1 << 32) - 1) & ~fields.mask
+        if missing:
+            raise ValueError('Fields for encoding scheme {} miss some bits '
+                             '(mask: {:#010x})'
+                             .format(name, missing))
+
+        return fields
+
+
 class OperandType:
     '''The base class for some sort of operand type'''
-    def __init__(self) -> None:
-        pass
+    def __init__(self, width: Optional[int]) -> None:
+        assert width is None or width > 0
+        self.width = width
 
     def markdown_doc(self) -> Optional[str]:
         '''Generate any (markdown) documentation for this operand type
@@ -159,21 +605,19 @@
 
 class RegOperandType(OperandType):
     '''A class representing a register operand type'''
-    TYPES = ['gpr', 'wdr', 'csr', 'wsr']
+    TYPE_WIDTHS = {'gpr': 5, 'wdr': 5, 'csr': 12, 'wsr': 8}
 
     def __init__(self, reg_type: str, is_dest: bool):
-        assert reg_type in RegOperandType.TYPES
+        type_width = RegOperandType.TYPE_WIDTHS.get(reg_type)
+        assert type_width is not None
+        super().__init__(type_width)
+
         self.reg_type = reg_type
         self.is_dest = is_dest
 
 
 class ImmOperandType(OperandType):
     '''A class representing an immediate operand type'''
-    def __init__(self, width: Optional[int]):
-        if width is not None:
-            assert width > 0
-        self.width = width
-
     def markdown_doc(self) -> Optional[str]:
         # Override from OperandType base class
         if self.width is None:
@@ -343,13 +787,150 @@
         return ''.join(parts)
 
 
+class EncodingField:
+    '''A single element of an encoding's mapping'''
+    def __init__(self,
+                 value: Union[BoolLiteral, str],
+                 scheme_field: EncSchemeField) -> None:
+        self.value = value
+        self.scheme_field = scheme_field
+
+    @staticmethod
+    def from_yaml(as_str: str,
+                  scheme_field: EncSchemeField,
+                  name_to_operand: Dict[str, Operand],
+                  what: str) -> 'EncodingField':
+        # The value should either be a boolean literal ("000xx11" or similar)
+        # or should be a name, which is taken as the name of an operand.
+        if not as_str:
+            raise ValueError('Empty string as {}.'.format(what))
+
+        # Set self.value to be either the bool literal or the name of the
+        # operand.
+        value_width = None
+        value = ''  # type: Union[BoolLiteral, str]
+        if re.match(r'b[01x_]+$', as_str):
+            value = BoolLiteral(as_str, what)
+            value_width = value.width
+            value_type = 'a literal value'
+        else:
+            operand = name_to_operand.get(as_str)
+            if operand is None:
+                raise ValueError('Unknown operand, {!r}, as {}'
+                                 .format(as_str, what))
+            value_width = operand.op_type.width
+            value = as_str
+            value_type = 'an operand'
+
+        # Unless we had an operand of type 'imm' (unknown width), we now have
+        # an expected width. Check it matches the width of the scheme field.
+        if value_width is not None:
+            if scheme_field.bits.width != value_width:
+                raise ValueError('{} is mapped to {} with width {}, but the '
+                                 'encoding scheme field has width {}.'
+                                 .format(what, value_type, value_width,
+                                         scheme_field.bits.width))
+
+        # Track the scheme field as well (so we don't have to keep track of a
+        # scheme once we've made an encoding object)
+        return EncodingField(value, scheme_field)
+
+
+class Encoding:
+    '''The encoding for an instruction'''
+    def __init__(self,
+                 yml: object,
+                 schemes: EncSchemes,
+                 name_to_operand: Dict[str, Operand],
+                 mnemonic: str):
+        what = 'encoding for instruction {!r}'.format(mnemonic)
+        yd = check_keys(yml, what, ['scheme', 'mapping'], [])
+
+        scheme_what = 'encoding scheme for instruction {!r}'.format(mnemonic)
+        scheme_name = check_str(yd['scheme'], scheme_what)
+        scheme_fields = schemes.resolve(scheme_name, mnemonic)
+
+        what = 'encoding mapping for instruction {!r}'.format(mnemonic)
+
+        # Check we've got exactly the right fields for the scheme
+        ydm = check_keys(yd['mapping'], what, list(scheme_fields.op_fields), [])
+
+        # Track the set of operand names that were used in some field
+        operands_used = set()
+
+        self.fields = {}
+        for field_name, scheme_field in scheme_fields.fields.items():
+            if scheme_field.value is not None:
+                field = EncodingField(scheme_field.value, scheme_field)
+            else:
+                field_what = ('value for {} field in encoding for instruction {!r}'
+                              .format(field_name, mnemonic))
+                field = EncodingField.from_yaml(check_str(ydm[field_name], field_what),
+                                                scheme_fields.fields[field_name],
+                                                name_to_operand,
+                                                field_what)
+
+                # If the field's value is an operand rather than a literal, it
+                # will have type str. Track the operands that we've used.
+                if isinstance(field.value, str):
+                    operands_used.add(field.value)
+
+            self.fields[field_name] = field
+
+        # We know that every field in the encoding scheme has a value. But we
+        # still need to check that every operand ended up in some field.
+        assert operands_used <= set(name_to_operand.keys())
+        unused_ops = set(name_to_operand.keys()) - operands_used
+        if unused_ops:
+            raise ValueError('Not all operands used in {} (missing: {}).'
+                             .format(what, ', '.join(sorted(unused_ops))))
+
+    def get_masks(self) -> Tuple[int, int]:
+        '''Return zeros/ones masks for encoding
+
+        Returns a pair (m0, m1) where m0 is the "zeros mask": a mask where a
+        bit is set if there is a bit pattern matching this encoding with that
+        bit zero. m1 is the ones mask: equivalent, but for that bit one.
+
+        '''
+        m0 = 0
+        m1 = 0
+        for field_name, field in self.fields.items():
+            if isinstance(field.value, str):
+                m0 |= field.scheme_field.bits.mask
+                m1 |= field.scheme_field.bits.mask
+            else:
+                # Match up the bits in the value with the ranges in the scheme.
+                assert field.value.width > 0
+                assert field.value.width == field.scheme_field.bits.width
+                bits_seen = 0
+                for msb, lsb in field.scheme_field.bits.ranges:
+                    val_msb = field.scheme_field.bits.width - 1 - bits_seen
+                    val_lsb = val_msb - msb + lsb
+                    bits_seen += msb - lsb + 1
+
+                    for idx in range(0, msb - lsb + 1):
+                        desc = field.value.char_for_bit(val_lsb + idx)
+                        if desc in ['0', 'x']:
+                            m0 |= 1 << (idx + lsb)
+                        if desc in ['1', 'x']:
+                            m1 |= 1 << (idx + lsb)
+
+        all_bits = (1 << 32) - 1
+        assert (m0 | m1) == all_bits
+        return (m0, m1)
+
+
 class Insn:
-    def __init__(self, yml: object, groups: InsnGroups) -> None:
+    def __init__(self,
+                 yml: object,
+                 groups: InsnGroups,
+                 encoding_schemes: EncSchemes) -> None:
         yd = check_keys(yml, 'instruction',
                         ['mnemonic', 'operands'],
                         ['group', 'rv32i', 'synopsis',
                          'syntax', 'doc', 'note', 'trailing-doc',
-                         'decode', 'operation'])
+                         'decode', 'operation', 'encoding'])
 
         self.mnemonic = check_str(yd['mnemonic'], 'mnemonic for instruction')
 
@@ -393,19 +974,68 @@
                                          list(sorted(self.syntax.operands)),
                                          list(sorted(self.name_to_operand))))
 
+        encoding_yml = yd.get('encoding')
+        self.encoding = None
+        if encoding_yml is not None:
+            self.encoding = Encoding(encoding_yml, encoding_schemes,
+                                     self.name_to_operand, self.mnemonic)
+
+
+def find_ambiguous_encodings(insns: List[Insn]) -> List[Tuple[str, str, int]]:
+    '''Check for ambiguous instruction encodings
+
+    Returns a list of ambiguous pairs (mnemonic0, mnemonic1, bits) where
+    bits is a bit pattern that would match either instruction.
+
+    '''
+    masks = {}
+    for insn in insns:
+        if insn.encoding is not None:
+            masks[insn.mnemonic] = insn.encoding.get_masks()
+
+    ret = []
+    for mnem0, mnem1 in itertools.combinations(masks.keys(), 2):
+        m00, m01 = masks[mnem0]
+        m10, m11 = masks[mnem1]
+
+        # The pair of instructions is ambiguous if a bit pattern might be
+        # either instruction. That happens if each bit index is either
+        # allowed to be a 0 in both or allowed to be a 1 in both.
+        # ambiguous_mask is the set of bits that don't distinguish the
+        # instructions from each other.
+        m0 = m00 & m10
+        m1 = m01 & m11
+
+        ambiguous_mask = m0 | m1
+        if ambiguous_mask == (1 << 32) - 1:
+            ret.append((mnem0, mnem1, m1 & ~m0))
+
+    return ret
+
 
 class InsnsFile:
     def __init__(self, yml: object) -> None:
         yd = check_keys(yml, 'top-level',
-                        ['insn-groups', 'insns'],
+                        ['insn-groups', 'encoding-schemes', 'insns'],
                         [])
 
         self.groups = InsnGroups(yd['insn-groups'])
-        self.insns = [Insn(i, self.groups)
+        self.encoding_schemes = EncSchemes(yd['encoding-schemes'])
+        self.insns = [Insn(i, self.groups, self.encoding_schemes)
                       for i in check_list(yd['insns'], 'insns')]
         self.mnemonic_to_insn = index_list('insns', self.insns,
                                            lambda insn: insn.mnemonic)
 
+        ambiguous_encodings = find_ambiguous_encodings(self.insns)
+        if ambiguous_encodings:
+            ambiguity_msgs = []
+            for mnem0, mnem1, bits in ambiguous_encodings:
+                ambiguity_msgs.append('{!r} and {!r} '
+                                      'both match bit pattern {:#010x}'
+                                      .format(mnem0, mnem1, bits))
+            raise ValueError('Ambiguous instruction encodings: ' +
+                             ', '.join(ambiguity_msgs))
+
     def grouped_insns(self) -> List[Tuple[InsnGroup, List[Insn]]]:
         '''Return the instructions in groups'''
         grp_to_insns = {}  # type: Dict[str, List[Insn]]
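
Note on the ambiguity check: the zeros/ones-mask test used by find_ambiguous_encodings can be tried in isolation. The following is a minimal sketch, not the code from this patch. It assumes each encoding is written as a 32-character string of '0', '1' and 'x' (msb first) rather than the BoolLiteral and BitRanges objects used above. The names masks_for, is_ambiguous and the three example patterns are hypothetical.

```python
from typing import Tuple

ALL_BITS = (1 << 32) - 1


def masks_for(pattern: str) -> Tuple[int, int]:
    '''Return (m0, m1) for a 32-char pattern of '0', '1', 'x' (msb first).

    Bit i of m0 is set if bit i may be 0 in a matching word; bit i of m1 is
    set if bit i may be 1.
    '''
    assert len(pattern) == 32
    m0 = 0
    m1 = 0
    for idx, char in enumerate(reversed(pattern)):
        if char in ('0', 'x'):
            m0 |= 1 << idx
        if char in ('1', 'x'):
            m1 |= 1 << idx
    return (m0, m1)


def is_ambiguous(pat_a: str, pat_b: str) -> bool:
    '''Return True if some 32-bit word matches both patterns.'''
    a0, a1 = masks_for(pat_a)
    b0, b1 = masks_for(pat_b)
    # A bit fails to tell the patterns apart if both allow a 0 there or
    # both allow a 1 there. The pair clashes if that holds for every bit.
    return ((a0 & b0) | (a1 & b1)) == ALL_BITS


# Hypothetical patterns: fixed low bits, operand bits ('x') elsewhere.
ADD_LIKE = 'x' * 25 + '0110011'
SUB_LIKE = 'x' * 25 + '0110111'
CATCH_ALL = 'x' * 29 + '011'
```

Here ADD_LIKE and SUB_LIKE differ in bit 2, so no word matches both. CATCH_ALL only fixes the low three bits to 011, which ADD_LIKE also has, so that pair would be reported as ambiguous.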

diff --git a/hw/ip/otbn/util/yaml_to_doc.py b/hw/ip/otbn/util/yaml_to_doc.py
index 3fe5898..b112025 100755
--- a/hw/ip/otbn/util/yaml_to_doc.py
+++ b/hw/ip/otbn/util/yaml_to_doc.py

@@ -8,7 +8,8 @@
 import argparse
 import sys
 
-from insn_yaml import Insn, InsnsFile, Operand, load_file
+from insn_yaml import (BoolLiteral, Encoding, Insn, InsnsFile, Operand,
+                       load_file)
 
 
 def render_operand_row(operand: Operand) -> str:
@@ -57,6 +58,82 @@
     return ''.join(parts)
 
 
+def render_encoding(mnemonic: str, encoding: Encoding) -> str:
+    '''Generate a table displaying an instruction encoding'''
+    parts = []
+    parts.append('<table style="font-size: 75%">')
+    parts.append('<tr>')
+    parts.append('<td></td>')
+    for bit in range(31, -1, -1):
+        parts.append('<td>{}</td>'.format(bit))
+    parts.append('</tr>')
+
+    # Build dictionary of bit ranges, keyed by the msb and with value a pair
+    # (width, desc) where width is the width of the range in bits and desc is a
+    # string describing what is stored in the range.
+    by_msb = {}
+
+    for field_name, field in encoding.fields.items():
+        scheme_field = field.scheme_field
+        # If this field is a literal value, explode it into single bits. To do
+        # so, we walk the ranges and match up with ranges in the value.
+        if isinstance(field.value, BoolLiteral):
+            assert field.value.width > 0
+            assert field.value.width == scheme_field.bits.width
+            bits_seen = 0
+            for msb, lsb in scheme_field.bits.ranges:
+                val_msb = scheme_field.bits.width - 1 - bits_seen
+                val_lsb = val_msb - msb + lsb
+                bits_seen += msb - lsb + 1
+
+                for idx in range(0, msb - lsb + 1):
+                    desc = field.value.char_for_bit(val_lsb + idx)
+                    by_msb[lsb + idx] = (1, '' if desc == 'x' else desc)
+            continue
+
+        # Otherwise this field's value is an operand name
+        assert isinstance(field.value, str)
+        operand_name = field.value
+
+        # If there is only one range (and no shifting), that's easy.
+        if len(scheme_field.bits.ranges) == 1 and scheme_field.shift == 0:
+            msb, lsb = scheme_field.bits.ranges[0]
+            by_msb[msb] = (msb - lsb + 1, operand_name)
+            continue
+
+        # Otherwise, we have to split up the operand into things like "foo[8:5]"
+        bits_seen = 0
+        for msb, lsb in scheme_field.bits.ranges:
+            val_msb = scheme_field.shift + scheme_field.bits.width - 1 - bits_seen
+            val_lsb = val_msb - msb + lsb
+            bits_seen += msb - lsb + 1
+            if msb == lsb:
+                desc = '{}[{}]'.format(operand_name, val_msb)
+            else:
+                desc = '{}[{}:{}]'.format(operand_name, val_msb, val_lsb)
+            by_msb[msb] = (msb - lsb + 1, desc)
+
+    parts.append('<tr>')
+    parts.append('<td>{}</td>'.format(mnemonic.upper()))
+
+    # Now run down the ranges in descending order of msb to get the table cells
+    next_bit = 31
+    for msb in sorted(by_msb.keys(), reverse=True):
+        # Sanity check to make sure we have a dense table
+        assert msb == next_bit
+
+        width, desc = by_msb[msb]
+        next_bit = msb - width
+
+        parts.append('<td colspan="{}">{}</td>'.format(width, desc))
+
+    assert next_bit == -1
+    parts.append('</tr>')
+
+    parts.append('</table>\n\n')
+    return ''.join(parts)
+
+
 def render_insn(insn: Insn, heading_level: int) -> str:
     '''Generate the documentation for an instruction
 
@@ -115,6 +192,10 @@
     if any(op.doc is not None for op in insn.operands):
         parts.append(render_operand_table(insn))
 
+    # Show encoding if we have one
+    if insn.encoding is not None:
+        parts.append(render_encoding(insn.mnemonic, insn.encoding))
+
     # Show decode pseudo-code if given
     if insn.decode is not None:
         parts.append('{} Decode\n\n'
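
Note on split bit ranges: both get_masks and render_encoding walk a field's ranges and pair each encoding bit with a bit of the field's value using val_msb and val_lsb. The sketch below replays that walk on plain Python types. It assumes the ranges are listed most significant chunk first and that the literal is a string written msb first, so bit 0 of the value is its last character; that is an assumption about how BoolLiteral.char_for_bit indexes. The function name place_literal is hypothetical.

```python
from typing import Dict, List, Tuple


def place_literal(ranges: List[Tuple[int, int]],
                  literal: str) -> Dict[int, str]:
    '''Map each encoding bit index to the literal character stored there.

    ranges is a list of (msb, lsb) pairs, most significant chunk first.
    literal is a string of '0', '1' and 'x', written msb first.
    '''
    width = sum(msb - lsb + 1 for msb, lsb in ranges)
    assert len(literal) == width

    placed = {}  # type: Dict[int, str]
    bits_seen = 0
    for msb, lsb in ranges:
        # Bits val_msb..val_lsb of the value are stored in this range.
        val_msb = width - 1 - bits_seen
        val_lsb = val_msb - msb + lsb
        bits_seen += msb - lsb + 1
        for idx in range(msb - lsb + 1):
            # Value bit (val_lsb + idx) lands in encoding bit (lsb + idx).
            # The literal is msb first, so count back from the end.
            placed[lsb + idx] = literal[width - 1 - (val_lsb + idx)]
    return placed


# A 5-bit field split across bits 31-30 and 14-12 of the encoding.
example = place_literal([(31, 30), (14, 12)], '10x01')
```

In the example, the top two value bits ('1', '0') go to encoding bits 31 and 30, and the remaining three ('x', '0', '1') go to bits 14, 13 and 12.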