feat(debug): Transition debug module to CSR-based interface
This commit refactors the debug module's external interface, moving from direct pin-level connections to a memory-mapped CSR (Control and Status Register) interface accessible via the AXI bus. This change provides a more structured and extensible mechanism for controlling and interacting with the debug module.
Key changes include:
- **Hardware:**
  - The `CoreAxi` top-level module now connects the debug module to the CSR block, and the direct `dm` port is removed.
  - `CoreAxiCSR` gains six debug CSRs: request address, request data, request operation, response data, response operation, and status.
  - Debug requests are issued from these CSRs, and responses are held in a one-entry queue that the CSRs expose, so the debug module is controlled entirely through AXI writes and reads.
- **Testbench:**
  - `CoreMiniAxiInterface` now talks to the debug module through the CSR protocol, using new `read_csr` and `write_csr` helpers.
  - The `dm_req_agent` and `dm_rsp_agent`, which previously managed the pin-level protocol, have been removed.
  - The `dm_read` and `dm_write` functions are rewritten to poll the status CSR for readiness, write the request CSRs, poll the status CSR for a response, read the response CSRs, and acknowledge the response by writing the status CSR.
  - All debug-related tests are updated to use the new `dm_read` and `dm_write` functions.
- **Documentation:**
  - The debug module documentation is updated to describe the new CSR-based command protocol.
  - A table of the new AXI CSRs is added, including their addresses and descriptions.
  - The command examples are updated to reflect the new multi-step process for reading and writing GPRs via the CSR interface.
This refactoring removes the dedicated `dm` port from the top level, lets the testbench reach the debug module over the same AXI path it uses for the other CSRs, and gives external debuggers a documented register-level programming model.
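For reference, the request/response sequence that `dm_read` and `dm_write` now follow can be sketched in plain Python. This is a minimal sketch rather than the testbench code: `FakeDebugCsrBus`, `poll_status`, and `dm_access` are hypothetical names, the fake bus is an in-memory stand-in for the DUT with simplified ready behaviour, and the `SUCCESS` encoding of `0` is an assumption not stated in this commit.

```python
# CSR addresses as listed in the documentation's AXI CSR table.
REQ_ADDR, REQ_DATA, REQ_OP = 0x31000, 0x31004, 0x31008
RSP_DATA, RSP_OP, STATUS = 0x3100C, 0x31010, 0x31014
OP_READ, OP_WRITE = 1, 2   # operation codes used in the documented steps
RSP_SUCCESS = 0            # assumption: the real SUCCESS encoding may differ


class FakeDebugCsrBus:
    """Hypothetical in-memory model of the debug CSR window (not the DUT)."""

    def __init__(self):
        self.dm_regs = {}                       # internal debug module registers
        self.req = {REQ_ADDR: 0, REQ_DATA: 0}   # request address/data CSRs
        self.rsp = None                         # pending (data, op) or None

    def read(self, addr):
        if addr == STATUS:
            ready = int(self.rsp is None)        # bit 0: ready for a request
            available = int(self.rsp is not None)  # bit 1: response available
            return ready | (available << 1)
        if addr == RSP_DATA:
            return self.rsp[0] if self.rsp else 0
        if addr == RSP_OP:
            return self.rsp[1] if self.rsp else 0
        return self.req.get(addr, 0)

    def write(self, addr, value):
        if addr in self.req:
            self.req[addr] = value
        elif addr == REQ_OP:                     # writing req_op starts a command
            target = self.req[REQ_ADDR]
            if value == OP_WRITE:
                self.dm_regs[target] = self.req[REQ_DATA]
                self.rsp = (0, RSP_SUCCESS)
            elif value == OP_READ:
                self.rsp = (self.dm_regs.get(target, 0), RSP_SUCCESS)
        elif addr == STATUS:                     # any write acknowledges
            self.rsp = None


def poll_status(bus, bit, max_polls=1000):
    """Poll the status CSR until the given bit is set, with a bounded retry count."""
    for _ in range(max_polls):
        if (bus.read(STATUS) >> bit) & 1:
            return
    raise TimeoutError(f"status bit {bit} was never set")


def dm_access(bus, op, addr, data=0):
    """Run one debug module command and return the response data."""
    poll_status(bus, 0)              # wait until ready for a request
    bus.write(REQ_ADDR, addr)
    bus.write(REQ_DATA, data)
    bus.write(REQ_OP, op)            # initiates the command
    poll_status(bus, 1)              # wait for a response
    rsp_op = bus.read(RSP_OP)
    rsp_data = bus.read(RSP_DATA)
    bus.write(STATUS, 0)             # acknowledge (clear) the response
    if rsp_op != RSP_SUCCESS:
        raise RuntimeError(f"debug command failed, rsp_op={rsp_op}")
    return rsp_data


bus = FakeDebugCsrBus()
dm_access(bus, OP_WRITE, 0x10, 0x1)         # write 0x1 to dmcontrol (0x10)
print(hex(dm_access(bus, OP_READ, 0x10)))   # read it back
```

The bounded `max_polls` is a choice made for this sketch so that a stuck status bit surfaces as an error instead of a hang.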
Change-Id: I39be61d805d0e1236550a33a40dbb20f91da0d67
diff --git a/doc/microarch/debug.md b/doc/microarch/debug.md
index d70f86f..80d92e3 100644
--- a/doc/microarch/debug.md
+++ b/doc/microarch/debug.md
@@ -14,7 +14,7 @@
## Interfaces
-The following table describes the inputs and outputs of the debug module:
+The following table describes the internal hardware interfaces of the debug module:
| Name | Direction | Type | Width | Description |
|----------------|-----------|-----------------------|-------|-------------------------------------------|
@@ -61,19 +61,48 @@
## Command Protocol
-An external debugger communicates with the debug module by reading and writing its internal registers. The `ext.req` and `ext.rsp` interfaces are used for this purpose.
+An external debugger communicates with the debug module by reading and writing a set of memory-mapped Control and Status Registers (CSRs) over the AXI interface. These CSRs provide a communication channel to the debug module's internal registers.
-To issue a command, the debugger sends a request on the `ext.req` interface. The `address` field specifies the register to access, and the `op` field specifies the operation (read or write). For write operations, the `data` field contains the value to write.
+### Write Operation
-The debug module responds on the `ext.rsp` interface. The `op` field indicates the status of the operation, and for read operations, the `data` field contains the value read from the register.
+To write to an internal debug module register (e.g., writing `0x1` to `dmcontrol` at address `0x10`):
-All debug operations, including halting the core, resuming the core, and executing abstract commands, are performed by reading and writing the debug module's registers via this protocol.
+1. **Poll for readiness:** Read the `status` CSR at `0x31014` and wait for bit 0 to be `1`.
+2. **Set address:** Write the target internal register address (`0x10`) to the `req_addr` CSR at `0x31000`.
+3. **Set data:** Write the data (`0x1`) to the `req_data` CSR at `0x31004`.
+4. **Initiate write:** Write the `WRITE` operation code (`2`) to the `req_op` CSR at `0x31008`.
+5. **Poll for response:** Read the `status` CSR at `0x31014` and wait for bit 1 to be `1`.
+6. **Check status:** Read the `rsp_op` CSR at `0x31010` to confirm the operation was successful.
+7. **Acknowledge response:** Write any value to the `status` CSR at `0x31014` to clear the response.
-## Registers
+### Read Operation
-The debug module implements a set of registers that are accessible to an external debugger. These registers are used to control and monitor the core.
+To read from an internal debug module register (e.g., reading from `dmstatus` at address `0x11`):
-The following table lists the debug module registers:
+1. **Poll for readiness:** Read the `status` CSR at `0x31014` and wait for bit 0 to be `1`.
+2. **Set address:** Write the target internal register address (`0x11`) to the `req_addr` CSR at `0x31000`.
+3. **Initiate read:** Write the `READ` operation code (`1`) to the `req_op` CSR at `0x31008`.
+4. **Poll for response:** Read the `status` CSR at `0x31014` and wait for bit 1 to be `1`.
+5. **Check status:** Read the `rsp_op` CSR at `0x31010` to confirm the operation was successful.
+6. **Read data:** Read the result from the `rsp_data` CSR at `0x3100c`.
+7. **Acknowledge response:** Write any value to the `status` CSR at `0x31014` to clear the response.
+
+## AXI CSR Interface
+
+These registers are mapped into the Kelvin CSR address space and are used to communicate with the debug module.
+
+| Address | Name | Description |
+|------------|------------|------------------------------------------------------------------------------------------------------------|
+| 0x31000 | req_addr | Write the target debug module register address here. |
+| 0x31004 | req_data | Write data for the debug module operation here. |
+| 0x31008 | req_op | Write the operation type (e.g., READ, WRITE) to this register to initiate a debug module command. |
+| 0x3100c | rsp_data | After a command completes, the data result is available here. |
+| 0x31010 | rsp_op | After a command completes, the status result (e.g., SUCCESS, FAILED) is available here. |
+| 0x31014 | status | Status of the debug module. On reads, bit 0 indicates if the module is ready for a new request and bit 1 indicates if a response is available. Writing any value to this register acknowledges (clears) the pending response. |
+
+## Internal Debug Module Registers
+
+The debug module implements a set of internal registers that are accessible to an external debugger via the AXI CSR interface. These registers are used to control and monitor the core.
| Address | Name | Description |
|---------|--------------|-------------------------------------------|
@@ -172,7 +201,4 @@
* `write` (bit 16): `1` (write)
* `regno` (bits 15:0): `0x100A` (for `a0`)
3. **Wait for completion:** Poll the `abstractcs` register (address `0x16`) until the `busy` bit (bit 12) is cleared.
-4. **Check for errors:** Read the `abstractcs` register again and check that the `cmderr` field (bits 10:8) is `0`.
-
-
-
+4. **Check for errors:** Read the `abstractcs` register again and check that the `cmderr` field (bits 10:8) is `0`.
\ No newline at end of file
diff --git a/hdl/chisel/src/kelvin/CoreAxi.scala b/hdl/chisel/src/kelvin/CoreAxi.scala
index 8bac00a..9de9466 100644
--- a/hdl/chisel/src/kelvin/CoreAxi.scala
+++ b/hdl/chisel/src/kelvin/CoreAxi.scala
@@ -40,9 +40,6 @@
// String logging interface
val slog = new SLogIO(p)
val te = Input(Bool())
-
- // DM-IF
- val dm = Option.when(p.useDebugModule)(new DebugModuleIO(p))
})
dontTouch(io)
@@ -68,8 +65,8 @@
dontTouch(dm.get.io)
val dmEnable = RegInit(false.B)
dmEnable := true.B
- dm.get.io.ext.req <> GateDecoupled(io.dm.get.req, dmEnable)
- io.dm.get.rsp <> GateDecoupled(dm.get.io.ext.rsp, dmEnable)
+ dm.get.io.ext.req <> GateDecoupled(csr.io.debug.get.req, dmEnable)
+ csr.io.debug.get.rsp <> GateDecoupled(dm.get.io.ext.rsp, dmEnable)
}
val core_reset = Mux(io.te, (!io.aresetn.asBool).asAsyncReset, (csr.io.reset || dm.map(_.io.ndmreset).getOrElse(false.B)).asAsyncReset)
diff --git a/hdl/chisel/src/kelvin/CoreAxiCSR.scala b/hdl/chisel/src/kelvin/CoreAxiCSR.scala
index 7c2447c..3aac4a8 100644
--- a/hdl/chisel/src/kelvin/CoreAxiCSR.scala
+++ b/hdl/chisel/src/kelvin/CoreAxiCSR.scala
@@ -19,6 +19,15 @@
import bus.AxiMasterIO
+object CoreCsrAddrs {
+ val DbgReqAddr = 0x1000.U
+ val DbgReqData = 0x1004.U
+ val DbgReqOp = 0x1008.U
+ val DbgRspData = 0x100c.U
+ val DbgRspOp = 0x1010.U
+ val DbgStatus = 0x1014.U
+}
+
class CoreCSR(p: Parameters) extends Module {
val io = IO(new Bundle {
val fabric = Flipped(new FabricIO(p))
@@ -31,6 +40,7 @@
val halted = Input(Bool())
val fault = Input(Bool())
val kelvin_csr = Input(new CsrOutIO(p))
+ val debug = Option.when(p.useDebugModule)(Flipped(new DebugModuleIO(p)))
})
// Bit 0 - Reset (Active High)
@@ -39,21 +49,79 @@
val resetReg = RegInit(3.U(p.fetchAddrBits.W))
val pcStartReg = RegInit(0.U(p.fetchAddrBits.W))
val statusReg = RegInit(0.U(p.fetchAddrBits.W))
+ val debugReqAddrReg = Option.when(p.useDebugModule)(RegInit(0.U(32.W)))
+ val debugReqDataReg = Option.when(p.useDebugModule)(RegInit(0.U(32.W)))
+ val debugReqOpReg = Option.when(p.useDebugModule)(RegInit(DmReqOp.NOP.asUInt))
+
+ val writeEn = io.fabric.writeDataAddr.valid && !io.internal
+ val writeAddr = io.fabric.writeDataAddr.bits
+ val writeData = io.fabric.writeDataBits
+
+ val rsp_queue = if (p.useDebugModule) {
+ val queue = Module(new Queue(new DebugModuleRspIO(p), 1))
+ queue.io.enq <> io.debug.get.rsp
+
+ val req_valid_pulse = RegInit(false.B)
+ val write_to_op_reg = writeEn && writeAddr === CoreCsrAddrs.DbgReqOp
+ req_valid_pulse := write_to_op_reg && io.debug.get.req.ready
+ io.debug.get.req.valid := req_valid_pulse
+
+ io.debug.get.req.bits.address := debugReqAddrReg.get
+ io.debug.get.req.bits.data := debugReqDataReg.get
+ val (req_op, req_op_valid) = DmReqOp.safe(debugReqOpReg.get)
+ io.debug.get.req.bits.op := Mux(req_op_valid, req_op, DmReqOp.NOP)
+
+ val write_to_status_reg = writeEn && writeAddr === CoreCsrAddrs.DbgStatus
+ queue.io.deq.ready := write_to_status_reg
+ Some(queue)
+ } else {
+ None
+ }
+
+ val debugReadMap = if (p.useDebugModule) {
+ val debugStatusReg = Cat(rsp_queue.get.io.deq.valid, io.debug.get.req.ready)
+ Seq(
+ CoreCsrAddrs.DbgReqAddr -> Cat(0.U(96.W), debugReqAddrReg.get),
+ CoreCsrAddrs.DbgReqData -> Cat(0.U(64.W), debugReqDataReg.get, 0.U(32.W)),
+ CoreCsrAddrs.DbgReqOp -> Cat(0.U(32.W), debugReqOpReg.get, 0.U(64.W)),
+ CoreCsrAddrs.DbgRspData -> Cat(rsp_queue.get.io.deq.bits.data, 0.U(96.W)),
+ CoreCsrAddrs.DbgRspOp -> Cat(0.U(96.W), rsp_queue.get.io.deq.bits.op.asUInt),
+ CoreCsrAddrs.DbgStatus -> Cat(0.U(64.W), debugStatusReg, 0.U(32.W)),
+ )
+ } else {
+ Seq()
+ }
val readData =
MuxLookup(io.fabric.readDataAddr.bits, 0.U)(Seq(
0x0.U -> Cat(0.U(96.W), resetReg),
0x4.U -> Cat(0.U(64.W), pcStartReg, 0.U(32.W)),
0x8.U -> Cat(0.U(32.W), statusReg, 0.U(64.W)),
- ) ++ ((0 until p.csrOutCount).map(
+ ) ++ debugReadMap
+ ++ ((0 until p.csrOutCount).map(
x => ((0x100 + 4*x).U -> (io.kelvin_csr.value(x) << (32 * (x % 4)).U))
)))
+
+ val debugReadValidMap = if (p.useDebugModule) {
+ Seq(
+ CoreCsrAddrs.DbgReqAddr -> true.B,
+ CoreCsrAddrs.DbgReqData -> true.B,
+ CoreCsrAddrs.DbgReqOp -> true.B,
+ CoreCsrAddrs.DbgRspData -> true.B,
+ CoreCsrAddrs.DbgRspOp -> true.B,
+ CoreCsrAddrs.DbgStatus -> true.B,
+ )
+ } else {
+ Seq()
+ }
+
val readDataValid =
MuxLookup(io.fabric.readDataAddr.bits, false.B)(Seq(
0x0.U -> true.B,
0x4.U -> true.B,
0x8.U -> true.B,
- ) ++ ((0 until p.csrOutCount).map(x => ((0x100 + 4*x).U -> true.B))))
+ ) ++ debugReadValidMap
+ ++ ((0 until p.csrOutCount).map(x => ((0x100 + 4*x).U -> true.B))))
// Delay reads by one cycle
val readDataNext = Pipe(readDataValid, readData, 1)
@@ -64,13 +132,30 @@
io.pcStart := pcStartReg
statusReg := Cat(io.fault, io.halted)
- // TODO(atv): What bits are allowed to change in these? Add a mask or something.
- resetReg := Mux(io.fabric.writeDataAddr.valid && io.fabric.writeDataAddr.bits === 0x0.U && !io.internal, io.fabric.writeDataBits(31,0), resetReg)
- pcStartReg := Mux(io.fabric.writeDataAddr.valid && io.fabric.writeDataAddr.bits === 0x4.U && !io.internal, io.fabric.writeDataBits(63,32), pcStartReg)
- io.fabric.writeResp := io.fabric.writeDataAddr.valid && MuxLookup(io.fabric.writeDataAddr.bits, false.B)(Seq(
+ // Register writes
+ resetReg := Mux(writeEn && writeAddr === 0x0.U, writeData(31,0), resetReg)
+ pcStartReg := Mux(writeEn && writeAddr === 0x4.U, writeData(63,32), pcStartReg)
+ if (p.useDebugModule) {
+ debugReqAddrReg.get := Mux(writeEn && writeAddr === CoreCsrAddrs.DbgReqAddr, writeData(31,0), debugReqAddrReg.get)
+ debugReqDataReg.get := Mux(writeEn && writeAddr === CoreCsrAddrs.DbgReqData, writeData(63,32), debugReqDataReg.get)
+ debugReqOpReg.get := Mux(writeEn && writeAddr === CoreCsrAddrs.DbgReqOp, writeData(95,64), debugReqOpReg.get)
+ }
+
+ val debugWriteValidMap = if (p.useDebugModule) {
+ Seq(
+ CoreCsrAddrs.DbgReqAddr -> true.B,
+ CoreCsrAddrs.DbgReqData -> true.B,
+ CoreCsrAddrs.DbgReqOp -> true.B,
+ CoreCsrAddrs.DbgStatus -> true.B,
+ )
+ } else {
+ Seq()
+ }
+
+ io.fabric.writeResp := writeEn && MuxLookup(writeAddr, false.B)(Seq(
0x0.U -> true.B,
0x4.U -> true.B,
- ))
+ ) ++ debugWriteValidMap)
}
class CoreAxiCSR(p: Parameters,
@@ -87,6 +172,7 @@
val halted = Input(Bool())
val fault = Input(Bool())
val kelvin_csr = Input(new CsrOutIO(p))
+ val debug = Option.when(p.useDebugModule)(Flipped(new DebugModuleIO(p)))
})
val axi = Module(new AxiSlave(p))
@@ -107,4 +193,7 @@
csr.io.halted := io.halted
csr.io.fault := io.fault
csr.io.kelvin_csr := io.kelvin_csr
+ if (p.useDebugModule) {
+ io.debug.get <> csr.io.debug.get
+ }
}
diff --git a/kelvin_test_utils/core_mini_axi_interface.py b/kelvin_test_utils/core_mini_axi_interface.py
index ab184d5..2c9ebdc 100644
--- a/kelvin_test_utils/core_mini_axi_interface.py
+++ b/kelvin_test_utils/core_mini_axi_interface.py
@@ -122,13 +122,6 @@
self.slave_wfifo = Queue()
self.slave_bfifo = Queue()
- try:
- self.debug_available = (self.dut.io_dm_req_valid != None)
- self.dm_req_fifo = Queue()
- self.dm_rsp_fifo = Queue()
- except AttributeError as e:
- self.debug_available = False
-
async def init(self):
cocotb.start_soon(self.master_awagent())
cocotb.start_soon(self.master_wagent())
@@ -142,9 +135,13 @@
cocotb.start_soon(self.slave_ragent())
cocotb.start_soon(self.memory_write_agent())
cocotb.start_soon(self.memory_read_agent())
- if self.debug_available:
- cocotb.start_soon(self.dm_req_agent())
- cocotb.start_soon(self.dm_rsp_agent())
+
+ async def read_csr(self, addr):
+ val = await self.read_word(0x30000 + addr)
+ return val
+
+ async def write_csr(self, addr, data):
+ await self.write_word(0x30000 + addr, data)
async def slave_awagent(self, timeout=4096):
self.dut.io_axi_slave_write_addr_valid.value = 0
@@ -318,43 +315,6 @@
if timeout_count >= timeout:
assert False, "timeout waiting for rready"
- async def dm_req_agent(self, timeout=4096):
- self.dut.io_dm_req_valid.value = 0
- self.dut.io_dm_req_bits_address.value = 0
- self.dut.io_dm_req_bits_data.value = 0
- self.dut.io_dm_req_bits_op.value = 0
- while True:
- while True:
- await RisingEdge(self.dut.io_aclk)
- self.dut.io_dm_req_valid.value = 0
- if self.dm_req_fifo.qsize():
- break
- req_data = await self.dm_req_fifo.get()
- self.dut.io_dm_req_valid.value = 1
- self.dut.io_dm_req_bits_address.value = req_data["address"]
- self.dut.io_dm_req_bits_data.value = req_data["data"]
- self.dut.io_dm_req_bits_op.value = req_data["op"]
- await FallingEdge(self.dut.io_aclk)
- timeout_count = 0
- while self.dut.io_dm_req_ready.value == 0:
- await FallingEdge(self.dut.io_aclk)
- timeout_count += 1
- if timeout_count >= timeout:
- assert False, "timeout waiting for dm_req_ready"
-
- async def dm_rsp_agent(self):
- self.dut.io_dm_rsp_ready.value = 1
- while True:
- await RisingEdge(self.dut.io_aclk)
- try:
- if self.dut.io_dm_rsp_valid.value:
- rsp = dict()
- rsp["data"] = self.dut.io_dm_rsp_bits_data.value.to_unsigned()
- rsp["op"] = self.dut.io_dm_rsp_bits_op.value.to_unsigned()
- await self.dm_rsp_fifo.put(rsp)
- except Exception as e:
- print('X seen in dm_rsp_agent: ' + str(e))
-
async def memory_write_agent(self):
while True:
while True:
@@ -445,23 +405,43 @@
kelvin_reset_csr_addr = 0x30000
await self.write_word(kelvin_reset_csr_addr, 3)
+ async def _poll_dm_status(self, bit, value):
+ while True:
+ status = await self.read_csr(0x1014)
+ if (status[0] & (1 << bit)) == value:
+ break
+ await ClockCycles(self.dut.io_aclk, 10)
+
async def dm_read(self, addr):
- req = dict()
- req["address"] = addr
- req["data"] = 0
- req["op"] = DmReqOp.READ
- await self.dm_req_fifo.put(req)
- rsp = await self.dm_rsp_fifo.get()
+ await self._poll_dm_status(0, 1)
+
+ await self.write_csr(0x1000, addr)
+ await self.write_csr(0x1004, 0)
+ await self.write_csr(0x1008, DmReqOp.READ)
+
+ await self._poll_dm_status(1, 2)
+
+ rsp = dict()
+ rsp["data"] = int((await self.read_csr(0x100c)).view(np.uint32)[0])
+ rsp["op"] = (await self.read_csr(0x1010)).view(np.uint32)[0]
+ await self.write_csr(0x1014, 0) # Acknowledge response.
+
assert rsp["op"] == DmRspOp.SUCCESS
return rsp["data"]
async def dm_write(self, addr, data):
- req = dict()
- req["address"] = addr
- req["data"] = convert_to_binary_value(np.array([data], dtype=np.uint32).view(np.uint8))
- req["op"] = DmReqOp.WRITE
- await self.dm_req_fifo.put(req)
- rsp = await self.dm_rsp_fifo.get()
+ await self._poll_dm_status(0, 1)
+
+ await self.write_csr(0x1000, addr)
+ await self.write_csr(0x1004, data)
+ await self.write_csr(0x1008, DmReqOp.WRITE)
+
+ await self._poll_dm_status(1, 2)
+
+ rsp = dict()
+ rsp["data"] = int((await self.read_csr(0x100c)).view(np.uint32)[0])
+ rsp["op"] = (await self.read_csr(0x1010)).view(np.uint32)[0]
+ await self.write_csr(0x1014, 0) # Acknowledge response.
return rsp
async def dm_read_reg(self, addr, expected_op=DmRspOp.SUCCESS):