Update the sel4 debugging primer. Change-Id: I913d8fe3844b805d7aed159368da704a8bc9e8d0
diff --git a/KataGdbUserSpaceDebugging.md b/KataGdbUserSpaceDebugging.md index a360ee0..994637d 100644 --- a/KataGdbUserSpaceDebugging.md +++ b/KataGdbUserSpaceDebugging.md
@@ -3,16 +3,21 @@ ## High-Level Usage The method for debugging seL4 threads with Renode is relatively simple, and -involves the usual suspects of `m simulate-debug` and `kgdb.sh`, but the +involves the usual suspects of `m debug-simulation` (or `m simulate-debug`) and `kgdb.sh`, but the interface through GDB has changed slightly. In general, a debugging session now looks like the following: On one terminal, start an instance of Renode: ```bash source build/setup.sh -m simulate-debug +m debug-simulation ``` +[Note `m debug-simulation` gives GDB +control before anything runs while `m simulate-debug` lets Renode run +until `kgdb.sh` is invoked and contacts GDB. The former is necessary to +debug issues in the boot process (bootstrap, kernel, rootserver).] + Once Renode has started and starts outputting green debug logs, in another terminal start up GDB with: ```bash @@ -20,81 +25,133 @@ ``` At this point, GDB should connect to the running Renode instance started by -`simulate-debug`, and you'll see something equivalent to the following (output +`debug-simulation`, and you'll see something equivalent to the following (output is not exact -- symbol reading may differ): ``` ghost:~/$ kgdb.sh -Source directories searched: /usr/local/google/home/jtgans/shodan/sw/tock:$cdir:$cwd -Reading symbols from out/shodan_boot_rom/build-out/multihart_boot_rom/multihart_boot_rom_sim_verilator.elf... -add symbol table from file "out/shodan_boot_rom/build-out/multihart_boot_rom/multihart_boot_rom_sim_verilator.elf" -Reading symbols from out/shodan_boot_rom/build-out/multihart_boot_rom/multihart_boot_rom_sim_verilator.elf... -add symbol table from file "out/matcha/riscv32imc-unknown-none-elf/debug/matcha_platform" -Reading symbols from out/matcha/riscv32imc-unknown-none-elf/debug/matcha_platform... -add symbol table from file "out/matcha/riscv32imc-unknown-none-elf/debug/matcha_app" -Reading symbols from out/matcha/riscv32imc-unknown-none-elf/debug/matcha_app... +Source directories searched: /usr/local/google/home/sleffler/shodan/sw/tock:$cdir:$cwd +Reading symbols from out/shodan_boot_rom/multihart_boot_rom/multihart_boot_rom.elf... +add symbol table from file "out/shodan_boot_rom/multihart_boot_rom/multihart_boot_rom.elf" +Reading symbols from out/shodan_boot_rom/multihart_boot_rom/multihart_boot_rom.elf... Remote debugging using localhost:3333 warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread -0x20007b4a in tock_cells::map_cell::MapCell<kernel::callback::AppId>::is_some<kernel::callback::AppId> (self=0x10005af4 <matcha_platform::reset_handler::BUF>) - at /usr/local/google/home/jtgans/shodan/sw/tock/libraries/tock-cells/src/map_cell.rs:47 -47 self.occupied.get() +_reset_start () at /usr/local/google/home/sleffler/shodan/sw/multihart_boot_rom/multihart_boot_rom/rom_crt.S:29 +29 li x7, 0x0 +Symbol autoswitching is now disabled (gdb) ``` At this point, GDB has stopped the simulator, giving you a chance to set breakpoints and examine the system. -`kgdb.sh` will automatically load symbols on the seL4 side, based upon their -thread names, so there is no need to manually find the associated symbol file +`kgdb.sh` will load symbols on the seL4 side, based on their +thread / CAmkES component, so there is no need to manually find the associated symbol file and load it with the correct offsets. It may be necessary to switch symbol -tables, which can easily be done using `sel4 switch-symbols <threadname>`, which -will automatically load them with the appropriate offsets. +tables, which can easily be done using `sel4 switch-symbols <component>`, which +will automatically load the symbols for CAmkES `<component>` with the appropriate offsets +(a `<component>` is the lower-case version of the CAmkES name, e.g. the CAmkES component +`DebugConsole` translates to `debug_console`.) In this environment, there are two sets of breakpoints that can be used: -standard gdb `break`-style breakpoints, and renode-side `sel4 break` +standard GDB `break`-style breakpoints, and renode-side `sel4 break` breakpoints. The former work within the currently executing thread and can be used for "local" breaks inside of a thread, whereas the latter can be set in any thread that was previously discovered by the Renode extension. -Once you've setup your breakpoints using `sel4 break`, you'll need to tell -Renode to enable breakpoints as well with `sel4 breakpoints-enabled True`, and -then `continue`. For example, consider this debugging session: +Once a breakpoint is setup using `sel4 break` one can just +`continue`. For example, consider this debugging session: ``` -<assuming cpu1 is halted and gdb is waiting at a prompt> +<assuming cpu1 is halted and GDB is waiting at a prompt> -(gdb) # Wait until 'rootserver' thread is set up in seL4 (gdb) sel4 wait-for-thread rootserver -(gdb) # rootserver thread is now the symbol "main" in the thread/program, so we -(gdb) # can break on stopped and associated symbols are loaded. +Thread 2 "matcha.cpu1" received signal SIGTRAP, Trace/breakpoint trap. +[Switching to Thread 2] +0xff814802 in ?? () +(gdb) # Switch to the 'rootserver' thread's symbol table and set a breakpoint on main +(gdb) sel4 switch-symbols rootserver +Reading symbols from out/kata/riscv32-unknown-elf/debug/capdl-loader... (gdb) sel4 break rootserver main - -(gdb) # Enable extension breakpoints (it's disabled by default as it's pretty -(gdb) # performance-heavy as Renode has to break on every context-switch) -(gdb) sel4 breakpoints-enabled True (gdb) c +Continuing. -(gdb) # We are in capdl-loader main() function, disable extension breakpoints -(gdb) sel4 breakpoints-enabled False - +Thread 2 "matcha.cpu1" received signal SIGTRAP, Trace/breakpoint trap. +kata_os_rootserver::main () at src/main.rs:164 +164 pub fn main() { +(gdb) (gdb) # As we are in capdl-loader thread and proper symbols are loaded, all commands -(gdb) # like break are working correctly +(gdb) # like break work as expected. (gdb) b init_system +Breakpoint 1 at 0x4c632: file /usr/local/google/home/sleffler/shodan/kata/projects/kata/apps/system/components/kata-os-common/src/model/mod.rs, line 200. (gdb) c +Continuing. + +Thread 2 "matcha.cpu1" hit Breakpoint 1, model::KataOsModel::init_system (self=0x8e3cc8) + at /usr/local/google/home/sleffler/shodan/kata/projects/kata/apps/system/components/kata-os-common/src/model/mod.rs:200 +200 self.init_copy_region(); +``` + +Note that symbol tables are per-component but breakpoints are +per-thread. The seL4 GDB extentions assume all threads in a component +share the same VSpace and hence the same symbol table. Beware also +that when symbol names are ambiguous you can specify fully-qualified +names; e.g. above `main` was expanded to `kata_os_rootserver::main` and +`init_system` to `model::KataOsModel::init_system`. Several well-known +special symbols are supported: `kernel` for the seL4 Kernel, `user` +for any thread running in user space, and `rootserver` for the first +seL4 thread that runs after boot. + +## Temporary Breakpoints + +In addition to regular GDB breakpoints the sel4 extentions support temporary +breakpoints. The syntax is `sel4 tbreak <threadname> [<symbol>]`. If `<symbol>` +is not specified GDB will stop on the next entry to `<threadname>`. This is +especially useful for stopping the next time you enter user space: + +``` +(gdb) sel4 thread +kernel +(gdb) sel4 tbreak user +(gdb) c +Continuing. +Thread 2 "matcha.cpu1" received signal SIGTRAP, Trace/breakpoint trap. +[Switching to Thread 2] +0xff80311c in ?? () +``` + +or the kernel: + +``` +(gdb) sel4 tbreak kernel +(gdb) sel4 list-breakpoints +-------------------------- +|Thread|Address|Temporary| +-------------------------- +|kernel|any |True | +-------------------------- +(gdb) c +Continuing. + +Thread 2 "matcha.cpu1" received signal SIGTRAP, Trace/breakpoint trap. +0xff803000 in ?? () +(gdb) ``` ## Theory of Operation -The kgdb.sh / Renode symbol lookup mechanism is composed of two parts: a custom -Renode extension in `sim/config/shodan_infrastructure/seL4Extensions.cs` and a -custom GDB extension in `sim/config/gdbscript.py`. `kgdb.sh` is a simple bash -script that starts GDB and connects to the appropriate remote with the extension -loaded. +The kgdb.sh / Renode symbol lookup mechanism is +composed of two parts: a custom Renode extension in +`sim/config/shodan_infrastructure/seL4Extensions.cs` and a custom GDB +extension in `sim/renode/tools/sel4_extensions/gdbscript.py`. `kgdb.sh` +is a simple bash script that starts GDB and connects to the appropriate +remote with the extension loaded. -The Renode extension tracks the VSpace setup for each thread and translates -virtual addresses to physical addresses. Without this work symbols may be -incorrectly resolved because every VSpace memory range starts at 0. +The Renode extension tracks the VSpace setup for each thread and +translates virtual addresses to physical addresses. Without this work +symbols will be (almost always) incorrectly resolved because every VSpace +memory range starts at 0. In addition, the extension provides commands to Renode to add debugging hooks to an execution, all prefixed with the subcommand `seL4` (case matters at the @@ -109,15 +166,42 @@ to load in symbols for the current thread context from disk when the Renode extension detects a context switch. -During debugging, not all symbols will be loaded in GDB at the start, so it can +During debugging, most symbols will not be loaded in GDB at the start, so it can be helpful to use `sel4 wait-for-thread <threadname>` to tell Renode to break -execution upon entry of the given thread. When the break occurs, Renode will -instruct GDB to load the correct symbol table, and then symbols can be resolved -for later breakpoints, etc. +execution upon entry of the given thread. When a break occurs the appropriate +symbols may be loaded with the `sel4 switch-symbols <threadname>` command or, +if `sel4 symbol-autoswitching true` is set then Renode will automatically switch +symbol tables based on the current thread. Symbol-autoswitching is disabled by +default because there GDB does not have a reliable way to synchronize the +cmdline with the runtime so repeatedly switching between threads can leave +GDB in a bad state where it will crash on an assert. -To set a breakpoint for a thread that is not the current context, use the `sel4 -break <threadname> <symbol>` command. This will instruct Renode to load the -appropriate symbol table and break at the right place once that thread is -running. If, however, you need to set a breakpoint in the current thread -context, it's fine to use `break` as normal, since the symbols for the current -thread should be loaded. +To set a breakpoint for a thread that is not the current context, the thread +needs to be entered at least once. After that symbols in that thread can be +resolved so the `sel4 break <threadname> <symbol>` command works. Beware that +if `<symbol>` is wrong (e.g. mis-typed) the error diagnostic will go to the +Renode console, not to the GDB cmdline. Note also that using regular GDB +breakpoints will generate complaints about not being able to set the breakpoints +when switching to a different thread. Best to use `sel4 break` in seL4 user +threads and limit GDB `break` to contexts like the kernel or multihart_boot_rom. + +## Esoterica + +The Renode extensions identify seL4 threads by intercepting +`seL4_DebugNameThread` system calls that assign a thread an ASCII +string. Typically this happens when the thread first starts up but +it can also happen otherwise; e.g. when the rootserver constructs +a thread it may assign a name. This means that a command like `sel4 +wait-for-thread <threadname>` may stop GDB in an unexpected context (e.g. +the rootserver). + +In order for the Renode extension to intercept `seL4_DebugNameThread` +system calls it must know the system call ID. Because seL4 does not +have a fixed ABI this number may change depending on how the system is +built and/or if the system source is modified. To deal with this the +system call ID is specified as an argument to the `CreateSeL4` Renode +directive; e.g. `monitor cpu1 CreateSeL4 0xffffffee`. If you notice seL4 +threads are not identified and simulations slow to a crawl then check +the setting--when there is a msimatch Renode will be single-stepping +execution waiting for threads to be entered for the first time (the +`sel4 ready` command may also be helpful in diagnosing this behaviour).