# Vulkan GPU Profiling

[Tracy](./profiling_with_tracy.md) offers great insights into CPU/GPU
interactions and Vulkan API usage details. However, it does not provide
information at a finer granularity, especially inside a particular shader
dispatch. To supplement general-purpose tools like Tracy, vendor-specific
tools can be used.

(TODO: add some pictures for each tool)

## Android GPUs

Several GPU vendors target the Android platform, each offering its own
tools. [Android GPU Inspector](https://gpuinspector.dev/) (AGI) provides a
cross-vendor solution. See the
[documentation](https://gpuinspector.dev/docs/) for more details.

### Build Android app to run IREE

To capture and analyze with AGI, you need a full Android app. IREE has a
simple Android native app wrapper that packages the IREE core libraries
together with a specific VM bytecode invocation into an Android app. The
wrapper and its documentation live in
[`iree/tools/android/run_module_app/`](https://github.com/google/iree/tree/main/iree/tools/android/run_module_app).

For example, to package a module compiled from the following `mhlo-dot.mlir` as
an Android app:

```mlir
func @dot(%lhs: tensor<2x4xf32>, %rhs: tensor<4x2xf32>) -> tensor<2x2xf32> {
  %0 = "mhlo.dot"(%lhs, %rhs) : (tensor<2x4xf32>, tensor<4x2xf32>) -> tensor<2x2xf32>
  return %0 : tensor<2x2xf32>
}
```

```shell
# First translate into a VM bytecode module
$ /path/to/iree/build/iree/tools/iree-translate -- \
    -iree-input-type=mhlo \
    -iree-mlir-to-vm-bytecode-module \
    -iree-hal-target-backends=vulkan-spirv \
    /path/to/mhlo-dot.mlir \
    -o /tmp/mhlo-dot.vmfb

# Then package the Android app
$ /path/to/iree/source/iree/tools/android/run_module_app/build_apk.sh \
    ./build-apk \
    --driver vulkan \
    --module_file /tmp/mhlo-dot.vmfb \
    --entry_function dot \
    --function_input=...
```

Here `--function_input=...` stands for the inputs to `dot`, for example:

```
2x4xf32=[[1.0 2.0 3.0 4.0][5.0 6.0 7.0 8.0]]
4x2xf32=[[9.0 10.0][11.0 12.0][13.0 14.0][15.0 16.0]]
```

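As a sanity check for whatever result the invocation reports, the expected
product of the two example inputs can be worked out by hand. The sketch below
does so with plain shell arithmetic. The `out00` to `out11` variable names are
illustrative only, and the printed line merely mimics the input notation above;
the tool's actual output formatting may differ.

```shell
# out[i][j] = sum over k of lhs[i][k] * rhs[k][j], using the example inputs:
#   lhs = [[1 2 3 4][5 6 7 8]]
#   rhs = [[9 10][11 12][13 14][15 16]]
out00=$(( 1*9  + 2*11 + 3*13 + 4*15 ))
out01=$(( 1*10 + 2*12 + 3*14 + 4*16 ))
out10=$(( 5*9  + 6*11 + 7*13 + 8*15 ))
out11=$(( 5*10 + 6*12 + 7*14 + 8*16 ))

# Print the 2x2 result in the same bracket notation as the inputs.
echo "2x2xf32=[[$out00 $out01][$out10 $out11]]"
```
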
The `build_apk.sh` invocation above builds an `iree-run-module.apk` under the
`./build-apk/` directory, which you can then install via `adb install`.

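A minimal sketch of the install step, assuming `adb` is on your `PATH` and a
device is connected. The APK path follows the `build_apk.sh` example, and the
component name is taken from the "Application" line that AGI shows for this
app; treat both as assumptions to adjust for your setup. Launching the
activity by hand is only a quick check that the app starts, since AGI launches
the app itself when capturing.

```shell
# Assumed locations and names; adjust them for your setup.
APK="./build-apk/iree-run-module.apk"
COMPONENT="com.google.iree.run_module/android.app.NativeActivity"

if command -v adb >/dev/null 2>&1 && [ -f "$APK" ]; then
  # -r replaces an existing installation of the same package.
  adb install -r "$APK"
  # Start the native activity once to confirm that the app launches.
  adb shell am start -n "$COMPONENT"
else
  echo "adb or $APK not found; skipping install"
fi
```
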
`build_apk.sh` needs the Android SDK and NDK. An easy way to manage them is to
install [Android Studio](https://developer.android.com/studio). After
installation, you will need to set up a few environment variables, which
`build_apk.sh` prints at the beginning of each invocation.

### Capture and analyze with AGI

You can follow AGI's
[Getting Started](https://gpuinspector.dev/docs/getting-started) page to learn
how to use it. In general the steps are:

* Install the latest AGI from https://github.com/google/agi/releases and launch
  it.
* Fill in the "Application" field by searching for the app. The line should
  read like
  `android.intent.action.MAIN:com.google.iree.run_module/android.app.NativeActivity`.
* Choose to start the capture at the beginning and pick a suitable duration.
* Configure the system profile to include all GPU counters.
* Start the capture.

Generated traces are in the [Perfetto](https://perfetto.dev/) format. They can
be viewed directly within AGI or online in a browser at
https://ui.perfetto.dev/, without needing an Android device.

## Desktop GPUs

Vulkan supports both graphics and compute, but most tools in the Vulkan
ecosystem focus on graphics. As a result, some Vulkan profiling tools expect
commands to correspond to a sequence of frames presented to displays via
framebuffers. This means additional steps for IREE and other Vulkan
applications that rely solely on headless compute. For graphics-focused tools,
we need to wrap IREE's logic inside a dummy rendering loop to provide
the markers these tools need to perform capture and analysis.

IREE provides an `iree-run-module-vulkan-gui` binary that can invoke a specific
bytecode module within a proper GUI application. The graphics side uses
[Dear ImGui](https://github.com/ocornut/imgui); it calls into IREE
synchronously while rendering each frame and prints the bytecode invocation
results to the screen.

To build `iree-run-module-vulkan-gui`:

```shell
# Using Bazel
$ bazel build //iree/testing/vulkan:iree-run-module-vulkan-gui

# Using CMake
$ cmake --build /path/to/build/dir --target iree-run-module-vulkan-gui
```

Invoke the generated binary from a console. It takes the same command-line
options as the main
[`iree-run-module`](./developer-overview.md#iree-run-module), except the
`--driver` option. You can use `--help` to list them all. The binary
launches a GUI window for use with Vulkan tools.

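As a rough sketch, an invocation for the `dot` example might look like the
following. The binary path is a placeholder, and the flag names are assumed to
mirror those passed to `build_apk.sh` in the Android section; confirm them with
`--help` before relying on them.

```shell
# Placeholder paths; point these at your own build output and module.
GUI_BIN="/path/to/iree-run-module-vulkan-gui"
MODULE="/tmp/mhlo-dot.vmfb"

if [ -x "$GUI_BIN" ] && [ -f "$MODULE" ]; then
  # Flag names are an assumption based on the build_apk.sh example;
  # note that --driver is not passed to this binary.
  "$GUI_BIN" \
    --module_file="$MODULE" \
    --entry_function=dot \
    --function_input="2x4xf32=[[1.0 2.0 3.0 4.0][5.0 6.0 7.0 8.0]]" \
    --function_input="4x2xf32=[[9.0 10.0][11.0 12.0][13.0 14.0][15.0 16.0]]"
else
  echo "binary or module not found; skipping"
fi
```
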
### AMD

For AMD GPUs, [Radeon GPU Profiler](https://gpuopen.com/rgp/) (RGP) is the tool
for understanding the fine details of how IREE performs on the GPU. See the
[documentation](https://radeon-gpuprofiler.readthedocs.io/en/latest/) for
details. In general the steps to get started are:

* Download and install AMD RGP from https://gpuopen.com/rgp/.
* Build `iree-run-module-vulkan-gui` as described above.
* Open "Radeon Developer Panel" and connect to the local
  "Radeon Developer Service".
* Start `iree-run-module-vulkan-gui` from a console with the desired VM
  bytecode module invocation.
* You should see it in the "Applications" panel of "Radeon Developer Panel".
  Click "Capture profile" to capture.

Afterwards you can analyze the profile with RGP. Viewing the profile no longer
requires the GPU; it can be opened by an RGP installation on any machine.

### NVIDIA

For NVIDIA GPUs, [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
is the tool for understanding the fine details of how IREE performs on the
GPU. See the
[documentation](https://docs.nvidia.com/nsight-graphics/UserGuide/index.html)
for details. In general the steps to get started are:

* Download and install NVIDIA Nsight Graphics from https://developer.nvidia.com/nsight-graphics.
* Build `iree-run-module-vulkan-gui` as described above.
* Open NVIDIA Nsight Graphics and select "Quick Launch" on the welcome page.
* Fill out the "Application Executable" and "Command Line Arguments" fields to
  point to `iree-run-module-vulkan-gui` and a specific VM bytecode module and
  its invocation information.
* Select an "Activity" ("Frame Profiler" and "GPU Trace" are particularly
  interesting) and launch.
* Capture any frame to perform analysis.