Update docs on TFLite benchmarking. (#3667)

- Add example to build benchmark_tool with Android
- Add a tip to turn on logging for TFLite
diff --git a/docs/developing_iree/e2e_benchmarking.md b/docs/developing_iree/e2e_benchmarking.md
index 6f27bc1..1a5b54b 100644
--- a/docs/developing_iree/e2e_benchmarking.md
+++ b/docs/developing_iree/e2e_benchmarking.md
@@ -275,9 +275,25 @@
 The first two are to build it directly, either in a
 [`docker` container](https://www.tensorflow.org/lite/guide/build_android#set_up_build_environment_using_docker)
 or
-[in your own environment](https://www.tensorflow.org/lite/guide/build_android#set_up_build_environment_without_docker). Assuming you can build
-TensorFlow with Android, you can configure the TFLite `benchmark_model` binary
-in the following ways:
+[in your own
+environment](https://www.tensorflow.org/lite/guide/build_android#set_up_build_environment_without_docker).
+To build TensorFlow tools for Android:
+
+- Run `./configure` in the TensorFlow repo.
+- Add the following section to the TensorFlow `WORKSPACE` file.
+
+```
+android_ndk_repository(
+    name = "androidndk",
+    path = "/full/path/to/android_ndk",
+)
+```
+
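+For example, on a local checkout the two steps might look like the sketch
+below; the TensorFlow checkout location and the NDK path are placeholders for
+your own.
+
+```shell
+# Hypothetical paths; substitute your own TensorFlow checkout and NDK install.
+cd ~/tensorflow
+./configure
+cat >> WORKSPACE <<'EOF'
+android_ndk_repository(
+    name = "androidndk",
+    path = "/home/me/Android/android-ndk-r21",
+)
+EOF
+```
+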
+TODO(hanchung): Move the Android setup instructions somewhere outside IREE,
+e.g., to TensorFlow.
+
+Then you can configure the TFLite `benchmark_model` binary in the following
+ways:
 
 ```shell
 # Build the benchmark_model binary without any add-ons.
@@ -392,6 +408,11 @@
 the name of the trace that you want to benchmark, but you can use `cat` on
 the `graph_path` file to verify the correct `.tflite` filename if you're unsure.
 
+Tip:<br>
+&nbsp;&nbsp;&nbsp;&nbsp;Sometimes `benchmark_model` falls back to the CPU even
+when `use_gpu` is set. To get more information, turn on tracing in the tool
+with `adb shell setprop debug.tflite.trace 1`.
+
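+For example, you might toggle the property around a benchmark run as sketched
+below; the emitted trace events can then be captured with Android's
+systrace/Perfetto tooling. Note that the property stays set until you change
+it.
+
+```shell
+# Enable TFLite trace events on the device before benchmarking.
+adb shell setprop debug.tflite.trace 1
+# ... run benchmark_model as usual ...
+# Turn tracing back off afterwards.
+adb shell setprop debug.tflite.trace 0
+```
+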
 ### Profile
 
 There are 2 profilers built into TFLite's `benchmark_model` program. Both of