<!-- mdformat off(b/169948621#comment2) -->
<!--
Semi-automated TOC generation with instructions from
https://github.com/ekalinin/github-markdown-toc#auto-insert-and-update-toc
-->
<!--ts-->
* [Profiling](#profiling)
* [API](#api)
* [Per-Op Profiling](#per-op-profiling)
* [Subroutine Profiling](#subroutine-profiling)
<!-- Added by: njeff, at: Wed 04 Nov 2020 04:35:07 PM PST -->
<!--te-->

# Profiling

This doc outlines how to use the TFLite Micro profiler to gather per-op invoke
durations and to identify bottlenecks within operator kernels and other TFLite
Micro routines.

## API

The MicroInterpreter class constructor takes an optional profiler argument.
This profiler must be an instance of the tflite::Profiler class and should
implement the BeginEvent and EndEvent methods. The default implementation in
tensorflow/lite/micro/micro_profiler.cc is sufficient for most purposes. When
profiling across multiple invocations, the best practice is to reset the
profiler by calling `ClearEvents()` between invocations.
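
As a rough sketch, the snippet below wires a MicroProfiler into the interpreter
and clears its events between invocations. The `model`, `op_resolver`,
`error_reporter`, `tensor_arena`, `kTensorArenaSize`, and `kNumInvocations`
names are placeholders for the usual TFLM setup, and the exact constructor
argument lists vary between TFLM versions.

```cpp
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_profiler.h"

// Default profiler implementation from tensorflow/lite/micro/micro_profiler.cc.
tflite::MicroProfiler profiler;

// The profiler is passed as the optional trailing constructor argument.
tflite::MicroInterpreter interpreter(model, op_resolver, tensor_arena,
                                     kTensorArenaSize, error_reporter,
                                     &profiler);
interpreter.AllocateTensors();

for (int i = 0; i < kNumInvocations; ++i) {
  interpreter.Invoke();
  // ... inspect or log the recorded events here ...
  // Reset the profiler so each invocation is measured independently.
  profiler.ClearEvents();
}
```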

## Per-Op Profiling

The MicroInterpreter has a feature to enable per-op profiling. To enable it,
provide a MicroProfiler to the MicroInterpreter's constructor and use a
non-release build, so that NDEBUG is not defined and the ScopedOperatorProfile
within the MicroInterpreter is compiled in.
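
The sketch below shows what this looks like after the setup above; the `Log()`
helper is assumed from the default MicroProfiler implementation in
tensorflow/lite/micro/micro_profiler.cc and may differ (or log each event as it
ends) in other TFLM versions.

```cpp
interpreter.Invoke();

// In a non-release build (NDEBUG not defined), the interpreter wraps each
// operator invocation in a ScopedOperatorProfile, so the profiler now holds
// one timed event per op.
profiler.Log();          // Print the tag and duration of each recorded event.
profiler.ClearEvents();  // Reset before profiling the next invocation.
```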

## Subroutine Profiling

To dig further into the performance of specific routines, the MicroProfiler can
be used directly through the TfLiteContext, or a new MicroProfiler can be
created if the TfLiteContext is not available where the profiling needs to
happen. The MicroProfiler's BeginEvent and EndEvent can be called directly, or
wrapped using a [ScopedProfile](../../lite/core/api/profiler.h).
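
As a rough sketch, the kernel below times one section of its Eval function with
a ScopedProfile pulled from the context. The kernel name, the event tag, and
the cast of `context->profiler` reflect one plausible setup rather than a
prescribed pattern, and the ScopedProfile constructor arguments may vary
between TFLite versions.

```cpp
#include "tensorflow/lite/c/common.h"
#include "tensorflow/lite/core/api/profiler.h"

TfLiteStatus Eval(TfLiteContext* context, TfLiteNode* node) {
  {
    // Records a single event spanning this scope. ScopedProfile is a no-op if
    // no profiler was registered (context->profiler is null).
    tflite::ScopedProfile scoped_profile(
        static_cast<tflite::Profiler*>(context->profiler),
        "MyKernel/inner_loop");
    // ... the expensive subroutine being measured ...
  }
  return kTfLiteOk;
}
```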