IREE supports compiling and running TensorFlow Lite (TFLite) programs stored as TFLite FlatBuffers. These files can be imported into an IREE-compatible format and then compiled for a range of backends.
``` mermaid
graph LR
  accTitle: TFLite to runtime deployment workflow overview
  accDescr {
    Programs start as TensorFlow Lite FlatBuffers.
    Programs are imported into MLIR's TOSA dialect using iree-import-tflite.
    The IREE compiler uses the imported MLIR.
    Compiled programs are used by the runtime.
  }

  subgraph A[TFLite]
    A1[FlatBuffer]
  end

  subgraph B[MLIR]
    B1[TOSA]
  end

  C[IREE compiler]
  D[Runtime deployment]

  A -- iree-import-tflite --> B
  B --> C
  C --> D
```
Install TensorFlow by following the official documentation:
``` shell
python -m pip install tf-nightly
```
Install IREE packages, either by building from source or from pip:
=== "Stable releases"

    Stable release packages are [published to PyPI](https://pypi.org/user/google-iree-pypi-deploy/).

    ``` shell
    python -m pip install \
      iree-compiler \
      iree-runtime \
      iree-tools-tflite
    ```
=== ":material-alert: Nightly releases"

    Nightly releases are published on [GitHub releases](https://github.com/openxla/iree/releases).

    ``` shell
    python -m pip install \
      --find-links https://openxla.github.io/iree/pip-release-links.html \
      --upgrade \
      iree-compiler \
      iree-runtime \
      iree-tools-tflite
    ```
IREE's tooling is divided into two components: import and compilation.
These two stages can be completed entirely via the command line.
``` shell
WORKDIR="/tmp/workdir"
mkdir -p ${WORKDIR}

TFLITE_URL="https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8.tflite"
TFLITE_PATH=${WORKDIR}/model.tflite
IMPORT_PATH=${WORKDIR}/tosa.mlir
MODULE_PATH=${WORKDIR}/module.vmfb

# Fetch the sample model
wget ${TFLITE_URL} -O ${TFLITE_PATH}

# Import the sample model to an IREE compatible form
iree-import-tflite ${TFLITE_PATH} -o ${IMPORT_PATH}

# Compile for the CPU backend
iree-compile \
  --iree-input-type=tosa \
  --iree-hal-target-backends=llvm-cpu \
  ${IMPORT_PATH} \
  -o ${MODULE_PATH}
```
The example below demonstrates downloading, compiling, and executing a TFLite model using the Python API. This includes some initial setup to declare global variables, download the sample module, and download the sample inputs.
First, declare absolute paths for the sample files and import all required libraries. The default setup uses the CPU backend as the only target; this can be reconfigured to select alternative targets.
``` python
import iree.compiler.tflite as iree_tflite_compile
import iree.runtime as iree_rt
import numpy
import os
import urllib.request

from PIL import Image

workdir = "/tmp/workdir"
os.makedirs(workdir, exist_ok=True)

tfliteFile = "/".join([workdir, "model.tflite"])
jpgFile = "/".join([workdir, "input.jpg"])
tfliteIR = "/".join([workdir, "tflite.mlir"])
tosaIR = "/".join([workdir, "tosa.mlir"])
bytecodeModule = "/".join([workdir, "iree.vmfb"])

backends = ["llvm-cpu"]
config = "local-task"
```
The TFLite sample model and input are downloaded locally.
``` python
tfliteUrl = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8.tflite"
jpgUrl = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8_input.jpg"

urllib.request.urlretrieve(tfliteUrl, tfliteFile)
urllib.request.urlretrieve(jpgUrl, jpgFile)
```
Once downloaded, we can compile the model for the selected backends. Both the TFLite and TOSA representations of the model are saved for debugging purposes; this is optional and can be omitted.
``` python
iree_tflite_compile.compile_file(
  tfliteFile,
  input_type="tosa",
  output_file=bytecodeModule,
  save_temp_tfl_input=tfliteIR,
  save_temp_iree_input=tosaIR,
  target_backends=backends,
  import_only=False)
```
After compilation completes, we load the compiled bytecode as a VmModule using the local-task configuration and add it to the runtime context.
``` python
config = iree_rt.Config("local-task")
context = iree_rt.SystemContext(config=config)

with open(bytecodeModule, 'rb') as f:
  vm_module = iree_rt.VmModule.from_flatbuffer(config.vm_instance, f.read())
  context.add_vm_module(vm_module)
```
Finally, the IREE module is loaded and ready for execution. Here we load the sample image, reshape it to the expected input size, and execute the module. By default, TFLite models include a single function named `main`. The final results are printed.
``` python
im = numpy.array(Image.open(jpgFile).resize((192, 192))).reshape((1, 192, 192, 3))
args = [im]

invoke = context.modules.module["main"]
iree_results = invoke(*args)
print(iree_results)
```
The `tflitehub` folder in the `iree-samples` repository contains test scripts to compile, run, and compare various TensorFlow Lite models sourced from TensorFlow Hub.
An example smoke test of the TensorFlow Lite C API is available here.
| Colab notebooks |
| --------------- |
| Text classification with TFLite and IREE |
Failures during the import step usually indicate a failure to lower from TensorFlow Lite's operations to TOSA, the intermediate representation used by IREE. Many TensorFlow Lite operations are not fully supported, particularly those that use dynamic shapes. Please reach out on one of IREE's communication channels if you notice something missing.