IREE supports compiling and running PyTorch programs represented as nn.Module classes as well as models defined using functorch.
Install IREE pip packages, either from pip or by building from source:

    pip install \
      iree-compiler \
      iree-runtime

Install torch-mlir, necessary for compiling PyTorch models to a format IREE is able to execute:

    pip install -f https://llvm.github.io/torch-mlir/package-index/ torch-mlir

A special iree_torch package is available to make it easy to compile PyTorch programs and run them on IREE:

    pip install git+https://github.com/iree-org/iree-torch.git

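
Before moving on, it can help to confirm that the packages above are importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and `iree.compiler` and `iree.runtime` are assumed to be the import names provided by the `iree-compiler` and `iree-runtime` pip packages.

```python
import importlib.util


def missing_packages(names=("iree.compiler", "iree.runtime",
                            "torch_mlir", "iree_torch")):
    """Return the subset of `names` that cannot be found for import.

    Hypothetical helper: the default names are assumptions about what
    the pip packages installed above expose.
    """
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing packages:", missing_packages())
```

An empty list suggests the installation steps succeeded; any names that are listed point at the `pip install` command to re-run.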
Going from a loaded PyTorch model to one that's executing on IREE happens in four steps:
!!! note
    In the following steps, we'll be borrowing the model from this BERT colab and assuming it is available as `model`.
First, we need to trace and compile our model to MLIR:

    import torch_mlir

    model = ...  # the model we're compiling
    example_input = ...  # an input to the model with the expected shape and dtype
    mlir = torch_mlir.compile(
        model,
        example_input,
        output_type=torch_mlir.OutputType.LINALG_ON_TENSORS,
        use_tracing=True)

The full list of available output types can be found here and includes linalg on tensors, mhlo, and tosa.
Next, we compile the resulting MLIR to an IREE VM flatbuffer:

    import iree_torch

    iree_backend = "llvm-cpu"
    iree_vmfb = iree_torch.compile_to_vmfb(mlir, iree_backend)

Here we choose which backend to target. See IREE Deployment Configurations for a full list of targets.
The generated flatbuffer can now be serialized and stored for another time or loaded and executed immediately.
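
Storing the flatbuffer for later use might look like the sketch below. It assumes that `iree_vmfb` is a bytes-like object, which this guide does not state explicitly. The helper names `save_vmfb` and `read_vmfb` and the file name in the usage comment are hypothetical.

```python
from pathlib import Path


def save_vmfb(vmfb_bytes: bytes, path: str) -> None:
    """Write a compiled flatbuffer to disk (assumes a bytes-like input)."""
    Path(path).write_bytes(vmfb_bytes)


def read_vmfb(path: str) -> bytes:
    """Read a previously stored flatbuffer back into memory."""
    return Path(path).read_bytes()


# Usage sketch (hypothetical file name):
#   save_vmfb(iree_vmfb, "model.vmfb")
#   iree_vmfb = read_vmfb("model.vmfb")
```

Bytes read back this way can be passed to the loading step in place of a freshly compiled flatbuffer, provided the same backend is used for both compiling and loading.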
Next, we load the flatbuffer into the IREE runtime. iree_torch provides a convenience method for loading this flatbuffer:

    invoker = iree_torch.load_vmfb(iree_vmfb, iree_backend)

Finally, we can execute the loaded model on IREE:

    result = invoker.forward(example_input)

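
For convenience, the four steps can be wrapped in a single function. This is a sketch rather than part of `iree_torch`: the helper name `compile_and_run` is hypothetical, and it uses only the calls shown in the steps above.

```python
def compile_and_run(model, example_input, iree_backend="llvm-cpu"):
    """Trace, compile, load, and execute `model` on IREE.

    Hypothetical helper that chains the four steps from this guide.
    """
    # Imported lazily so this module can be imported even when the
    # packages are not installed.
    import torch_mlir
    import iree_torch

    # Step 1: trace the model and lower it to MLIR.
    mlir = torch_mlir.compile(
        model,
        example_input,
        output_type=torch_mlir.OutputType.LINALG_ON_TENSORS,
        use_tracing=True)

    # Step 2: compile the MLIR to an IREE VM flatbuffer.
    iree_vmfb = iree_torch.compile_to_vmfb(mlir, iree_backend)

    # Step 3: load the flatbuffer into the IREE runtime.
    invoker = iree_torch.load_vmfb(iree_vmfb, iree_backend)

    # Step 4: execute the loaded model.
    return invoker.forward(example_input)
```

Note that the same backend string is passed to both the compile and load steps, as in the walkthrough.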
Training with PyTorch in IREE is supported via functorch. Once the model is defined, the steps for loading it into IREE are nearly identical to the above example.
You can find a full end-to-end example of defining a basic regression model, training with it, and running inference on it here.
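
To give a rough idea of what a functorch-style training step looks like before it is compiled, here is a minimal sketch. It is not the linked example: the model, the loss, the learning rate, and the names `compute_loss` and `train_step` are all assumptions chosen for illustration, and the step of compiling `train_step` for IREE is omitted.

```python
import functorch
import torch

# A tiny regression model (illustrative only).
model = torch.nn.Linear(3, 1)

# Split the module into a pure function and an explicit tuple of parameters.
func_model, params = functorch.make_functional(model)


def compute_loss(params, x, y):
    """Mean squared error of the functional model on one batch."""
    prediction = func_model(params, x)
    return torch.nn.functional.mse_loss(prediction, y)


def train_step(params, x, y, lr=0.01):
    """Return updated parameters after one step of gradient descent."""
    # functorch.grad differentiates with respect to the first argument.
    grads = functorch.grad(compute_loss)(params, x, y)
    return tuple(p - lr * g for p, g in zip(params, grads))
```

Because `train_step` takes the parameters as inputs and returns new ones instead of mutating the module, it is a pure function of tensors, which is the form that tracing-based compilation works with.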
| Colab notebooks | |
|---|---|
| Inference on BERT | |

| Example scripts |
|---|
| Basic Inference and Training Example |