PyTorch Integration

IREE supports compiling and running PyTorch programs represented as nn.Module classes as well as models defined using functorch.

Prerequisites

Install the IREE pip packages, either from PyPI or by building them from source:

pip install \
  iree-compiler \
  iree-runtime

Install torch-mlir, which is necessary for compiling PyTorch models to a format IREE can execute:

pip install -f https://llvm.github.io/torch-mlir/package-index/ torch-mlir

A special iree_torch package is available to make it easy to compile PyTorch programs and run them on IREE:

pip install git+https://github.com/iree-org/iree-torch.git

Running a model

Going from a loaded PyTorch model to one that's executing on IREE happens in four steps:

  1. Compile the model to MLIR
  2. Compile the MLIR to IREE VM flatbuffer
  3. Load the VM flatbuffer into IREE
  4. Execute the model via IREE

!!! note
    In the following steps, we'll be borrowing the model from this BERT colab and assuming it is available as model.

Compile the model to MLIR

First, we need to trace and compile our model to MLIR:

import torch_mlir

model = ...          # the model we're compiling
example_input = ...  # an input to the model with the expected shape and dtype
mlir = torch_mlir.compile(
    model,
    example_input,
    output_type=torch_mlir.OutputType.LINALG_ON_TENSORS,
    use_tracing=True)

The full list of available output types can be found here and includes linalg-on-tensors, tosa, and mhlo.

Compile the MLIR to IREE VM flatbuffer

Next, we compile the resulting MLIR to an IREE VM flatbuffer:

import iree_torch

iree_backend = "llvm-cpu"
iree_vmfb = iree_torch.compile_to_vmfb(mlir, iree_backend)

Here we choose the backend we want to target. See IREE Deployment Configurations for the full list of targets.

The generated flatbuffer can now be serialized and stored for later use, or loaded and executed immediately.
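For instance, persisting the flatbuffer to disk and reading it back can be done with ordinary file I/O. This is a minimal sketch assuming compile_to_vmfb returns the flatbuffer contents as bytes; a placeholder byte string stands in for the real compiler output:

```python
# A sketch of saving and reloading the compiled flatbuffer.
# The placeholder bytes stand in for iree_torch.compile_to_vmfb(mlir, iree_backend).
iree_vmfb = b"\x00placeholder-flatbuffer-contents"

# Serialize to disk for later use.
with open("model.vmfb", "wb") as f:
    f.write(iree_vmfb)

# ... later, read it back before handing it to the runtime.
with open("model.vmfb", "rb") as f:
    reloaded_vmfb = f.read()

assert reloaded_vmfb == iree_vmfb
```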

Load the VM flatbuffer into IREE

Next, we load the flatbuffer into the IREE runtime. iree_torch provides a convenience method for loading this flatbuffer:

invoker = iree_torch.load_vmfb(iree_vmfb, iree_backend)

Execute the model via IREE

Finally, we can execute the loaded model on IREE:

result = invoker.forward(example_input)
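A common sanity check at this point is to compare IREE's output against the original PyTorch model's output. Below is a minimal sketch using NumPy, with stand-in arrays in place of the real model outputs; the tolerance value is an illustrative choice, not a requirement:

```python
import numpy as np

# Stand-ins for the real outputs: in practice these would come from
# model(example_input).detach().numpy() and invoker.forward(example_input).
pytorch_result = np.array([0.12, 0.88], dtype=np.float32)
iree_result = np.array([0.12, 0.88], dtype=np.float32)

# Small numerical differences between backends are expected, so compare
# with a tolerance rather than exact equality.
outputs_match = np.allclose(pytorch_result, iree_result, atol=1e-5)
```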

Training

Training with PyTorch in IREE is supported via functorch. Once the model is defined, the steps for loading it into IREE are nearly identical to the inference example above.

You can find a full end-to-end example of defining a basic regression model, training with it, and running inference on it here.

Samples

Colab notebooks
Inference on BERT (Open in Colab)
Example scripts
Basic Inference and Training Example