This is a collection of e2e tests that save a TensorFlow model, compile it with IREE, run it on multiple backends, and cross-check the results.
You will need a TensorFlow 2.0+ nightly installed in your Python environment: the Python binary in $PYTHON_BIN should be able to import tensorflow, and the imported TensorFlow should be version 2.0+. This can be checked with tensorflow.version.VERSION.
See Install TensorFlow with pip for instructions.
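The version requirement above can be checked with a short script. This is a minimal sketch, not part of the test suite: the helper name is_tf2_or_newer is hypothetical, and the script assumes it is run with the same Python binary that $PYTHON_BIN points to.

```python
import re


def is_tf2_or_newer(version: str) -> bool:
    """Return True if a TensorFlow version string names a 2.0+ release.

    Hypothetical helper for illustration. Only the leading "major.minor"
    part is inspected, so nightly suffixes such as "-dev20200601" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)) >= 2


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not importable from this Python binary")
    else:
        # tf.version.VERSION is the installed version as a string.
        print(tf.version.VERSION, is_tf2_or_newer(tf.version.VERSION))
```

If the script reports that tensorflow is not importable, the tests will not be able to run with that Python binary.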
If you do not have your environment set up to use IREE with Vulkan (see the doc), you can run the tests with IREE_AVAILABLE_BACKENDS=tf,iree_vmla,iree_llvmjit (that is, by omitting iree_vulkan from the list of available backends).
    # For locally running tests and iterating on backend development,
    # `bazel run` is preferred.
    bazel run :math_test_manual -- --override_backends=iree_vmla

    # Same as above, but add `tf` backend to cross-check numerical correctness.
    bazel run :math_test_manual -- --override_backends=tf,iree_vmla

    # Run all tests with defaults and output on failure.
    bazel test ... --test_output=errors

    # Run an individual test interactively.
    bazel run :math_test_manual -- --test_output=streamed
If you specify the same backend multiple times, for example --override_backends=iree_vmla,iree_vmla, the duplicates are grouped, so in this example iree_vmla runs only once. If you specify tf,iree_vmla as backends, then we will test both backends and compare them with each other. If you specify only the tf backend, then we will also test tf vs tf to catch any model initialization or randomization issues (a special case for debugging purposes). For reproducibility of the unit tests, we set the random seeds of tf and numpy by calling tf_test_utils.set_random_seed() before model creation.
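The grouping rules above can be sketched as a small function. This is an illustration under stated assumptions, not the code in tf_test_utils: the function name backends_to_run is hypothetical, and it only models the three cases described in the text.

```python
def backends_to_run(override_backends: str) -> list:
    """Hypothetical sketch of how an --override_backends value is interpreted.

    - Duplicate backends are grouped, so each backend runs once.
    - A lone "tf" is a special case: tf is compared against itself.
    """
    names = [b.strip() for b in override_backends.split(",") if b.strip()]
    # dict.fromkeys drops duplicates while preserving the order given.
    unique = list(dict.fromkeys(names))
    if unique == ["tf"]:
        # tf vs tf, to surface initialization/randomization differences.
        return ["tf", "tf"]
    return unique
```

For example, backends_to_run("iree_vmla,iree_vmla") yields a single iree_vmla entry, while backends_to_run("tf") yields two tf entries so the two runs can be compared.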
Test targets are automatically generated for each test file and for each backend to check numerical correctness against TensorFlow. Test targets that pass are placed into the e2e_tests test suite. Tests that fail on particular backends are recorded in lists in the BUILD files. For example, if experimental_new_test.py fails on the iree_llvmjit and iree_vulkan backends, then the following lines should be added to the BUILD file:
    LLVM_FAILING = [
        ...
        "experimental_new_test.py",
        ...
    ]

    VULKAN_FAILING = [
        ...
        "experimental_new_test.py",
        ...
    ]
Test targets for these backends are placed into the e2e_tests_failing test suite. Test targets in these test suites can be run as follows:
    # Run all e2e tests that are expected to pass.
    bazel test :e2e_tests

    # Run all e2e tests that are expected to fail.
    bazel test :e2e_tests_failing

    # Run a specific failing e2e test target.
    # Note that generated test targets are prefixed with their test suite name.
    bazel test :e2e_tests_failing_broadcasting_test__tf__iree_vulkan
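To make the relationship between the failing lists and the generated target names concrete, here is a rough Python sketch. It is not the actual BUILD macro: the naming pattern is inferred from the single example target name above, and the function name generated_target and the FAILING_BY_BACKEND mapping (including the empty list for iree_vmla) are hypothetical.

```python
import os

# Lists as they would appear in the BUILD file (contents illustrative).
LLVM_FAILING = ["experimental_new_test.py"]
VULKAN_FAILING = ["experimental_new_test.py"]

# Assumption: each failing list corresponds to one backend.
FAILING_BY_BACKEND = {
    "iree_vmla": [],
    "iree_llvmjit": LLVM_FAILING,
    "iree_vulkan": VULKAN_FAILING,
}


def generated_target(test_file: str, backend: str) -> str:
    """Return the assumed name of the generated target for a test/backend pair."""
    failing = test_file in FAILING_BY_BACKEND.get(backend, [])
    suite = "e2e_tests_failing" if failing else "e2e_tests"
    stem, _ = os.path.splitext(test_file)
    # Targets are prefixed with their suite name and compare tf to the backend.
    return f"{suite}_{stem}__tf__{backend}"
```

Under these assumptions, a test listed in VULKAN_FAILING gets a target in e2e_tests_failing for iree_vulkan, while its iree_vmla target stays in e2e_tests.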
If the compiler fails to compile the program, then it will create a crash reproducer (see MLIR documentation), which then allows reproducing the bug with an appropriate “opt” tool. Further debugging iteration can happen in opt.
TODO(silvasean): debugging miscompiles
See simple_arithmetic_test.py for some basic examples.
The BUILD file specifies which targets work on which backends, and it controls which backends each test runs on by using the --override_backends flag.
The @tf_test_utils.compile_modules decorator on tests also takes a backends= keyword argument. Many tests still specify this, but it is ignored in the CI, which runs with bazel test. When running with bazel run, this indicates the set of backends to use in the absence of the --override_backends flag (and accepts the same arguments).
Example:
    @tf_test_utils.compile_modules(backends=["tf"], mlp=(Mlp, ["predict"]))
    class DynamicMlpTest(tf_test_utils.SavedModelTestCase):
      ... the test case ...
Limiting backends is useful for tests that are known to fail on certain backends but are still useful to have checked in.
The priority order for which backends are ultimately used is:
1. The backends specified in --override_backends.
2. The backends specified in the IREE_OVERRIDE_BACKENDS environment variable.
3. The backends specified in the tf_test_utils.compile_modules decorator.
4. All known backends.
Additionally, the environment variable IREE_AVAILABLE_BACKENDS specifies which backends should be considered available in a particular environment. Once the list of backends above is formed, any backends not listed in IREE_AVAILABLE_BACKENDS are removed. This is the final list of backends which are run for the test.
If IREE_AVAILABLE_BACKENDS is not provided, all known backends are considered available.
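The selection logic described above can be summarized in a few lines of Python. This is a sketch under stated assumptions rather than the implementation in tf_test_utils: the function resolve_backends and its parameters are hypothetical, the list of known backends is taken from the names used in this document, and an empty environment variable is treated the same as an unset one.

```python
import os
import warnings

# Assumption: the known backends are the four named in this document.
ALL_KNOWN_BACKENDS = ["tf", "iree_vmla", "iree_llvmjit", "iree_vulkan"]


def _split(value):
    """Parse a comma-separated backend list; return None if empty or unset."""
    if not value:
        return None
    return [b.strip() for b in value.split(",") if b.strip()] or None


def resolve_backends(flag_value=None, decorator_backends=None, environ=None):
    """Hypothetical sketch of the backend priority order and availability mask."""
    environ = os.environ if environ is None else environ
    # Priority: flag, then environment override, then decorator, then all known.
    candidates = (
        _split(flag_value)
        or _split(environ.get("IREE_OVERRIDE_BACKENDS"))
        or (list(decorator_backends) if decorator_backends else None)
        or list(ALL_KNOWN_BACKENDS)
    )
    # Backends not listed as available are removed from the final list.
    available = _split(environ.get("IREE_AVAILABLE_BACKENDS")) or ALL_KNOWN_BACKENDS
    selected = [b for b in candidates if b in available]
    if not selected:
        warnings.warn("all requested backends were masked off; nothing will run")
    return selected
```

For example, with IREE_AVAILABLE_BACKENDS=tf,iree_vmla,iree_llvmjit and --override_backends=tf,iree_vulkan, only tf remains in the final list.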
TODO(silvasean): IREE_AVAILABLE_BACKENDS is mainly to allow masking off the Vulkan backend in environments where it is not available. Currently, the behavior when all backends get masked off is to emit a warning, which can result in spuriously “passing” tests. This is only an issue for tests that currently only run on Vulkan (which should decrease over time as e.g. VMLA gets more coverage).