# TensorFlow e2e tests

This is a collection of e2e tests that save a TensorFlow model, compile it with
IREE, run it on multiple backends, and crosscheck the results.

## Prerequisites

You will need a TensorFlow 2.0+ nightly installed in your Python environment:
the Python binary in `$PYTHON_BIN` should be able to `import tensorflow` and
that TensorFlow should be version 2.0+. This can be checked with
`tf.__version__`.
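
For example, a minimal check:

```python
import tensorflow as tf

# A nightly 2.x build prints a version string like "2.4.0-dev20200803".
print(tf.__version__)
```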

See [Install TensorFlow with pip](https://www.tensorflow.org/install/pip) for
instructions.

## Vulkan setup

If you do not have your environment set up to use IREE with Vulkan (see
[the doc](../../../docs/vulkan_and_spirv.md)), then you can run the manual test
targets with `--target_backends=tf,iree_vmla,iree_llvmjit` (that is, by omitting
`iree_vulkan` from the list of backends to run the tests on).

The test suites can be run excluding Vulkan by specifying
`--test_tag_filters="-driver=vulkan"` in the `bazel test` invocation, or by
adding `test --test_tag_filters="-driver=vulkan"` to your `user.bazelrc`.
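
For example, to run the main suite (defined below) without Vulkan:

```shell
# Run all passing e2e tests, skipping anything tagged driver=vulkan.
bazel test :e2e_tests --test_tag_filters="-driver=vulkan"
```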

## Compiling `tf.Module`s

Compatible TensorFlow modules can be compiled to specific IREE backends using
`IreeCompiledModule`. This also optionally saves compilation artifacts to a
specified directory. These artifacts include MLIR across various lowerings, a
TensorFlow SavedModel, and the compiled VM FlatBuffer. A basic example of
creating and calling an `IreeCompiledModule` can be found in
[`tf_utils_test.py`](https://github.com/google/iree/blob/main/integrations/tensorflow/bindings/python/pyiree/tf/support/tf_utils_test.py).

When using Keras models or `tf.Module`s with functions that IREE can't compile,
`exported_names` should be specified. For example:

```python
from pyiree.tf.support import tf_utils
vmla_module = tf_utils.IreeCompiledModule(
    module_class=KerasTFModuleClass,
    backend_info=tf_utils.BackendInfo('iree_vmla'),
    exported_names=['predict'])
vmla_module.predict(...)
```

By default, the TensorFlow SavedModels will not be kept. This can be overridden
via the `--keep_saved_model` flag.
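
For example (a sketch; the exact boolean-flag syntax may vary):

```shell
# Keep the TensorFlow SavedModel artifacts after the run.
bazel run :math_test_manual -- --keep_saved_model=true
```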

## Running tests

For locally running tests and iterating on backend development, `bazel run` is
preferred.

```shell
# Run math_test on all backends.
bazel run :math_test_manual

# Run math_test comparing TensorFlow to itself (e.g. to debug randomization).
bazel run :math_test_manual -- --target_backends=tf

# Run math_test comparing the VMLA backend and TensorFlow.
bazel run :math_test_manual -- --target_backends=iree_vmla

# Run math_test comparing the VMLA backend to itself multiple times.
bazel run :math_test_manual -- \
  --reference_backend=iree_vmla --target_backends=iree_vmla,iree_vmla

# Run math_test, showing output only on failure.
bazel test :math_test_manual --test_output=errors

# Run an individual test interactively, streaming its output.
bazel test :math_test_manual --test_output=streamed
```

For reproducibility of the unit tests, `CompiledModule()` sets the random seeds
of `tf`, `numpy`, and `python` by calling `tf_utils.set_random_seed()` before
model creation.
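
If you need the same seeding outside of the test harness, a minimal sketch
(calling `set_random_seed` with no arguments, as the harness does):

```python
from pyiree.tf.support import tf_utils

# Seed the `tf`, `numpy`, and `python` RNGs so that module creation and
# randomly generated inputs are reproducible across runs.
tf_utils.set_random_seed()
```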

## Writing Tests

Our tests use a class `TracedModule` to capture and store all of the inputs and
outputs of a `CompiledModule` in a `Trace`. Each unit test on a `TestCase` uses
the `compare_backends` method. This method runs the function it is passed once
for each reference and target backend, each time with a `TracedModule` wrapping
that backend's module. The inputs and outputs to these modules are then checked
for correctness, using the reference backend as a source of truth. For example:

```python
import numpy as np
from pyiree.tf.support import tf_test_utils
from pyiree.tf.support import tf_utils

# Compile a `tf.Module` named `SimpleArithmeticModule` into a `CompiledModule`.
@tf_test_utils.compile_module(SimpleArithmeticModule)
# Inherit from `TracedModuleTestCase`.
class SimpleArithmeticTest(tf_test_utils.TracedModuleTestCase):

  # Unit test.
  def test_simple_mul(self):

    # Trace function.
    def simple_mul(module):
      # A random seed is automatically set before each call to `simple_mul`.
      a = tf_utils.uniform([4])
      b = np.array([400., 5., 6., 7.], dtype=np.float32)

      # The inputs `a` and `b` are recorded along with the output `c`.
      c = module.simple_mul(a, b)

      # The inputs `a` and `c` are recorded along with the (unnamed) output
      # that `module.simple_mul` returns.
      module.simple_mul(a, c)

    # Calls `simple_mul` once for each backend, recording the inputs and
    # outputs to `module` and then comparing them across backends.
    self.compare_backends(simple_mul)
```
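
For context, the `tf.Module` under test might look like the following sketch.
It is consistent with the `[4]`-element float inputs above, but it is not the
exact contents of `simple_arithmetic_test.py`:

```python
import tensorflow as tf


class SimpleArithmeticModule(tf.Module):

  @tf.function(input_signature=[
      tf.TensorSpec([4], tf.float32),
      tf.TensorSpec([4], tf.float32),
  ])
  def simple_mul(self, a, b):
    # Elementwise multiply; the test harness records the inputs and output.
    return a * b
```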

## Test Suites

Test targets are automatically generated for each test file and for each backend
to check numerical correctness against TensorFlow. Test targets that pass are
placed into the `e2e_tests` test suite. Tests that fail on particular backends
are recorded in lists in the `BUILD` files. For example, if
`experimental_new_test.py` fails on the `iree_llvmjit` and `iree_vulkan`
backends, then the following lines should be added to the `BUILD` file:

```build
LLVM_FAILING = [
    ...
    "experimental_new_test.py",
    ...
]

VULKAN_FAILING = [
    ...
    "experimental_new_test.py",
    ...
]
```

Test targets for these backends are placed into the `e2e_tests_failing` test
suite. Test targets in these test suites can be run as follows:

```shell
# Run all e2e tests that are expected to pass.
bazel test :e2e_tests

# Run all e2e tests that are expected to fail.
bazel test :e2e_tests_failing

# Run a specific failing e2e test target.
# Note that generated test targets are prefixed with their test suite name.
bazel test :e2e_tests_failing_broadcasting_test__tf__iree_vulkan
```

## Debugging tests

If the compiler fails to compile the program, it will create a crash reproducer
(see the [MLIR documentation](https://mlir.llvm.org/docs/WritingAPass/)), which
allows reproducing the bug with an appropriate `opt` tool. Further debugging
iteration can happen in `opt`.

TODO(silvasean): debugging miscompiles

## Test harnesses

### Simple function tests

See `simple_arithmetic_test.py` for some basic examples.