Adding Vulkan sparse binding buffer support for native allocations. (#14536)

This creates one logical VkBuffer that is backed by as many aligned
max-size allocations as required. There's a lot we could tweak and
optimize here, but this initial proof of concept is specifically for
allowing large constant/variable buffers with long lifetimes. Most
implementations don't allow using these buffers with dispatches, though,
due to embarrassingly and arbitrarily small limits on shader storage
buffer access ranges. We'll need device pointers to actually use these,
but at least we can allocate them now.
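
As a rough illustration of "as many aligned max-size allocations as
required", the block arithmetic could be sketched as below. This is not
the code from this change: `sparse_layout_t`, `sparse_layout_compute`,
and the helper names are hypothetical, and the sketch assumes the
alignment is a power of two and that the logical size is padded up to
it. In Vulkan the inputs would typically come from queries such as
`VkPhysicalDeviceMaintenance3Properties::maxMemoryAllocationSize` and
`VkMemoryRequirements::alignment`, but which limits the implementation
actually consults is not stated here.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical layout of one logical buffer split into physical blocks.
typedef struct {
  uint64_t block_size;       // size of every block except possibly the last
  uint64_t block_count;      // number of physical allocations required
  uint64_t last_block_size;  // size of the final (possibly smaller) block
} sparse_layout_t;

// Rounds |value| up to a multiple of |alignment| (must be a power of two).
uint64_t align_up(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Rounds |value| down to a multiple of |alignment| (must be a power of two).
uint64_t align_down(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return value & ~(alignment - 1);
}

// Computes how many aligned, maximally sized blocks are needed to back a
// logical buffer of |logical_size| bytes.
sparse_layout_t sparse_layout_compute(uint64_t logical_size,
                                      uint64_t max_allocation_size,
                                      uint64_t alignment) {
  assert(logical_size > 0);
  // Largest block that fits the allocation limit while keeping every block
  // offset a multiple of the alignment.
  uint64_t block_size = align_down(max_allocation_size, alignment);
  assert(block_size > 0);
  // Assumption: the logical size is padded so the last block is also aligned.
  uint64_t padded_size = align_up(logical_size, alignment);
  sparse_layout_t layout;
  layout.block_size = block_size;
  layout.block_count = (padded_size + block_size - 1) / block_size;
  layout.last_block_size =
      padded_size - (layout.block_count - 1) * block_size;
  return layout;
}
```

For example, a 10 GiB logical buffer with a 4 GiB allocation limit and
64 KiB alignment would need three blocks under these assumptions: two
of 4 GiB and one of 2 GiB. Each block would then be bound to its range
of the logical buffer via sparse binding.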

Future changes will add asynchronous binding and sparse residency as
part of the HAL API so that targets supporting constrained virtual
memory management (CPU, CUDA, Vulkan, etc.) can have such
virtual/physical remapping exposed for use by the compiler. When that's
implemented, the sparse buffer type here will be reworked as a shared
utility implementation using the binding/sparse residency APIs.

In order for this to be used for large constants, host allocation
importing was implemented so that the buffers can be transferred. This
required a change in the HAL APIs exposed to the compiler, as what was
there was a hack that approximated the proper import/mapping path but
was insufficient for doing it properly. This has been tested with
imports of up to 15GB (and should work beyond that, device memory
allowing).

On discrete systems when the module is mmapped we can't import the host
pointer and instead stage in chunks:

![image](https://github.com/openxla/iree/assets/75337/951568e9-5cdb-4a2a-95c1-05a8d371066c)

If not mmapped we can import the host pointer as a staging source and
avoid the chunk allocation:

![image](https://github.com/openxla/iree/assets/75337/2d87982e-e98f-4e4c-a3d0-e226f72717a6)

On unified memory systems we can (sometimes) directly use the host
buffer and avoid all allocations:

![image](https://github.com/openxla/iree/assets/75337/3eb51285-3270-4b7a-a88c-240ca4312287)
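
The three cases above can be read as a small decision function. The
sketch below is only an interpretation of that description, not the
logic from this change: the enum, the function name, and the
`direct_use_supported` flag (standing in for the "sometimes" in the
unified memory case) are all hypothetical. It also assumes that a
unified memory system that cannot use the host buffer directly falls
back to the same rules as a discrete one, which the description does
not state.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical names for the three transfer paths described above.
typedef enum {
  UPLOAD_STAGE_IN_CHUNKS,    // copy through chunk-sized staging allocations
  UPLOAD_IMPORT_AS_STAGING,  // import the host pointer as the staging source
  UPLOAD_USE_HOST_BUFFER,    // use the imported host buffer directly
} upload_strategy_t;

// Picks a transfer path for a large constant buffer.
// |direct_use_supported| models the "(sometimes)" caveat: whether the device
// can use the imported host allocation directly.
upload_strategy_t select_upload_strategy(bool unified_memory,
                                         bool module_mmapped,
                                         bool direct_use_supported) {
  if (unified_memory && direct_use_supported) {
    // No staging and no device allocation needed.
    return UPLOAD_USE_HOST_BUFFER;
  }
  // Assumption: everything else follows the discrete-system rules.
  if (module_mmapped) {
    // The mmapped module contents cannot be imported, so stage in chunks.
    return UPLOAD_STAGE_IN_CHUNKS;
  }
  // Host pointer is importable: use it as the staging source and skip the
  // chunk allocation.
  return UPLOAD_IMPORT_AS_STAGING;
}
```

For instance, `select_upload_strategy(false, true, false)` selects
chunked staging, matching the first case above.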

Progress on #14607.
Fixes #7242.
17 files changed
tree: 79e6ab11af4defc4ea61c5dde3ddaf79454eb828
  1. .devcontainer/
  2. .github/
  3. build_tools/
  4. compiler/
  5. docs/
  6. experimental/
  7. integrations/
  8. lib/
  9. llvm-external-projects/
  10. runtime/
  11. samples/
  12. tests/
  13. third_party/
  14. tools/
  15. .bazel_to_cmake.cfg.py
  16. .bazelignore
  17. .bazelrc
  18. .bazelversion
  19. .clang-format
  20. .dockerignore
  21. .git-blame-ignore-revs
  22. .gitignore
  23. .gitmodules
  24. .yamllint.yml
  25. AUTHORS
  26. BUILD.bazel
  27. CITATION.cff
  28. CMakeLists.txt
  29. configure_bazel.py
  30. CONTRIBUTING.md
  31. LICENSE
  32. README.md
  33. WORKSPACE
README.md

IREE: Intermediate Representation Execution Environment

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.

See our website for project details, user guides, and instructions on building from source.

Project Status

IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. That said, we welcome any kind of feedback on any communication channel!

Communication Channels

Related Project Channels

  • MLIR topic within LLVM Discourse: IREE is enabled by and relies heavily on MLIR, and is sometimes referred to in MLIR discussions. Useful if you are also interested in the evolution of MLIR.

Architecture Overview

IREE Architecture

See our website for more information.

Presentations and Talks

  • 2021-06-09: IREE Runtime Design Tech Talk (recording and slides)
  • 2020-08-20: IREE CodeGen: MLIR Open Design Meeting Presentation (recording and slides)
  • 2020-03-18: Interactive HAL IR Walkthrough (recording)
  • 2020-01-31: End-to-end MLIR Workflow in IREE: MLIR Open Design Meeting Presentation (recording and slides)

License

IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.