commit c4d76f278d52485bc840e2effeaf865d98c36254
author: Ben Vanik <ben.vanik@gmail.com>  Fri Aug 18 12:06:17 2023 -0700
committer: GitHub <noreply@github.com>  Fri Aug 18 12:06:17 2023 -0700
tree: 79e6ab11af4defc4ea61c5dde3ddaf79454eb828
parent: 73ddcce258c9be653be0997b25bef1dcb3d0ddbd
Adding Vulkan sparse binding buffer support for native allocations. (#14536)

This creates one logical VkBuffer that is backed by as many aligned max-size allocations as required. There's a lot we could tweak and optimize here, but the initial proof of concept is specifically for allowing large constant/variable buffers with long lifetimes. Most implementations don't allow using these buffers with dispatches, though, due to embarrassingly and arbitrarily small limits on shader storage buffer access ranges. We'll need device pointers to actually use these, but at least we can allocate them now.

Future changes will add asynchronous binding and sparse residency as part of the HAL API so that targets supporting constrained virtual memory management (CPU, CUDA, Vulkan, etc.) can have such virtual/physical remapping exposed for use by the compiler. When that's implemented, the sparse buffer type here will be reworked as a shared utility implementation using the binding/sparse residency APIs.

In order for this to be used for large constants, host allocation importing was implemented so that the buffers can be transferred. This required a change in the HAL APIs exposed to the compiler, as what was there was a hack to approximate the proper import/mapping path but was insufficient for doing it properly. This has been tested with imports of up to 15GB (and should work beyond that, device memory allowing).

On discrete systems when the module is mmapped we can't import and must stage in chunks. If not mmapped, we can import the host pointer as a staging source and avoid the chunk allocation. On unified memory systems we can (sometimes) directly use the host buffer and avoid all allocations.

Progress on #14607. Fixes #7242.
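The core chunking scheme described above (one logical buffer backed by as many aligned max-size physical allocations as required) can be sketched as follows. This is a minimal illustration, not IREE's actual implementation; the `MAX_ALLOCATION_SIZE` and `ALIGNMENT` constants are hypothetical stand-ins for limits a real backend would query from the device (e.g. Vulkan's sparse binding granularity), and `plan_sparse_backing` is an invented helper name:

```python
# Sketch: carve a large logical buffer into aligned max-size physical
# allocations, as a sparse-binding buffer implementation might.
# Constants are illustrative assumptions, not device-queried limits.

MAX_ALLOCATION_SIZE = 4 * 1024 * 1024 * 1024  # hypothetical per-allocation cap
ALIGNMENT = 64 * 1024  # hypothetical sparse binding granularity


def plan_sparse_backing(total_size: int) -> list[tuple[int, int]]:
    """Returns (offset, size) spans of the physical allocations that
    back a logical buffer of total_size bytes."""
    # Round the chunk size down to the alignment so every binding
    # offset stays aligned; the final chunk covers the remainder.
    chunk_size = (MAX_ALLOCATION_SIZE // ALIGNMENT) * ALIGNMENT
    spans = []
    offset = 0
    while offset < total_size:
        size = min(chunk_size, total_size - offset)
        spans.append((offset, size))
        offset += size
    return spans
```

For example, a 15GiB logical buffer (matching the size tested in this change) would be planned as four physical allocations under these assumed limits: three full 4GiB chunks and one 3GiB remainder, each bound at a 64KiB-aligned offset.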
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback through any of our communication channels!
See our website for more information.
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.