blob: c9360a183b2628b23c8e009f3e97fff70d27d248 [file] [log] [blame] [view]
# IREE Repository Management
Due to the process by which we synchronize this GitHub project with our internal
Google source code repository, there are some oddities in our workflows and
processes. We aim to minimize these, and especially to minimize their impact on
external contributors, but they are documented here for clarity and
transparency. If any of these things are particularly troublesome or painful for
your workflow, please reach out to us so we can prioritize a fix.
## Branches
The default branch is called `main`. PRs should be sent there. We also have a
`google` branch that is synced from the Google internal source repository. This
branch is merged from and to `main` frequently to upstream any changes coming
from `google` and as part of updating the internal source repository. For the
most part this integration with `google` should have minimal effect on normal
developer workflows.
## Dependencies
As a project which brings together compiler, runtime and graphics systems,
dependency management is somewhat complex. We use git submodules for C++
dependencies and, where possible, language specific external package management
for other types of dependencies (e.g. Python).
In addition, dependencies are managed entirely different in the Google internal
source code repository. This imposes constraints on repository management tasks
that may not be obvious.
The git submodule state is not accessible to the tooling that moves the IREE
source between GitHub and Google's source repository. We therefore mirror this
state in a special `SUBMODULE_VERSIONS.txt` file. This file is considered the
source of truth for the correct submodule state and is what is used by the CI.
When updating the submodule state from Google's source repository, only this
file is updated and another
[GitHub Actions workflow](https://github.com/google/iree/blob/main/.github/workflows/synchronize_submodules.yml)
immediately commits a submodule update on top of that.
Shortcut commands (read below for full documentation):
```shell
# Update SUBMODULE_VERSIONS.txt from current git submodule pointers
$ ./scripts/git/submodule_versions.py export
# Update current git submodule pointers based on SUBMODULE_VERSIONS.txt
$ ./scripts/git/submodule_versions.py import
```
### The special relationship with LLVM and TensorFlow
TL;DR: Dependency management with TensorFlow and LLVM are managed as part of the
integration with Google's internal source repository and updates will be picked
up by the `main` branch at an approximately daily cadence when merging in the
`google` branch.
Currently, the two most challenging projects to manage as dependencies are
TensorFlow and LLVM. Both are typically pinned to specific versions that are
integrated in the Google source repository at a cadence up to many times per
day. Additionally, TensorFlow itself depends on LLVM. We use BUILD files in the
utils/bazel directory in the llvm-project repository. Just to make it more
interesting, since TensorFlow does not ship with CMakeLists, IREE uses overlay
CMakeLists.txt to build subsets of TensorFlow needed for the compiler (when
built with CMake). While these externally managed build files are written to be
moderately generic, they can and do break and require manual intervention at
times (i.e. there is no guarantee that updating to a new commit of either will
not require some manual work on the build files).
The only combination which is expected to work is the llvm-project commit noted
in the `LLVM_COMMIT` setting in
`third_party/tensorflow/tensorflow/workspace.bzl`. In reality, it is often
possible (especially with the CMake build) to use a newer llvm-project commit,
and private development forks should feel free to do this. However, to submit
upstream, we will need to integrate this newer commit in the internal Google
source of truth prior to accepting code that depends on it.
LLVM is integrated into Google's source repository at a cadence up to several
times a day. The LLVM submodule version used by IREE is also updated as part of
this integration. When this necessitates updating IREE to changing LLVM and MLIR
APIs, these updates are performed atomically as part of the integration. However
the changes performed here only guarantee that the internal build is not broken,
which means that such integrations may very well break the OSS build on the
`google` branch (in particular, it does not update the LLVM BUILD files). The
IREE build cop will fix the breakage on the `google` branch. We are continuing
to work on improving this situation.
### Tasks:
#### Adding dependencies
In general, adding dependencies will require coordination with Google engineers
to make the dependency available both on GitHub and in the Google source
repository. It is important to ask on the mailing list prior to expecting to
contribute such changes.
#### Pushing dependency changes
When working on a development branch, feel free to stage changes however makes
sense. However, when sending a PR, note that our integration systems will
overwrite the submodule version updates from the `SUBMODULE_VERSIONS.txt` file in
the repository root. Here is an example:
```text
309de5988eb949a27e077a24a1d83c0687d10d57 third_party/benchmark
97f3aa91746a7d207513a73725e92cee7c35bb87 third_party/flatbuffers
f2fb48c3b3d79a75a88a99fba6576b25d42ec528 third_party/googletest
a21beccea2020f950845cbb68db663d0737e174c third_party/llvm-project
80d452484c5409444b0ec19383faa84bb7a4d351 third_party/pybind11
b252a50953ac4375cb1864e94f4b0234db9d215d third_party/spirv_headers
6652f0b6428777b5a4a3d191cc30d8b31366b999 third_party/swiftshader
9b32b2db1142166b190ea30757f48d8dd0fb00e3 third_party/tensorflow
ba091ba6a947f79623b28fe8bfccdce1ab9fa467 third_party/vulkan_headers
909f36b714c9239ee0b112a321220213a474ba53 third_party/vulkan_memory_allocator
```
If bumping versions, you must include an update to this file with your commit.
To generate it, run:
```shell
# Performs a submodule sync+update and stages an updated SUBMODULE_VERSIONS.txt
# file.
$ ./scripts/git/submodule_versions.py export
```
If you don't know if this is required, you may run:
```shell
# The check command is intended to eventually be usable as a git hook
# for verification of consistency between SUBMODULE_VERSIONS.txt and the
# corresponding local git state.
$ ./scripts/git/submodule_versions.py check
```
#### Pulling dependency changes
If you pull a change to `SUBMODULE_VERSIONS.txt` it is necessary to import it into
the current git state:
```shell
# Updates the commit hash of any entries in SUBMODULE_VERSIONS.txt that differ
# and stages the changes.
$ ./scripts/git/submodule_versions.py import
```
This will stage any needed changes to the submodules to bring them up to date
with the `SUBMODULE_VERSIONS.txt`. If you have local changes, you may get conflicts
on the `submodule update` step at the end, and you will need to manually resolve
as usual.
Any CI systems or other things that retrieve an arbitrary commit should invoke
this prior to running just to make sure that their git view of the submodule
state is consistent.