| # IREE Buildkite Automation |
| |
| ## Status |
| |
| We are in the process of migrating our automation infrastructure to |
| [Buildkite](https://buildkite.com). `samples.yml` and everything under the |
| `cmake/` directory are legacy pipelines from our previous ad hoc usage. This |
| document describes everything else, which is still in development and can |
| generally be ignored for now. |
| |
| ## General Setup |
| |
| All IREE automation pipelines are moving to Buildkite. The jobs in these |
| pipelines should avoid duplicating work and instead pass artifacts between |
| machines. The workhorse of this automation is a collection of x86-64 Linux build |
| machines (initially GCE VMs but probably moving to Kubernetes containers) which |
| perform all direct and cross compilation that is feasible. In cases where we |
| want to ensure compilation on a platform, not just for it (and/or |
| cross-compiling for it on Linux is difficult), we will also run build machines |
| of that platform (e.g. Windows). |
| |
| Built artifacts will be farmed out to testing machines (some machines may do |
| double duty as build and test), using Docker images and emulators where useful. |
| Bazel builds will use |
| [Bazel remote caching](https://bazel.build/docs/remote-caching), but not remote |
| execution, which we have found to be prohibitively expensive to configure and |
| maintain. CMake builds will initially be uncached, then move first to local |
| caches (ccache) and later to remote caching and/or distributed builds (more |
| research needed, but e.g. sccache or distcc). |
| |
| Orchestration agents will take care of light tasks like uploading Buildkite |
| pipelines and polling for bigger jobs to finish. These will be run on |
| minimally sized machines (initially e2-micro GCE VMs, but likely moving to |
| Kubernetes containers). |
| |
| ## Presubmits |
| |
| There is a strict separation between machines and caches that act on code |
| that has been submitted to the IREE repository and those that act on code |
| coming from third-party forks. Buildkite agents are tagged with |
| `security: "submitted"` or `security: "presubmit"` to indicate the class of code |
| that they run. Agents running unsubmitted code may have read access to artifacts |
| (e.g. build cache entries) generated by agents running submitted code, but not |
| write access. |
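| |
| As a rough illustration of the tagging scheme above, the following Python |
| sketch builds a Buildkite command step that targets agents by their `security` |
| tag and picks Bazel remote cache flags so that presubmit builds do not upload |
| results. The function names, the cache URL, and the `bazel test //...` command |
| are hypothetical and not taken from the IREE scripts. `--remote_cache` and |
| `--remote_upload_local_results` are standard Bazel flags, but flags alone do not |
| enforce read-only access; that would still need to be enforced by the cache's |
| own credentials. |
| |
```python
import json

# Hypothetical cache endpoint; the real location is not given in this document.
CACHE_URL = "https://example.com/bazel-cache"


def bazel_cache_flags(security):
    """Return Bazel remote cache flags for an agent security class."""
    if security not in ("submitted", "presubmit"):
        raise ValueError(f"unknown security class: {security!r}")
    flags = [f"--remote_cache={CACHE_URL}"]
    if security == "presubmit":
        # Presubmit agents may read cache entries but should not write them.
        flags.append("--remote_upload_local_results=false")
    return flags


def build_step(security):
    """Build a Buildkite command step targeted at agents with the given tag."""
    command = " ".join(["bazel", "test", "//..."] + bazel_cache_flags(security))
    return {
        "label": f"build ({security})",
        "command": command,
        # Buildkite only schedules the step on agents whose tags match.
        "agents": {"security": security},
    }


if __name__ == "__main__":
    # Buildkite accepts pipelines as JSON as well as YAML.
    print(json.dumps({"steps": [build_step("presubmit")]}, indent=2))
```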
| |
| ### Vetting Unsubmitted Code |
| |
| There's no getting around that presubmit testing is remote code execution (even |
| if there's no confidential information accessible by those remote executors), |
| and we don't want to give malicious or nuisance actors free compute. |
| At the same time, we want to have minimal friction for new contributors, and |
| especially for routine contributors, *independent of what company they work |
| for*. As a proxy for "real person who actually wants to contribute to the |
| project" we use having signed the |
| [Google Contributor License Agreement (CLA)](https://cla.developers.google.com), |
| which is already required to contribute to the project. This is checked using |
| already-checked-in scripts and configs before running anything using the code |
| from a pull request. Because CLA signing can sometimes create a roadblock |
| (especially in the case of corporate CLAs, which require getting lawyers |
| involved), if the CLA check fails, a |
| [block step](https://buildkite.com/docs/pipelines/block-step) is inserted in the |
| pipeline, allowing members of the IREE Buildkite org to unblock the runs |
| manually in the meantime. Additionally, we block any bad actors that crop up |
| using Buildkite |
| [conditional filtering](https://buildkite.com/docs/pipelines/conditionals#conditionals-in-pipelines) |
| to stop builds from triggering on their PRs at all. IREE Buildkite organization |
| admins can |
| [update those options here](https://buildkite.com/iree/presubmit/settings/repository#:~:text=Filter%20builds%20using%20a%20conditional). |
| |
| The [Presubmit pipeline](https://buildkite.com/iree/presubmit) runs on all PRs |
| sent to the main repository. The basic flow is that |
| [presubmit_bootstrap.yml](presubmit_bootstrap.yml) fetches and uploads |
| [presubmit.yml](presubmit.yml) from the main branch of the repository, which |
| similarly fetches [check_cla.py](check_cla.py) from the main branch and checks |
| whether the CLA check has passed on the target commit. If it hasn't, it inserts |
| a block step that prevents further execution until an authorized person in the |
| IREE Buildkite organization unblocks it. Subsequent steps fetch |
| [wait_for_pipeline_success.py](wait_for_pipeline_success.py) from the main |
| branch and use it to trigger and wait for other pipelines to execute on the PR |
| commit. |
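| |
| The gating logic in this flow can be sketched roughly as follows. This is a |
| minimal illustration, not the contents of `check_cla.py` or `presubmit.yml`: |
| the function names, the block step wording, and the command-line form used for |
| `wait_for_pipeline_success.py` are assumptions. It relies only on the |
| documented Buildkite behavior that steps after a block step do not run until |
| the block is released. |
| |
```python
import json


def cla_gate_steps(cla_passed):
    """Return the steps to place ahead of anything that runs PR code."""
    if cla_passed:
        return []
    # A block step pauses the build until someone with access unblocks it.
    return [
        {
            "block": "CLA check failed",
            "prompt": "Unblock only after vetting this pull request.",
        }
    ]


def presubmit_pipeline(cla_passed, pipelines):
    """Assemble the gate (if needed) followed by one step per pipeline."""
    steps = cla_gate_steps(cla_passed)
    for name in pipelines:
        steps.append(
            {
                "label": f"Run {name}",
                # Hypothetical invocation; the script's real arguments are
                # not described in this document.
                "command": f"python3 wait_for_pipeline_success.py {name}",
            }
        )
    return {"steps": steps}


if __name__ == "__main__":
    print(json.dumps(presubmit_pipeline(False, ["samples"]), indent=2))
```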
| |
| ## Postsubmits |
| |
| Since it doesn't have to deal with potentially untrusted code, the |
| [Postsubmit pipeline](https://buildkite.com/iree/postsubmit) is much simpler. It |
| triggers on each commit to the `main` branch. |
| [postsubmit_bootstrap.yml](postsubmit_bootstrap.yml) fetches the main branch and |
| uploads [postsubmit.yml](postsubmit.yml), which triggers and waits for all the |
| specified pipelines to complete on the target commit. |
| |
| ## Triggering Multiple Runs |
| |
| Both the presubmit and postsubmit pipelines are designed to be idempotent. |
| Triggering them again on the same commit will not trigger any new builds, only |
| orchestration pipelines. This is one of the reasons we use our own script rather |
| than a Buildkite trigger step. |
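| |
| One way to get this kind of idempotency is to look for an existing build of |
| the same commit before creating a new one. The sketch below shows that idea |
| only; it is an assumption about the approach, not a description of what the |
| IREE scripts do. The function names and the set of reusable states are |
| hypothetical, while `commit`, `state`, and `number` are fields of build objects |
| as returned by the Buildkite REST API. |
| |
```python
# Build states that make a new build unnecessary (an assumption of this sketch).
REUSABLE_STATES = {"scheduled", "running", "passed"}


def find_reusable_build(builds, commit):
    """Return the first existing build for `commit` that can be reused, or None."""
    for build in builds:
        if build["commit"] == commit and build["state"] in REUSABLE_STATES:
            return build
    return None


def get_or_create_build(builds, commit, create_build):
    """Reuse a matching build if there is one; otherwise create a new build.

    `builds` is a list of build dicts for the target pipeline and
    `create_build` is a callable that starts a build for a commit. Both are
    injected so the logic can be exercised without network access.
    """
    existing = find_reusable_build(builds, commit)
    if existing is not None:
        return existing
    return create_build(commit)
```
| |
| Called twice with the same commit and an up-to-date `builds` list, the second |
| call returns the build started by the first instead of creating another. |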