OpenTitan is not a closed ecosystem: we incorporate code from third parties, and we split out pieces of our code to reach a wider audience. In both cases, we need to import and use code from external repositories in our OpenTitan code base. Read on for step-by-step instructions for common tasks, and for background information on the topic.
Code in subdirectories of hw/vendor is imported (copied in) from external repositories (which may be provided by lowRISC or other sources). The external repository is called “upstream”. Any development on code imported into hw/vendor should happen upstream when possible. Files ending with .vendor.hjson indicate where the upstream repository is located.
In particular, this means that once a change is part of the upstream repository, the vendor_hw tool can be used to copy the upstream code back into our OpenTitan repository.
Read on for the longer version of these guidelines.
Pushing changes upstream first isn't always possible or desirable: upstream might not accept changes, or be slow to respond. In some cases, code changes are needed which are irrelevant for upstream and need to be maintained by us. Our vendoring infrastructure is able to handle such cases; read on for more information on how to do it.
OpenTitan is developed in a “monorepo”, a single repository containing all its source code. This approach is beneficial for many reasons, ranging from an easier workflow to better reproducibility of the results, and that's why large companies like Google and Facebook are using monorepos. Monorepos are even more compelling for hardware development, which cannot make use of a standardized language-specific package manager like npm or pip.
At the same time, open source is all about sharing and a free flow of code between projects. We want to take in code from others, but also to give back and grow a wider ecosystem around our output. To be able to do that, code repositories should be sufficiently modular and self-contained. For example, if a CPU core is buried deep in a repository containing a full SoC design, people will have a hard time using this CPU core for their designs and contributing to it.
Our approach to this challenge: develop reusable parts of our code base in an external repository, and copy the source code back into our monorepo in an automated way. The process of copying in external code is commonly called “vendoring”.
Vendoring code is a good thing. We continue to maintain a single code base which is easy to fork, tag and generally work with, as all the normal Git tooling works. By explicitly importing code we also ensure that no unreviewed code sneaks into our code base, and an “always buildable” configuration is maintained.
But what happens if the imported code needs to be modified? Ideally, all code changes are submitted upstream, integrated into the upstream code base, and then re-imported into our code base. This development methodology is called “upstream first”. History has shown repeatedly that an upstream first policy can help significantly with the long-term maintenance of code.
However, strictly following an upstream first policy isn't great either. Some changes might not be useful for the upstream community; others might not be acceptable upstream, or might only be applied after a long delay. In these situations it must be possible to modify the code downstream, i.e. in our repository, as well. Our setup includes multiple options to achieve this goal. In many cases, applying patches on top of the imported code is the most sustainable option.
To ease the pain points of vendoring code we have developed tooling and continue to do so. Please open an issue ticket if you see areas where the tooling could be improved.
This section gives a quick overview of how we include code from other repositories in our repository.
All imported (“vendored”) hardware code is by convention put into the hw/vendor directory. (We have more conventions for file and directory names, which are discussed below when the import of new code is described.) To interact with code in this directory a tool called vendor_hw is used, which can be found in util/vendor_hw.py. A “vendor description file” controls the vendoring process and serves as input to the vendor_hw tool.
In the simple, yet typical, case, the vendor description file is only a couple of lines of human-readable JSON:
    $ cat hw/vendor/lowrisc_ibex.vendor.hjson
    {
      name: "lowrisc_ibex",
      target_dir: "lowrisc_ibex",

      upstream: {
        url: "https://github.com/lowRISC/ibex.git",
        rev: "master",
      },
    }
This description file essentially says: We vendor a component called “lowrisc_ibex” and place the code into the “lowrisc_ibex” directory (relative to the description file). The code comes from the master branch of the Git repository found at https://github.com/lowRISC/ibex.git.
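To make the meaning of these fields more tangible, here is a rough manual equivalent of an import, written as a shell function. It is only a sketch under assumptions: manual_import is a hypothetical helper and not how vendor_hw is implemented, and it leaves out everything else the tool does (exclusions, patches, the lock file).

```shell
# Hypothetical sketch (not the vendor_hw implementation): copy the tree of
# revision REV from the Git repository at URL into TARGET_DIR and print the
# commit hash that was imported.
# Usage: manual_import URL REV TARGET_DIR
manual_import() {
  url=$1 rev=$2 target_dir=$3
  work_dir=$(mktemp -d) || return 1
  git clone --quiet "$url" "$work_dir/upstream" || return 1
  git -C "$work_dir/upstream" checkout --quiet "$rev" || return 1
  mkdir -p "$target_dir"
  # "git archive" exports the checked-out tree without the .git directory.
  git -C "$work_dir/upstream" archive --format=tar HEAD | tar -xf - -C "$target_dir"
  # Print the exact commit that was copied.
  git -C "$work_dir/upstream" rev-parse HEAD
  rm -rf "$work_dir"
}

# Example usage, with the values from the description file above:
#   manual_import https://github.com/lowRISC/ibex.git master hw/vendor/lowrisc_ibex
```

The three arguments correspond to upstream.url, upstream.rev and target_dir in the description file.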
With this description file written, the vendor_hw tool can do its job.
    $ cd $REPO_TOP
    $ ./util/vendor_hw.py hw/vendor/lowrisc_ibex.vendor.hjson --verbose
    INFO: Cloning upstream repository https://github.com/lowRISC/ibex.git @ master
    INFO: Cloned at revision 7728b7b6f2318fb4078945570a55af31ee77537a
    INFO: Copying upstream sources to /home/philipp/src/opentitan/hw/vendor/lowrisc_ibex
    INFO: Changes since the last import:
    * Typo fix in muldiv: Reminder->Remainder (Stefan Wallentowitz)
    INFO: Wrote lock file /home/philipp/src/opentitan/hw/vendor/lowrisc_ibex.lock.hjson
    INFO: Import finished
Looking at the output, you might wonder: how did the vendor_hw tool know what changed since the last import? It knows because it records the commit hash of the last import in a file called the “lock file”. This file is stored alongside the .vendor.hjson file and has the same base name, but with the suffix .lock.hjson.
In the example above, it looks roughly like this:
    $ cat hw/vendor/lowrisc_ibex.lock.hjson
    {
      upstream:
      {
        url: https://github.com/lowRISC/ibex.git
        rev: 7728b7b6f2318fb4078945570a55af31ee77537a
      }
    }
The lock file should be committed together with the code itself to make the import step reproducible at any time.
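If the locked revision is needed in a script, it can be read from the lock file with standard shell tools. The following is a minimal sketch: locked_rev is a hypothetical helper, not part of vendor_hw, and it assumes that the rev: key sits on a line of its own in the lock file.

```shell
# Hypothetical helper (not part of vendor_hw): print the revision recorded
# in a lock file. Assumes the "rev:" key sits on its own line.
locked_rev() {
  sed -n 's/^[[:space:]]*rev:[[:space:]]*//p' "$1" | head -n 1
}

# Example usage:
#   locked_rev hw/vendor/lowrisc_ibex.lock.hjson
```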
After running vendor_hw, the code in your local working copy is updated to the latest upstream version. Next is testing: run simulations, syntheses, or other tests to ensure that the new code works as expected. Once you're confident that the new code is good to be committed, do so using the normal Git commands.
    $ cd $REPO_TOP
    $ # Stage all files in the vendored directory
    $ git add -A hw/vendor/lowrisc_ibex
    $ # Stage the lock file as well
    $ git add hw/vendor/lowrisc_ibex.lock.hjson
    $ # Now commit everything. Don't forget to write a useful commit message!
    $ git commit
Instead of running vendor_hw first, and then manually creating a Git commit, you can also use the --commit flag.
    $ cd $REPO_TOP
    $ ./util/vendor_hw.py hw/vendor/lowrisc_ibex.vendor.hjson --verbose --commit
This command updates the “lowrisc_ibex” code, and creates a Git commit from it.
Read on for a complete example of how to efficiently update a vendored dependency, and how to make changes to such code.
A complete example to update a vendored dependency, commit its changes, and create a pull request from it, is given below.
    $ cd $REPO_TOP
    $ # Ensure a clean working directory
    $ git stash
    $ # Create a new branch for the pull request
    $ git checkout -b update-ibex-code upstream/master
    $ # Update lowrisc_ibex and create a commit
    $ ./util/vendor_hw.py hw/vendor/lowrisc_ibex.vendor.hjson --verbose --commit
    $ # Push the new branch to your fork
    $ git push origin update-ibex-code
    $ # Restore changes in working directory (if anything was stashed before)
    $ git stash pop
Now go to the GitHub web interface to open a Pull Request for the update-ibex-code branch.
Open the vendor description file (.vendor.hjson) of the dependency you want to update and take note of the url and the rev (the branch or revision that is imported) in the upstream section.
Clone the upstream repository and switch to the used branch:
    $ # Go to your source directory (can be anywhere)
    $ cd ~/src
    $ # Clone the repository and switch the branch. Below is an example for ibex.
    $ git clone https://github.com/lowRISC/ibex.git
    $ cd ibex
    $ git checkout master
After this step you're ready to make your modifications. You can do so either directly in the upstream repository, or start in the OpenTitan repository.
The easiest option is to modify the upstream repository directly as usual.
Most changes to external code are motivated by our own needs. Modifying the external code directly in the hw/vendor directory is therefore a sensible starting point.
Make your changes in the OpenTitan repository. Do not commit them.
Create a patch with your changes. The example below uses lowrisc_ibex.
    $ cd hw/vendor/lowrisc_ibex
    $ git diff --relative . > changes.patch
Take note of the revision of the imported repository from the lock file.
    $ cat $REPO_TOP/hw/vendor/lowrisc_ibex.lock.hjson | grep rev
    rev: 7728b7b6f2318fb4078945570a55af31ee77537a
Switch to the checked out upstream repository and bring it into the same state as the imported repository. Again, the example below uses ibex, adjust as needed.
    $ # Change to the upstream repository
    $ cd ~/src/ibex

    $ # Create a new branch for your patch
    $ # Use the revision you determined in the previous step!
    $ git checkout -b modify-ibex-somehow 7728b7b6f2318fb4078945570a55af31ee77537a
    $ git apply -p1 < $REPO_TOP/hw/vendor/lowrisc_ibex/changes.patch

    $ # Add and commit your changes as usual
    $ # You can create multiple commits with git add -p and committing
    $ # multiple times.
    $ git add -u
    $ git commit
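The reason this works is that git diff --relative writes paths relative to the vendored directory, so the resulting patch lines up with the top of the upstream repository when applied with -p1. The sketch below wraps the diff step in a small function to make that explicit; vendored_diff is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper (not part of vendor_hw): print uncommitted changes
# below VENDOR_DIR as a patch. With --relative, paths in the patch are
# relative to VENDOR_DIR (for example "a/rtl/foo.sv"), so the patch applies
# with -p1 at the top of the upstream repository.
# Usage: vendored_diff VENDOR_DIR
vendored_diff() {
  ( cd "$1" && git diff --relative . )
}

# Example usage (the output path is an arbitrary choice):
#   vendored_diff "$REPO_TOP/hw/vendor/lowrisc_ibex" > /tmp/changes.patch
#   cd ~/src/ibex && git apply -p1 < /tmp/changes.patch
```

Writing the patch outside of the vendored directory, as in the usage example, avoids leaving an untracked changes.patch file in the imported code.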
You have now created a commit in the upstream repository. Before submitting your changes upstream, rebase them on top of the upstream development branch, typically master, and ensure that all tests pass. Now you need to follow the upstream guidelines on how to get the change accepted. In many cases their workflow is similar to ours: push your changes to a repository fork on your namespace, create a pull request, work through review comments, and update it until the change is accepted and merged.
After your change is accepted upstream, you can update our copy of the code using the vendor_hw tool as described before.
Vendoring external code is done by creating a vendor description file, and then running the vendor_hw tool.
Create a vendor description file for the new dependency.
Make note of the Git repository and the branch you want to vendor in.
Choose a name for the external dependency. It is recommended to use the format <vendor>_<name>. Typically <vendor> is the lower-cased user or organization name on GitHub, and <name> is the lower-cased project name.
Choose a target directory. It is recommended to use the dependency name as directory name.
Create the vendor description file in hw/vendor/<vendor>_<name>.vendor.hjson with the following contents (adjust as needed):
    // Copyright lowRISC contributors.
    // Licensed under the Apache License, Version 2.0, see LICENSE for details.
    // SPDX-License-Identifier: Apache-2.0
    {
      name: "lowrisc_ibex",
      target_dir: "lowrisc_ibex",

      upstream: {
        url: "https://github.com/lowRISC/ibex.git",
        rev: "master",
      },
    }
Create a new branch for a subsequent pull request
$ git checkout -b vendor-something upstream/master
Commit the vendor description file
    $ git add hw/vendor/<vendor>_<name>.vendor.hjson
    $ git commit
Run the vendor_hw tool for the newly vendored code.
    $ cd $REPO_TOP
    $ ./util/vendor_hw.py hw/vendor/lowrisc_ibex.vendor.hjson --verbose --commit
Push the branch to your fork for review (assuming origin is the remote name of your fork).
$ git push -u origin vendor-something
Now go to the GitHub web interface to create a Pull Request for the newly created branch.
You can exclude files from the upstream code base by listing them in the vendor description file under exclude_from_upstream. Glob-style wildcards are supported (*, ?, etc.), as known from shells.
Example:
    // section of a .vendor.hjson file
    exclude_from_upstream: [
      // exclude all *.h* files (for example .h and .hpp) in the src directory
      "src/*.h*",
      // exclude the src_files.yml file
      "src_files.yml",
      // exclude some_directory and all files below it
      "some_directory",
    ]
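The wildcard behavior can be tried out with the shell's own pattern matching. This is only an illustration: matches_glob is a hypothetical helper, and it assumes that vendor_hw matches patterns much like a shell case statement does, which may not hold in every detail (for example for directories).

```shell
# Hypothetical helper: succeed if path $1 matches glob pattern $2.
# Uses the shell's own pattern matching, in which "*" may also match "/";
# the exact matching rules of the vendor_hw tool may differ.
matches_glob() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Try the first pattern from the example above against a few paths.
matches_glob "src/foo.h"   "src/*.h*" && echo "src/foo.h is excluded"
matches_glob "src/foo.hpp" "src/*.h*" && echo "src/foo.hpp is excluded"
matches_glob "src/foo.c"   "src/*.h*" || echo "src/foo.c is kept"
```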
In some cases the upstream code must be modified before it can be used. For this purpose, the vendor_hw tool can apply patches on top of imported code. The patches are kept as separate files in our repository, making it easy to understand the differences to the upstream code, and to switch the upstream code to a newer version.
To apply patches on top of vendored code, do the following:
Extend the .vendor.hjson file of the dependency and add a patch_dir line pointing to a directory of patch files. It is recommended to place patches into the patches/<vendor>_<name> directory.
patch_dir: "patches/lowrisc_ibex",
Place patch files with a .patch suffix in the patch_dir.
When running vendor_hw, patches are applied on top of the imported code according to the following rules.
Patches are applied in the order of their file names. Name patches like 0001-do-something.patch to apply them in a deterministic order.
The first directory component of each file name in a patch is stripped, i.e. patches are applied with the -p1 argument of patch.
Patches are applied with git apply, making all extended features of Git patches available (e.g. renames).
Managing patch series on top of code can be challenging. As the underlying code changes, the patches need to be refreshed to continue to apply. Adding new patches is a very manual process. And so on.
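Fortunately, Git can be used to simplify this task. The idea: fork the upstream repository, commit all changes on top of the upstream code in a branch of that fork, and convert these commits into patch files stored where the vendor_hw tool can find and apply them.
The last step is automated by the vendor_hw tool through its --refresh-patches argument.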
Modify the vendor description file to add a patch_repo section.
The url parameter specifies the URL to the fork of the upstream repository containing all modifications.
rev_base is the base revision, typically the master branch.
rev_patched is the patched revision, typically the name of the branch with your changes.
    patch_repo: {
      url: "https://github.com/lowRISC/riscv-dbg.git",
      rev_base: "master",
      rev_patched: "changes",
    },
Create commits and push them to the forked repository. Make sure to push both branches to the fork: rev_base and rev_patched. In the example above, this would be (with REMOTE_NAME_FORK being the remote name of the fork):
git push REMOTE_NAME_FORK master changes
Run the vendor_hw tool with the --refresh-patches argument. It will first check out the patch repository and convert all commits which are in the rev_patched branch and not in the rev_base branch into patch files. These patch files are then stored in the patch directory. After that, the vendoring process continues as usual: all patches are applied and, if instructed by the --commit flag, a commit is created. This commit now also includes the updated patch files.
To update the patches you can use all the usual Git tools in the forked repository.
Use git rebase to refresh them on top of changes in the upstream repository.
Add new patches by committing to the rev_patched branch in the fork.
Remove or reorder patches with an interactive rebase (git rebase -i).
It is important to update and push both branches in the forked repository: the rev_base branch and the rev_patched branch. Use git log rev_base..rev_patched (replace rev_base and rev_patched as needed) to show all commits which will be turned into patches.
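The same commit range can also be turned into numbered patch files by hand with git format-patch, which gives a rough preview of what --refresh-patches will generate. This is a sketch under assumptions: export_patches is a hypothetical helper, and the vendor_hw tool may name or post-process the files differently.

```shell
# Hypothetical helper (not part of vendor_hw): write one patch file per
# commit that is in REV_PATCHED but not in REV_BASE.
# OUT_DIR should be an absolute path.
# Usage: export_patches REPO_DIR REV_BASE REV_PATCHED OUT_DIR
export_patches() {
  repo_dir=$1 rev_base=$2 rev_patched=$3 out_dir=$4
  mkdir -p "$out_dir"
  # format-patch numbers the files (0001-..., 0002-...) in commit order.
  git -C "$repo_dir" format-patch --quiet -o "$out_dir" "$rev_base..$rev_patched"
}

# Example usage, with the branch names from the patch_repo example
# (the repository and output paths are arbitrary choices):
#   export_patches ~/src/riscv-dbg master changes /tmp/riscv-dbg-patches
```

Because the files are numbered in commit order, they sort by file name in the same order in which the commits were made.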