---
title: Work with hardware code in external repositories
---

OpenTitan is not a closed ecosystem: we incorporate code from third parties, and we split out pieces of our code to reach a wider audience.
In both cases, we need to import and use code from external repositories in our OpenTitan code base.
Read on for step-by-step instructions for common tasks, and for background information on the topic.

## Summary

Code in subdirectories of `hw/vendor` is imported (copied in) from external repositories (which may be provided by lowRISC or other sources).
The external repository is called "upstream".
Any development on code imported into `hw/vendor` should happen upstream when possible.
Files ending with `.vendor.hjson` indicate where the upstream repository is located.

In particular, this means:

- If you find a bug in imported code or want to enhance it, report it upstream.
- Follow the rules and style guides of the upstream project.
  They might differ from our own rules.
- Use the upstream mechanisms to make code changes. In many cases, upstream uses GitHub pull requests, just like we do.
- Work with upstream reviewers to get your changes merged into their code base.
- Once the change is part of the upstream repository, the `util/vendor.py` tool can be used to copy the upstream code back into our OpenTitan repository.

Read on for the longer version of these guidelines.

Pushing changes upstream first isn't always possible or desirable: upstream might not accept changes, or be slow to respond.
In some cases, code changes are needed which are irrelevant for upstream and need to be maintained by us.
Our vendoring infrastructure can handle such cases; read on for more information on how to do it.

## Background

OpenTitan is developed in a "monorepo", a single repository containing all its source code.
This approach is beneficial for many reasons, ranging from an easier workflow to better reproducibility of the results, and that's why large companies like [Google](https://ai.google/research/pubs/pub45424) and Facebook are using monorepos.
Monorepos are even more compelling for hardware development, which cannot make use of a standardized language-specific package manager like npm or pip.

At the same time, open source is all about sharing and a free flow of code between projects.
We want to take in code from others, but also to give back and grow a wider ecosystem around our output.
To be able to do that, code repositories should be sufficiently modular and self-contained.
For example, if a CPU core is buried deep in a repository containing a full SoC design, people will have a hard time using this CPU core for their designs and contributing to it.

Our approach to this challenge: develop reusable parts of our code base in an external repository, and copy the source code back into our monorepo in an automated way.
The process of copying in external code is commonly called "vendoring".

Vendoring code is a good thing.
We continue to maintain a single code base which is easy to fork, tag and generally work with, as all the normal Git tooling works.
By explicitly importing code we also ensure that no unreviewed code sneaks into our code base, and an "always buildable" configuration is maintained.

But what happens if the imported code needs to be modified?
Ideally, all code changes are submitted upstream, integrated into the upstream code base, and then re-imported into our code base.
This development methodology is called "upstream first".
History has shown repeatedly that an upstream first policy can help significantly with the long-term maintenance of code.

However, strictly following an upstream first policy isn't great either.
Some changes might not be useful for the upstream community; others might not be acceptable upstream, or might only be applied after a long delay.
In these situations it must be possible to modify the code downstream, i.e. in our repository, as well.
Our setup includes multiple options to achieve this goal.
In many cases, applying patches on top of the imported code is the most sustainable option.

To ease the pain points of vendoring code, we have developed tooling and continue to improve it.
Please open an issue ticket if you see areas where the tooling could be improved.

## Basic concepts

This section gives a quick overview of how we include code from other repositories in our repository.

All imported ("vendored") hardware code is by convention put into the `hw/vendor` directory.
(We have more conventions for file and directory names which are discussed below when the import of new code is described.)
To interact with code in this directory, a tool called `util/vendor.py` is used.
A "vendor description file" controls the vendoring process and serves as input to the `util/vendor.py` tool.

In the simple, yet typical, case, the vendor description file is only a couple of lines of human-readable JSON (Hjson):

```command
$ cat hw/vendor/lowrisc_ibex.vendor.hjson
{
  name: "lowrisc_ibex",
  target_dir: "lowrisc_ibex",

  upstream: {
    url: "https://github.com/lowRISC/ibex.git",
    rev: "master",
  },
}
```

This description file essentially says:
We vendor a component called "lowrisc_ibex" and place the code into the "lowrisc_ibex" directory (relative to the description file).
The code comes from the `master` branch of the Git repository found at https://github.com/lowRISC/ibex.git.
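To make the "relative to the description file" rule concrete, here is a minimal Python sketch. It is not taken from `util/vendor.py`; the helper name `resolve_target_dir` is hypothetical and exists only for illustration.

```python
from pathlib import PurePosixPath


def resolve_target_dir(description_file: str, target_dir: str) -> str:
    """Return the directory that receives the imported code.

    Hypothetical helper: `target_dir` is interpreted relative to the
    directory that contains the vendor description file.
    """
    return str(PurePosixPath(description_file).parent / target_dir)


print(resolve_target_dir("hw/vendor/lowrisc_ibex.vendor.hjson", "lowrisc_ibex"))
```

For the description file shown above, this yields a path below `hw/vendor`, next to the description file itself.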

With this description file written, the `util/vendor.py` tool can do its job.

```command
$ cd $REPO_TOP
$ ./util/vendor.py hw/vendor/lowrisc_ibex.vendor.hjson --verbose --update
INFO: Cloning upstream repository https://github.com/lowRISC/ibex.git @ master
INFO: Cloned at revision 7728b7b6f2318fb4078945570a55af31ee77537a
INFO: Copying upstream sources to /home/philipp/src/opentitan/hw/vendor/lowrisc_ibex
INFO: Changes since the last import:
* Typo fix in muldiv: Reminder->Remainder (Stefan Wallentowitz)
INFO: Wrote lock file /home/philipp/src/opentitan/hw/vendor/lowrisc_ibex.lock.hjson
INFO: Import finished
```

Looking at the output, you might wonder: how did the `util/vendor.py` tool know what changed since the last import?
It knows because it records the commit hash of the last import in a file called the "lock file".
This file sits next to the `.vendor.hjson` file and has the same base name with the suffix `.lock.hjson`.

In the example above, it looks roughly like this:

```command
$ cat hw/vendor/lowrisc_ibex.lock.hjson
{
  upstream:
  {
    url: https://github.com/lowRISC/ibex.git
    rev: 7728b7b6f2318fb4078945570a55af31ee77537a
  }
}
```

The lock file should be committed together with the code itself to make the import step reproducible at any time.
This import step can be reproduced by running the `util/vendor.py` tool without the `--update` flag.
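The interplay between the description file, the lock file, and `--update` can be summarized in a short Python sketch. It illustrates the behaviour described above under stated assumptions and is not the implementation of `util/vendor.py`. The helper names `lock_file_for` and `revision_to_import` are hypothetical, and falling back to the description file's `rev` when no lock file exists yet is an assumption.

```python
from typing import Optional

VENDOR_SUFFIX = ".vendor.hjson"
LOCK_SUFFIX = ".lock.hjson"


def lock_file_for(description_file: str) -> str:
    """Derive the lock file path from the description file path."""
    if not description_file.endswith(VENDOR_SUFFIX):
        raise ValueError(f"not a vendor description file: {description_file}")
    return description_file[: -len(VENDOR_SUFFIX)] + LOCK_SUFFIX


def revision_to_import(described_rev: str,
                       locked_rev: Optional[str],
                       update: bool) -> str:
    """Pick the upstream revision to import.

    With `update` set, follow `rev` from the description file.
    Otherwise re-import the revision recorded in the lock file, which
    keeps the import reproducible. Assumption: without a lock file
    (first import) the described revision is used.
    """
    if update or locked_rev is None:
        return described_rev
    return locked_rev


locked = "7728b7b6f2318fb4078945570a55af31ee77537a"
print(lock_file_for("hw/vendor/lowrisc_ibex.vendor.hjson"))
print(revision_to_import("master", locked, update=False))
print(revision_to_import("master", locked, update=True))
```

Seen this way, the lock file pins the import to one commit, while the description file only names the branch to follow when an update is requested.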

After running `util/vendor.py` with `--update`, the code in your local working copy is updated to the latest upstream version.
Next is testing: run simulations, syntheses, or other tests to ensure that the new code works as expected.
Once you're confident that the new code is good to be committed, do so using the normal Git commands.

```command
$ cd $REPO_TOP

$ # Stage all files in the vendored directory
$ git add -A hw/vendor/lowrisc_ibex

$ # Stage the lock file as well
$ git add hw/vendor/lowrisc_ibex.lock.hjson

$ # Now commit everything. Don't forget to write a useful commit message!
$ git commit
```

Instead of running `util/vendor.py` first, and then manually creating a Git commit, you can also use the `--commit` flag.

```command
$ cd $REPO_TOP
$ ./util/vendor.py hw/vendor/lowrisc_ibex.vendor.hjson \
    --verbose --update --commit
```

This command updates the "lowrisc_ibex" code, and creates a Git commit from it.

Read on for a complete example of how to efficiently update a vendored dependency, and how to make changes to such code.

## Update vendored code in our repository

The example below updates a vendored dependency, commits the changes, and pushes a branch from which a pull request can be created.

```command
$ cd $REPO_TOP
$ # Ensure a clean working directory
$ git stash
$ # Create a new branch for the pull request
$ git checkout -b update-ibex-code upstream/master
$ # Update lowrisc_ibex and create a commit
$ ./util/vendor.py hw/vendor/lowrisc_ibex.vendor.hjson \
    --verbose --update --commit
$ # Push the new branch to your fork
$ git push origin update-ibex-code
$ # Restore changes in working directory (if anything was stashed before)
$ git stash pop
```

Now go to the GitHub web interface to open a Pull Request for the `update-ibex-code` branch.

## How to modify vendored code (fix a bug, improve it)

### Step 1: Get the vendored repository

1. Open the vendor description file (`.vendor.hjson`) of the dependency you want to update and take note of the `url` and the `rev` (typically a branch name) in the `upstream` section.

2. Clone the upstream repository and switch to the used branch:

   ```command
   $ # Go to your source directory (can be anywhere)
   $ cd ~/src
   $ # Clone the repository and switch the branch. Below is an example for ibex.
   $ git clone https://github.com/lowRISC/ibex.git
   $ cd ibex
   $ git checkout master
   ```

After this step you're ready to make your modifications.
You can do so *either* directly in the upstream repository, *or* start in the OpenTitan repository.

### Step 2a: Make modifications in the upstream repository

The easiest option is to modify the upstream repository directly as usual.

### Step 2b: Make modifications in the OpenTitan repository

Most changes to external code are motivated by our own needs.
Modifying the external code directly in the `hw/vendor` directory is therefore a sensible starting point.

1. Make your changes in the OpenTitan repository. Do not commit them.

2. Create a patch with your changes. The example below uses `lowrisc_ibex`.

   ```command
   $ cd hw/vendor/lowrisc_ibex
   $ git diff --relative . > changes.patch
   ```

3. Take note of the revision of the imported repository from the lock file.

   ```command
   $ cd $REPO_TOP
   $ grep rev hw/vendor/lowrisc_ibex.lock.hjson
       rev: 7728b7b6f2318fb4078945570a55af31ee77537a
   ```

4. Switch to the checked out upstream repository and bring it into the same state as the imported repository.
   Again, the example below uses ibex; adjust as needed.

   ```command
   $ # Change to the upstream repository
   $ cd ~/src/ibex

   $ # Create a new branch for your patch
   $ # Use the revision you determined in the previous step!
   $ git checkout -b modify-ibex-somehow 7728b7b6f2318fb4078945570a55af31ee77537a
   $ git apply -p1 < $REPO_TOP/hw/vendor/lowrisc_ibex/changes.patch

   $ # Add and commit your changes as usual
   $ # You can create multiple commits with git add -p and committing
   $ # multiple times.
   $ git add -u
   $ git commit
   ```

### Step 3: Get your changes accepted upstream

You have now created a commit in the upstream repository.
Before submitting your changes upstream, rebase them on top of the upstream development branch, typically `master`, and ensure that all tests pass.
Now you need to follow the upstream guidelines on how to get the change accepted.
In many cases their workflow is similar to ours: push your changes to a fork of the repository in your namespace, create a pull request, work through review comments, and update it until the change is accepted and merged.

### Step 4: Update the vendored copy of the external dependency

After your change is accepted upstream, you can update our copy of the code using the `util/vendor.py` tool as described before.

## How to vendor new code

Vendoring external code is done by creating a vendor description file, and then running the `util/vendor.py` tool.

1. Create a vendor description file for the new dependency.
   1. Make note of the Git repository and the branch you want to vendor in.
   2. Choose a name for the external dependency.
      It is recommended to use the format `<vendor>_<name>`.
      Typically `<vendor>` is the lower-cased user or organization name on GitHub, and `<name>` is the lower-cased project name.
   3. Choose a target directory.
      It is recommended to use the dependency name as directory name.
   4. Create the vendor description file in `hw/vendor/<vendor>_<name>.vendor.hjson` with the following contents (adjust as needed):

      ```
      // Copyright lowRISC contributors.
      // Licensed under the Apache License, Version 2.0, see LICENSE for details.
      // SPDX-License-Identifier: Apache-2.0
      {
        name: "lowrisc_ibex",
        target_dir: "lowrisc_ibex",

        upstream: {
          url: "https://github.com/lowRISC/ibex.git",
          rev: "master",
        },
      }
      ```

2. Create a new branch for a subsequent pull request.

   ```command
   $ git checkout -b vendor-something upstream/master
   ```

3. Commit the vendor description file.

   ```command
   $ git add hw/vendor/<vendor>_<name>.vendor.hjson
   $ git commit
   ```

4. Run the `util/vendor.py` tool for the newly vendored code.

   ```command
   $ cd $REPO_TOP
   $ ./util/vendor.py hw/vendor/lowrisc_ibex.vendor.hjson --verbose --commit
   ```

5. Push the branch to your fork for review (assuming `origin` is the remote name of your fork).

   ```command
   $ git push -u origin vendor-something
   ```

   Now go to the GitHub web interface to create a Pull Request for the newly created branch.
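The naming recommendation from step 1 can be sketched in a few lines of Python. This is only an illustration of the `<vendor>_<name>` convention for GitHub URLs; the helpers `vendored_name` and `description_file_for` are hypothetical and not part of `util/vendor.py`.

```python
from urllib.parse import urlparse


def vendored_name(github_url: str) -> str:
    """Build `<vendor>_<name>` from a GitHub repository URL.

    Hypothetical helper: `<vendor>` is the lower-cased user or
    organization name, `<name>` the lower-cased project name.
    """
    path = urlparse(github_url).path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    parts = path.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected <owner>/<project> in URL path: {github_url}")
    owner, project = parts
    return f"{owner}_{project}".lower()


def description_file_for(name: str) -> str:
    """Location of the vendor description file for a dependency name."""
    return f"hw/vendor/{name}.vendor.hjson"


name = vendored_name("https://github.com/lowRISC/ibex.git")
print(name)
print(description_file_for(name))
```

The convention is a recommendation only, so a derived name can be adjusted by hand where another name reads better.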

## How to exclude some files from the upstream repository

You can exclude files from the upstream code base by listing them in the vendor description file under `exclude_from_upstream`.
Glob-style wildcards are supported (`*`, `?`, etc.), as known from shells.

Example:

```
// section of a .vendor.hjson file
exclude_from_upstream: [
  // exclude all *.h files in the src directory
  "src/*.h",
  // exclude the src_files.yml file
  "src_files.yml",
  // exclude some_directory and all files below it
  "some_directory",
]
```

If you want to add more files to `exclude_from_upstream`, just update this section of the `.vendor.hjson` file and re-run the vendor tool without `--update`.
The repository will be re-cloned without pulling in upstream updates, and the file exclusions and patches specified in the vendor file will be applied.
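To get a feeling for how such patterns behave, the following Python sketch checks repository-relative paths against the example patterns using the standard `fnmatch` module. It is an approximation under stated assumptions, not the matching code of `util/vendor.py`. The helper `is_excluded` is hypothetical, and it treats a path as excluded if the path itself or any of its parent directories matches a pattern. Note that `fnmatch` lets `*` match `/` as well, which is broader than shell globbing.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable

# Patterns from the example above
EXCLUDES = ["src/*.h", "src_files.yml", "some_directory"]


def is_excluded(rel_path: str, patterns: Iterable[str]) -> bool:
    """Return True if `rel_path` would be left out of the import.

    Assumption: a file is excluded when the file itself or one of its
    parent directories matches any of the glob patterns.
    """
    path = PurePosixPath(rel_path)
    # The path itself, followed by each parent directory (without ".")
    candidates = [str(path)] + [str(p) for p in path.parents if str(p) != "."]
    pattern_list = list(patterns)
    return any(fnmatchcase(c, pat) for c in candidates for pat in pattern_list)


for candidate in ["src/foo.h", "src/foo.sv", "some_directory/a/b.sv"]:
    print(candidate, is_excluded(candidate, EXCLUDES))
```

Checking a handful of real paths like this before re-running the vendor tool can help spot a pattern that matches more or less than intended.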

## How to add patches on top of the imported code

In some cases the upstream code must be modified before it can be used.
For this purpose, the `util/vendor.py` tool can apply patches on top of imported code.
The patches are kept as separate files in our repository, making it easy to understand the differences to the upstream code, and to switch the upstream code to a newer version.

To apply patches on top of vendored code, do the following:

1. Extend the `.vendor.hjson` file of the dependency and add a `patch_dir` line pointing to a directory of patch files.
   It is recommended to place patches into the `patches/<vendor>_<name>` directory.

   ```
   patch_dir: "patches/lowrisc_ibex",
   ```

2. Place patch files with a `.patch` suffix in the `patch_dir`.

3. When running `util/vendor.py`, patches are applied on top of the imported code according to the following rules.

   - Patches are applied in alphabetical order according to the filename.
     Name patches like `0001-do-something.patch` to apply them in a deterministic order.
   - Patches are applied relative to the base directory of the imported code.
   - The first directory component of the filename in a patch is stripped, i.e. they are applied with the equivalent of the `-p1` argument of `patch`.
   - Patches are applied with `git apply`, making all extended features of Git patches available (e.g. renames).

If you want to add more patches and re-apply them without updating the upstream repository, add them to the patches directory and re-run the vendor tool without `--update`.
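The patch rules above can be sketched as a small Python function that only builds the `git apply` command lines, without running them. This is an illustration, not the code of `util/vendor.py`: the helper `patch_commands` is hypothetical, plain string sorting stands in for "alphabetical order", and using `--directory` to apply patches relative to the imported code is an assumption about one possible way to do it.

```python
from typing import Iterable, List


def patch_commands(file_names: Iterable[str], target_dir: str) -> List[List[str]]:
    """Build one `git apply` command per patch file, in application order.

    Only files ending in `.patch` are considered. They are sorted by
    filename, so numbered prefixes such as `0001-` give a stable order.
    """
    ordered = sorted(name for name in file_names if name.endswith(".patch"))
    return [
        # -p1 strips the first directory component of paths in the patch;
        # --directory prepends the location of the imported code.
        ["git", "apply", f"--directory={target_dir}", "-p1", name]
        for name in ordered
    ]


files = ["0002-fix-lint.patch", "README.md", "0001-add-feature.patch"]
for command in patch_commands(files, "hw/vendor/lowrisc_ibex"):
    print(" ".join(command))
```

Because the order depends only on the filenames, renaming a patch is enough to change where it is applied in the series.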

## How to manage patches in a Git repository

Managing patch series on top of code can be challenging.
As the underlying code changes, the patches need to be refreshed to continue to apply.
Adding new patches is a very manual process.
And so on.

Fortunately, Git can be used to simplify this task.
The idea:

- Create a forked Git repository of the upstream code.
- Create a new branch in this fork.
- Commit all your changes on top of the upstream code into this branch.
- Convert all commits into patch files and store them where the `util/vendor.py` tool can find and apply them.

The last step is automated by the `util/vendor.py` tool through its `--refresh-patches` argument.

1. Modify the vendor description file to add a `patch_repo` section.
   - The `url` parameter specifies the URL to the fork of the upstream repository containing all modifications.
   - The `rev_base` is the base revision, typically the `master` branch.
   - The `rev_patched` is the patched revision, typically the name of the branch with your changes.

   ```
   patch_repo: {
     url: "https://github.com/lowRISC/riscv-dbg.git",
     rev_base: "master",
     rev_patched: "changes",
   },
   ```

2. Create commits and push them to the forked repository.
   Make sure to push both branches to the fork: `rev_base` **and** `rev_patched`.
   In the example above, this would be (with `REMOTE_NAME_FORK` being the remote name of the fork):

   ```command
   $ git push REMOTE_NAME_FORK master changes
   ```

3. Run the `util/vendor.py` tool with the `--refresh-patches` argument.
   It will first check out the patch repository and convert all commits which are in the `rev_patched` branch and not in the `rev_base` branch into patch files.
   These patch files are then stored in the patch directory.
   After that, the vendoring process continues as usual: changes from the upstream repository are downloaded if `--update` is passed, all patches are applied, and if instructed by the `--commit` flag, a commit is created.
   This commit now also includes the updated patch files.

To update the patches you can use all the usual Git tools in the forked repository.

- Use `git rebase` to refresh them on top of changes in the upstream repository.
- Add new patches with commits to the `rev_patched` branch.
- Remove patches or reorder them with Git interactive rebase (`git rebase -i`).

It is important to update and push *both* branches in the forked repository: the `rev_base` branch and the `rev_patched` branch.
Use `git log rev_base..rev_patched` (replace `rev_base` and `rev_patched` as needed) to show all commits which will be turned into patches.
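As a rough sketch of what a patch refresh amounts to, the following Python snippet builds the Git command lines for the example `patch_repo` section. It does not run them and is not the implementation of `--refresh-patches`: the helper names are hypothetical, and the use of `git format-patch` to turn commits into patch files is an assumption.

```python
from typing import List


def commit_range(rev_base: str, rev_patched: str) -> str:
    """Revision range selecting commits in rev_patched but not in rev_base."""
    return f"{rev_base}..{rev_patched}"


def refresh_patch_commands(rev_base: str, rev_patched: str,
                           patch_dir: str) -> List[List[str]]:
    """Commands to preview and export the patch series (hypothetical helper)."""
    rev_range = commit_range(rev_base, rev_patched)
    return [
        # Preview which commits will become patches
        ["git", "log", "--oneline", rev_range],
        # Write one numbered patch file per commit into patch_dir
        ["git", "format-patch", "-o", patch_dir, rev_range],
    ]


for command in refresh_patch_commands("master", "changes", "patches/lowrisc_ibex"):
    print(" ".join(command))
```

`git format-patch` numbers its output files (`0001-...`, `0002-...`), which fits the naming scheme recommended for patch files.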