3c5da69a23 ci: remove un-needed lint_run*.sh files (willcl-ark)
2aa288efdd ci: fix annoying docker warning (will)
dd1c5903e8 ci: add ccache hit-rate warning when < 75% (will)
f427284483 doc: Detail configuration of hosted CI runners (will)
3f339e99e0 ci: dynamically match makejobs with cores (will)
4393ffdd83 ci: remove .cirrus.yml (will)
bc41848d00 ci: port lint (will)
d290a8e6ea ci: port msan-depends (will)
9bbae61e3b ci: port tsan-depends (will)
bf7d536452 ci: port tidy (will)
549074bc64 ci: port centos-depends-gui (will)
58e38c3a04 ci: port previous-releases-depends-debug (will)
341196d75c ci: port fuzzer-address-undefined-integer-nodepends (will)
f2068f26c1 ci: port no-IPC-i686-DEBUG (will)
2a00b12d73 ci: port nowallet-libbitcoinkernel (will)
9c2514de53 ci: port mac-cross-gui-notests (will)
2c990d84a3 ci: force reinstall of kernel headers in asan (will)
884251441b ci: update asan-lsan-ubsan (will)
f253031cb8 ci: port arm 32-bit job (will)
04e7bfbceb ci: update windows-cross job (will)
cc1735d777 ci: add job to determine runner type (will)
020069e6b7 ci: add Cirrus cache host (will)
9c2b96e0d0 ci: have base install run in right dir (will)
18f6be09d0 ci: use docker build cache arg directly (will)
94a0932547 ci: use buildx in ci (will)
fdf64e5532 ci: add configure-docker action (will)
33ba073df7 ci: add REPO_USE_CIRRUS_RUNNERS (will)
b232b0fa5e ci: add caching actions (will)
b8fcc9fcbc ci: add configure environment action (will)
Pull request description:
This changeset migrates all current self-hosted CI jobs over to hosted [Cirrus Runners](https://cirrus-runners.app/).
These runners cost a flat rate of $150/month, and we qualify for an open source discount of 50%. Therefore they are $75/month/runner.
One "runner" should more accurately be thought of in terms of the number of vCPU you are purchasing: https://cirrus-runners.app/pricing/ or in terms of "concurrency", where 1 runners gets you 1.0 concurrency.
e.g. a Linux x86 Runner gets you 16 vCPU (1.0 concurrency) and 64GB RAM to be provisioned as you choose, amongst one or more jobs.
Cirrus Runners currently only support Linux (x86 and Arm64) and MacOS (Arm64).
This changeset does **not** move the existing Github Actions native MacOS runners away from being run on Github's infrastructure. This could be a follow up optimisation.
Runs from this changeset using Cirrus Runners can be found at: https://github.com/testing-cirrus-runners/bitcoin2/actions which shows an uncached run on master ([CI#1](https://github.com/testing-cirrus-runners/bitcoin2/actions/runs/16298637161)), an outside pull request ([CI#3](https://github.com/testing-cirrus-runners/bitcoin2/actions/runs/16303305483?pr=1)) and an updated push to master ([CI#4](https://github.com/testing-cirrus-runners/bitcoin2/actions/runs/16304182527)).
These workflows were run on 10 runners, and we would recommend purchasing a similar number for our CI in this repo to achieve the speed and concurrency we expect.
We include some optional performance commits, but these could be split out and made into followups or dropped entirely.
## Benefits
### Maintenance
As we are not self-hosting, nobody needs to maintain servers, disks etc.
### Bus factor
Currently we have a very small number of people with the know-how working on server setup and maintenance. This setup fixes that so that "anyone" familiar with GitHub-style CI systems can work on it.
### Scaling
These do _not_ "auto-scale"/have "unlimited concurrency" like some solutions, but if we want more workers/cpu to increase parallism or increase the runner size of certain jobs for a speed-up we can simply buy more concurrency using the web interface.
### Speed
Runtimes aproximate current runtimes pretty well, with some jobs being faster.
Caching improvements on pull request (re-runs) are left as future optimisations from the current changeset (see below).
### GitHub workflow syntax
With a migration to the more-commonly-used GitHub workflow syntax, migration to other providers in the future is often as simple as a one-line change (and installing a new GitHub app to the repo).
If we decide to self-host again, then we can also self-host GitHub runners (using https://github.com/actions/runner) and maintain new GH-style CI syntax.
### Reporting
GitHub workflows provide nicer built-in reporting directly on the "Checks" page of a pr. This includes more-detailed action reporting, and a host of pretty nice integrated features, such as [Workflow Commands](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions) for creating annotations that can print messages during runs. See for example at the bottom of this window where we report `ccache` hitrate, if it was below 90%: https://github.com/testing-cirrus-runners/bitcoin/actions/runs/16163449125?pr=1
These could be added conditionally into our CI scripts to report interesting or other information.
## Costs
### Financial
Relative to competitors Cirrus runners are cheap for the hosted CI-world. However these are likely more expensive than our current setup, or a well-configured (new) self-hosted setup.
If we started with 10 runners to be shared amongst all migrated jobs, this would total $750/mo = $9000/yr.
Note that we are not trying to comptete here on cost directly.
### Dependencies
We would be dependent on Cirrus infra.
## Forks
- Forks should be able to run CI without paid Cirrus runners. This behaviour is achieved through a rather verbose `runs-on:` directive.
- This directive hardcodes the main repo (unfortunately you cannot use the `env` github context in this field in particular, for some reason).
- This directive also allows for a fork to patch the `runs-on:` field in the ci.yml file if they want to use Cirrus Runners too.
- The workflow otherwise will fallback to the GitHub free runners on forks.
- This cirrus cache action transparently falls back to github actions cache when not running on cirrus, so forks will get some free github caching (10GB per repo).
All jobs work on forks, but will run (slowly) on GitHub native free hosted runners, instead of Cirrus runners. They will also suffer from poor cache hit-rates, but there's nothing that can be done about that, and the situtation is an improvement on today.
## Migration process
The main org should also, in addition to pulling code changes:
1. Permit the actions `docker/setup-buildx-action@v3` and `docker/login-action@v3` to be run in this repo.
## Caching
For the number of CI jobs we have, cache usage on GitHub would be an issue as GH only provides 10GB of cache space, **per repo**. However cirrus provides [10 GB per runner](https://cirrus-runners.app/setup/#speeding-up-the-cache), which scales better with the number of runners.
The `cirruslabs/action/[restore|save]` action we use here redirects this to Cirrus' own cache and is both faster and larger.
In the case that user is running CI on a fork, the cirrus cache falls back transparently to GitHub default cache without error.
### ccache, depends-sources, built-depends
- Cached as blobs via `cirruslabs/actions/cache` action.
- Current implementation:
- On `push`: restores and saves caches.
- On `pull_request`: restores but does **not** save caches.
This means a new pull request should hit a _pretty relevant_ cache.
Old pull requests **which are not being rebased on master** may suffer from lower cache hit-rate.
If we save caches on all pull request runs we run the risk of evicting recent (and more relevant) cache blobs.
It may be possible in a future optimisation to widen this to save on pull request runs too, but it will also depend on how many runners we provision and what cache churn rates are like in the main repo.
### Docker build layer caching
- Cached using the `gha` cache backend
- These cache blobs compete for space with `ccache`, `depends-sources` and `depends-built` caches
- `gha` cache allows `--cache-from` to be used from pull requests, which does not work using a registry cache type (technically we could use a public read-only token to get this working, but that feels wrong)
This backend does network i/o and so are marginally slower than our current disk i/o cache.
## But what about... `x`?
We have tested many other providers, including [Runs-on](https://runs-on.com/), [Buildjet](https://buildjet.com/), [WarpBuild](https://www.warpbuild.com/), and GitHub hosted runners (and investigated even more). But they all fall short in one-way or another.
- Runs-On and Buildjet (and others) require installing GH apps with much too-liberal permissions (e.g. `Administration: Read|Write`) for our use-case.
- GitHub hosted runners suffer from all of high costs, lower speed, small cache, and the requirement for a GitHub Teams subscription.
- WarpBuild seems to be simply too expensive.
## TODO:
To complete migration from self-hosted to hosted for this repo, the backport branches `27.x`, `28.x` and `29.x` would also need their CI ported, but these are left for followups to this change (and pending review/changes here first).
-----
Work and experimentation undertaken with m3dwards
ACKs for top commit:
maflcko:
re-ACK 3c5da69a23 🏗
m3dwards:
ACK 3c5da69a23
achow101:
ACK 3c5da69a23
janb84:
re ACK 3c5da69a23
Tree-SHA512: 9f7f2dddf1a5eebc56b4101663283d4219d189cda6054dba760f1288bed9e6ed3f2fa029a5caedc76c31b1271ea0a0cb0967a796086360d8f5be8277379b6397
2885bd0e1c doc: unify `datacarriersize` warning with release notes (Lőrinc)
Pull request description:
Follow-up to https://github.com/bitcoin/bitcoin/pull/32406
---
The [release notes](a189d63618/doc/release-notes-32406.md (L1)) claim
> [...] marked as deprecated and are expected to be removed in a future release
but the [warning itself](2885bd0e1c/src/init.cpp (L907)) claims
> [...] marked as deprecated. They **will** be removed in a future version.
To be less aggressive (since some have objected against this version online) - and to unify the deprecation warning with the release notes - I have changed the warning to communicate our expectation in a friendlier way.
ACKs for top commit:
cedwies:
ACK 2885bd0
ryanofsky:
Code review ACK 2885bd0e1c. I don't think it is good for the release notes and the runtime warning message to say two different things. I'd also be happy if release notes were updated to match the runtime warning, instead of vice versa. Whatever is more accurate is better.
ajtowns:
ACK 2885bd0e1c
kevkevinpal:
ACK [2885bd0](2885bd0e1c)
achow101:
ACK 2885bd0e1c
janb84:
ACK 2885bd0e1c
Zero-1729:
crACK 2885bd0e1c
jonatack:
ACK 2885bd0e1c
hodlinator:
ACK 2885bd0e1c
w0xlt:
ACK 2885bd0e1c
optout21:
ACK 2885bd0e1c
Tree-SHA512: a9d2a64ab96b3dd7f3a1a29622930054fd5c56e573bc96330f4ef3327dc024b21b3fbc8a698d17aea7c76f57f0c2ccd6403b2df344ae2f69c645ceb8b6fa54a5
ci/lint_run.sh: Only used in .cirrus.yml. Refer to test/lint/README.md on how to run locally.
ci/lint_run_all.sh: Only used in .cirrus.yml for stale re-runs of old pull request tasks.
Docker currently warns that we are missing a default value.
Set this to scratch which will error if an appropriate image tag is not
passed in to silence the warning.
Previously jobs were running on a large multi-core server where 10 jobs
as default made sense (or may even have been on the low side).
Using hosted runners with fixed (and lower) numbers of vCPUs we should
adapt compilation to match the number of cpus we have dynamically.
This is cross-platform compatible with macos and linux only.
When using hosted runners in combination with cached docker images,
there is the possibility that the host runner image is updated,
rendering the linux-headers package (stored in the cached docker image)
incompatible.
Fix this by doing a re-install of the headers package in
03_test_script.sh.
If the underlying runner kernel has not changed thie has no effect, but
prevents the job from failing if it has.
To remove multiple occurances of the respository name, against which we
compare `${{ github.repository }}` to check if we should use Cirrus
Runners, introduce a helper job which can check a single environment
variable and output this as an input to subsequent jobs.
Forks can maintain a trivial patch of their repo name against the
`REPO_USE_CIRRUS_RUNNERS` variable in ci.yml if they have Cirrus Runners
of their own, which will then enable cache actions and docker build
cache to use Cirrus Cache.
It's not possible to use `${{ env.USE_CIRRUS_RUNNERS }}` in the
`runs-on:` directive as the context is not supported by GitHub.
If it was, this job would no longer be necessary.
Whilst the action cirruslabs/actions/cache will automatically set this
host, the docker `gha` build cache backend will not be aware of it.
Set the value here, which will later be used in the docker build args to
enable docker build cache on the cirrus cache.
This sets the build dir at build time so that Apple SDK gets installed
in the correct/expected location for the runtime to find it.
Co-authored-by: Max Edwards <youwontforgetthis@gmail.com>
Reverts: e87429a2d0
This was added in PR #31545 with the intention that self-hosted runners
might use it to save build cache.
As we are not using hosted runners with a registry build cache, the bulk
of this commit can be reverted, simply using the value of
$DOCKER_BUILD_CACHE_ARG in the script.
link: https://github.com/bitcoin/bitcoin/pull/31545
Using buildx is required to properly load the correct driver, for use
with registry caching. Neither build, nor BUILDKIT=1 currently do this
properly.
Use of `docker buildx build` is compatible with podman.
Another action to reduce boilerplate in the main ci.yml file.
This action will set up a docker builder compatible with caching build
layers to a container registry using the `gha` build driver.
It will then configure the docker build cache args.
If set, Cirrus runners will be used on pushes to, and pull requests
against, this repository.
Forks can set this if they have their own cirrus runners.
Add "Restore" and "Save" caching actions.
These actions reduce boilerplate in the main ci.yml configuration file.
These actions are implemented so that caches will be saved on `push`
only.
When a pull request is opened it will cache hit on the caches from the
lastest push, or in the case of depends will hit on any matching depends
hash, falling back to partial matches.
Depends caches are hashed using
`$(git ls-tree HEAD depends "ci/test/$FILE_ENV" | sha256sum | cut -d' ' -f1)`
and this hash is passed in as an input to the actions. This means we
direct cache hit in cases where depends would not be re-built, otherwise
falling back to a partial match.
Previous releases cache is hashed similarly to depends, but using the
test/get_previous_releases.py file.
The cirruslabs cache action will fallback transparently to GitHub's
cache in the case that the job is not being run on a Cirrus Runner,
making these compatible with running on forks (on free GH hardware).
b7b249d3ad Revert "[refactor] rewrite vTxHashes as a vector of CTransactionRef" (Anthony Towns)
b9300d8d0a Revert "refactor: Simplify `extra_txn` to be a vec of CTransactionRef instead of a vec of pair<Wtxid, CTransactionRef>" (Anthony Towns)
df5a50e5de bench/blockencodings: add compact block reconstruction benchmark (Anthony Towns)
Pull request description:
Reconstructing compact blocks is on the hot path for block relay, so revert changes from #28391 and #29752 that made it slower. Also add a benchmark to validate reconstruction performance, and a comment giving some background as to the approach.
ACKs for top commit:
achow101:
ACK b7b249d3ad
polespinasa:
lgtm code review and tested ACK b7b249d3ad
cedwies:
code-review ACK b7b249d
davidgumberg:
crACK b7b249d3ad
instagibbs:
ACK b7b249d3ad
Tree-SHA512: dc266e7ac08281a5899fb1d8d0ad43eb4085f8ec42606833832800a568f4a43e3931f942d4dc53cf680af620b7e893e80c9fe9220f83894b4609184b1b3b3b42
493ba0f688 threading: reduce the scope of lock in getblocktemplate (kevkevinpal)
Pull request description:
This change was motivated by https://github.com/bitcoin/bitcoin/pull/32592#discussion_r2294770722
It does exactly what is said in the comment. Reducing the scope of the lock by a bit before it is needed
ACKs for top commit:
stickies-v:
re-ACK 493ba0f688
maflcko:
lgtm ACK 493ba0f688
Tree-SHA512: aa3a21ef3da6be6c0af78aa2dda61ee21c3f6d4d9c897413dba9e7d7d2a91e9e069bbc6b6684b45aadaa28d8603dd310f2c2d2e58c31bb4d864204e468fefaf1
509ffea40a ci: return to using dash in CentOS job (fanquake)
Pull request description:
`dash` is available again: https://bugzilla.redhat.com/show_bug.cgi?id=2335416.
ACKs for top commit:
maflcko:
lgtm ACK 509ffea40a
davidgumberg:
ACK 509ffea40a
janb84:
crACK 509ffea40a
Tree-SHA512: c57194b6158f6453cadb2487be232af5e37aa2234852f04a76fc80909fbfa48c7f8dd30e7be41be67dedb7ec4886930e165fdbaf746d358bb94c6ccc49d6bde6