Upgrades Hydra to the latest master/flake branch. To perform this
upgrade, it's needed to do a non-trivial db-migration which provides a
massive performance-improvement[1].
The basic ideas behind multi-step upgrades of services between NixOS versions
have been gathered already[2]. For further context it's recommended to
read this first.
Basically, the following steps are needed:
* Upgrade to a non-breaking version of Hydra with the db-changes
(columns are still nullable here). If `system.stateVersion` is set to
something older than 20.03, the package will be selected
automatically, otherwise `pkgs.hydra-migration` needs to be used.
* Run `hydra-backfill-ids` on the server.
* Deploy either `pkgs.hydra-unstable` (for Hydra master) or
`pkgs.hydra-flakes` (for flakes-support) to activate the optimization.
The steps are also documented in the release-notes and in the module
using `warnings`.
`pkgs.hydra` has been removed as latest Hydra doesn't compile with
`pkgs.nixStable` and to ensure a graceful migration using the newly
introduced packages.
To verify the approach, a simple vm-test has been added which verifies
the migration steps.
[1] https://github.com/NixOS/hydra/pull/711
[2] https://github.com/NixOS/nixpkgs/pull/82353#issuecomment-598269471
* nixos/buildkite: drop user option
This reverts 8c6b1c3eaa.
Turns out, buildkite-agent has logic to write .ssh/known_hosts files and
only really works when $HOME and the user homedir are in sync.
On top of that, we provision ssh keys in /var/lib/buildkite-agent, which
doesn't work if that other users' homedir points elsewhere (we can cheat
by setting $HOME, but then getent and $HOME provide conflicting
results).
So after all, it's better to only run the system-wide buildkite agent as
the "buildkite-agent" user only - if one wants to run buildkite as
different users, systemd user services might be a better fit.
* nixosTests.buildkite-agent: add node with separate user and no ssh key
This gets passed to BUILDKITE_SHELL, which will specify the shell being
used to executes script in.
Defaults to `${pkgs.bash}/bin/bash -e -c`, matching how buildkite
behaves on other distros.
SSH public keys aren't needed to clone private repos, and if we only
need to configure a single attribute, there's no need for the "openssh"
attrset anymore.
This applies [hydra PR #432](https://github.com/NixOS/hydra/pull/432)
to the NixOS module in nixpkgs:
```
commit 4efd078977e5ea20e1104783efc324cba11690bc
Author: Bas van Dijk <v.dijk.bas@gmail.com>
Date: Sun Dec 11 15:35:38 2016 +0100
Only set buildMachinesFiles when nix.buildMachines is defined
```
The following commit from 2016 in hydra removed the `--option
build-use-substitutes` from the hydra-queue-runner service:
```
commit ee2e9f5335c8c0288c102975b506f6b275793cfe
Author: Eelco Dolstra <edolstra@gmail.com>
Date: Fri Oct 7 20:23:05 2016 +0200
Update to reflect BinaryCacheStore changes
BinaryCacheStore no longer implements buildPaths() and ensurePath(),
so we need to use copyPath() / copyClosure().
```
It would be better if the hydra module in NixOS matches the upstream
module.
During the last update, `hydra-notify` was rewritten as a daemon which
listens to postgresql notifications for each build[1]. The module
uses the `hydra-notify.service` unit from upstream's Hydra module and
the VM test ensures that email notifications are sent properly.
Also updated `hydra-init.service` to install `pg_trgm` on a local
database if needed[2].
[1] c7861b85c4
[2] 8a0a5ec3a3
Currently there are two calls to curl in the reloadScript, neither which
check for errors. If something is misconfigured (like wrong authToken),
the only trace that something wrong happened is this log message:
Asking Jenkins to reload config
<h1>Bad Message 400</h1><pre>reason: Illegal character VCHAR='<'</pre>
The service isn't marked as failed, so it's easy to miss.
Fix it by passing --fail to curl.
While at it:
* Add $curl_opts and $jenkins_url variables to keep the curl command
lines DRY.
* Add --show-error to curl to show short error message explanation when
things go wrong (like HTTP 401 error).
* Lower-case the $CRUMB variable as upper case is for exported environment
variables.
The new behaviour, when having wrong accessToken:
Asking Jenkins to reload config
curl: (22) The requested URL returned error: 401
And the service is clearly marked as failed in `systemctl --failed`.
@cleverca found this bug in the declarative hooks config. Any shell
variables referenced in a hook script would get expanded by the hooks
directory builder.
Prevent variable expansion by quoting the here doc limit string.
Pass the -L flag to curl to make it follow redirects. This fixes an
issue I found when setting up reverse proxy for Jenkins. Without this
fix, the returned HTTP code was stuck at 302, making postStart fail the
service (it expects 200 or 403).