As @edolstra pointed out, the kernel module might be painful to
maintain. I strongly disagree, because it's only a small module and it's
good to have such a canary in the tests no matter what the bootup
process looks like, so I'm going the masochistic route and will try to
maintain it.
If it *really* becomes too much maintenance burden, we can still drop or
disable kcanary.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We don't want to push out a channel update whenever this test fails,
because that might have unexpected and confusing side effects and it
*really* means that stage 1 of our bootup is broken.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We already have a small regression test for #15226 within the swraid
installer test. Unfortunately, it only checks whether the md kthread
got signalled, but not whether other rampaging processes that *should*
have been killed are still alive.
So in order to do this we provide multiple canary processes which are
checked after the system has booted up:
* canary1: A simple forking daemon which just sleeps until it gets
           killed. Of course, we expect this process to no longer be
           alive after bootup.
* canary2: Similar to canary1, but tries to mimic a kthread to make
           sure that it's properly killed at the end of stage 1.
* canary3: Like canary2, but this time with a @ in front of its
           command name to actually prevent it from being killed.
* kcanary: This one is a real kthread and runs until it gets killed,
           which shouldn't happen.
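A rough sketch of what the post-boot check amounts to (the process
names are those of the canaries above; the exact check in the test
driver may differ):
  # canary1 must be gone after bootup, while canary3 (prefixed
  # with @) and the kcanary kthread must still be alive:
  if pgrep -f canary1 > /dev/null; then exit 1; fi
  pgrep -f canary3 > /dev/null || exit 1
  pgrep kcanary    > /dev/null || exit 1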
Tested with and without 67223ee and everything works as expected, at
least on my machine.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This is a regression test for #15226, so the test will fail once we
accidentally kill one or more of the md kthreads (i.e. if safe mode
gets enabled).
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Unfortunately, pkill doesn't distinguish between kernel and user space
processes, so we need to make sure we don't accidentally kill kernel
threads.
Normally, a kernel thread ignores all signals, but there are a few
kthreads that explicitly allow some. A quick grep over the kernel
source tree (as of kernel 4.6.0) shows the following source files which
use allow_signal():
drivers/isdn/mISDN/l1oip_core.c
drivers/md/md.c
drivers/misc/mic/cosm/cosm_scif_server.c
drivers/misc/mic/cosm_client/cosm_scif_client.c
drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
drivers/staging/rtl8188eu/core/rtw_cmd.c
drivers/staging/rtl8712/rtl8712_cmd.c
drivers/target/iscsi/iscsi_target.c
drivers/target/iscsi/iscsi_target_login.c
drivers/target/iscsi/iscsi_target_nego.c
drivers/usb/atm/usbatm.c
drivers/usb/gadget/function/f_mass_storage.c
fs/jffs2/background.c
fs/lockd/clntlock.c
fs/lockd/svc.c
fs/nfs/nfs4state.c
fs/nfsd/nfssvc.c
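For reference, the list above can be reproduced with something along
these lines (run from the root of a kernel source checkout; the exact
invocation used is an assumption):
  grep -rl 'allow_signal(' drivers/ fs/ | sort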
While not all of these are necessarily kthreads and some functionality
may remain unimpeded, killing them is still quite harmful and can cause
unexpected side effects, especially because some of these kthreads are
storage-related (and we obviously don't want to kill those during
bootup).
During discussion at #15226, @dezgeg suggested the following
implementation:
  for pid in $(pgrep -v -f '@'); do
      if [ "$(cat /proc/$pid/cmdline)" != "" ]; then
          kill -9 "$pid"
      fi
  done
This has a few downsides:
* User space processes which use an empty string in their command line
  won't be killed.
* It results in errors during bootup because some shell-related
  processes are already terminated (maybe it's pgrep itself, I haven't
  checked).
* The @ is searched for within the full command line, not just at the
  beginning of the string. Of course, we already had this behaviour
  until now, so it's not a problem specific to his implementation.
I posted an alternative implementation which doesn't suffer from the
first point, but even that one wasn't sufficient:
  for pid in $(pgrep -v -f '^@'); do
      readlink "/proc/$pid/exe" &> /dev/null || continue
      echo "$pid"
  done | xargs kill -9
This one spawns a subshell, which would be included in the processes to
kill and actually kills itself during the process.
So what we have now additionally checks whether the shell process
itself is in the list of processes to kill and avoids killing it, just
to be sure.
Also, we don't spawn a subshell anymore and use /proc/$pid/exe to
distinguish between user space and kernel processes, as described in
the comments of the following StackOverflow answer:
http://stackoverflow.com/a/12231039
We don't need to take care of terminating processes, because what we
actually want IS to terminate the processes.
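To illustrate, the resulting loop has roughly the following shape (a
simplified sketch, not the verbatim stage 1 code):
  for pid in $(pgrep -v -f '^@'); do
      # Kernel threads have no readable /proc/$pid/exe, so skip them.
      readlink "/proc/$pid/exe" > /dev/null 2>&1 || continue
      # Never kill the shell that is doing the killing.
      [ "$pid" -eq "$$" ] && continue
      kill -9 "$pid"
  done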
The only point where this (and any previous) approach falls short is
when we have processes that act like fork bombs, because they might
spawn additional processes between the pgrep and the killing. We could
only address this with process/control groups, and even that won't save
us, because the root user can escape from those as well.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Fixes: #15226
Instead of using this option, please modify the dovecot package by means of an
override. For example:
  nixpkgs.config.packageOverrides = super: {
    dovecot = super.dovecot.override { withPgSQL = true; };
  };
Closes https://github.com/NixOS/nixpkgs/issues/14097.
Just removing the system argument because it doesn't exist (it's
actually config.nixpkgs.system, which we're already using). We wouldn't
get an error anyway because it's not actually used, so this is just an
aesthetic fix.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Make sure that we always have everything available within the store of
the VM, so let's evaluate/build the test container fully on the host
system and propagate all dependencies to the VM.
This way, even if there are additional default dependencies that come
with containers in the future, we should be on the safe side, as these
dependencies should now be included for the test as well.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Cc: @kampfschlaefer, @edolstra
This partially reverts f2d24b9840.
Instead of disabling the channels via removing the channel mapping from
the tests themselves, let's just explicitly reference the stable test in
release.nix. That way it's still possible to run the beta and dev tests
via something like "nix-build nixos/tests/chromium.nix -A beta" and
achieve the same effect of not building beta and dev versions on Hydra.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
It's not the job of Nixpkgs to distribute beta versions of upstream
packages. More importantly, building these delays channel updates by
several hours, which is bad for our security fix turnaround time.
* Perform HTTP HEAD request instead of full GET (lighter weight)
* Don't log output of curl to the journal (it's noise/debug)
* Use explicit http:// URL scheme
* Reduce poll interval from 10s to 2s (respond to state changes
  quicker). Probably not relevant on boot (lots of services compete
  for the CPU), but online service restarts/reloads should be quicker.
* Pass --fail to curl (should be more robust against false positives)
* Use 4 space indent for shell code.
The current postStart code holds Jenkins off the "started" state until
Jenkins becomes idle. But it should be enough to wait until Jenkins
starts handling HTTP requests to consider it "started".
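A rough sketch of what such a readiness check amounts to (the port and
URL are placeholders, not necessarily what the module uses):
  # Consider Jenkins "started" as soon as it answers HTTP requests.
  until curl --fail --silent --head http://localhost:8080/ > /dev/null; do
      sleep 2
  done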
More reasons why the current approach is bad and we should remove it,
from @coreyoconnor in
https://github.com/NixOS/nixpkgs/issues/14991#issuecomment-216572571:
1. Repeatedly curling for a specific human-readable string to
   determine "Active" is fragile. For instance, what happens when
   jenkins is localized?
2. The time jenkins takes to initialize is variable. This (at least
   used to) depend on the number of jobs and any plugin upgrades
   requested.
3. Jenkins can be requested to restart from the UI, which will not
   affect the status of the service. This means that the service being
   "active" does not imply jenkins is initialized. Downstream services
   cannot assume jenkins is initialized if the service is active. Might
   as well accept that and remove the initialized test from service
   startup.
Fixes #14991.
Regression introduced by dfe608c8a2.
The commit turns the two arguments into one attrset argument so we need
to adapt that to use the new calling convention.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This allows the use of <olink> tags inside NixOS options to reference
sections of the manual. I originally introduced this in #14476 to
reference the Taskserver-specific documentation from the options
reference, but as suggested by @nbp, this is done as a separate pull
request to ensure greater visibility rather than being "hidden" in the
Taskserver branch.
The build time for the manual is around 30s on my machine without this
change and 34s with this change, so it shouldn't have a very big impact
on the build time of the manual.
Olinks between the options reference and the manual will now look like
this:
"More instructions about NixOS in conjunction with Taskserver can be
found in the NixOS manual at Chapter 15, Taskserver."
More documentation about olinks can be found here:
http://www.sagehill.net/docbookxsl/Olinking.html
Acked-by: Eelco Dolstra <eelco.dolstra@logicblox.com>
A sane backend for recent Brother scanners.
Depends on the presence of etc files generated by the
NixOS module of the same name.
Supports network scanner specification through the
NixOS module.
The Nix store squashfs is stored inside the initrd instead of separately
(cherry picked from commit 976fd407796877b538c470d3a5253ad3e1f7bc68)
Signed-off-by: Domen Kožar <domen@dev.si>
A user noticed the example for `hosts`, took the `mode` permissions
literally, and ended up with surprising behaviour on their system.
Updating the documentation to not reference a real config file which
might have real permission requirements.
The pre-sleep service exits if any command fails. Unloading facetimehd
without it being loaded blocks subsequent commands from running.
Note: `modprobe -r` works a bit better when unloading unused modules,
and is preferable to `rmmod`. However, the facetimehd module does not
support suspending. In this case, it seems preferable to forcefully
unload the module. `modprobe` does not support a `--force` flag when
removing, so we are left with `rmmod`.
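A sketch of the kind of guard this boils down to (the exact pre-sleep
commands may differ):
  # Only unload facetimehd if it is actually loaded, so that a failing
  # rmmod doesn't abort the remaining pre-sleep commands.
  if lsmod | grep -q '^facetimehd '; then
      rmmod facetimehd
  fi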
See:
- https://github.com/NixOS/nixpkgs/pull/14883
- https://github.com/patjak/bcwc_pcie/wiki#known-issues
Basic hardening
- Run as nobody:nogroup with a private /tmp, /home & /run/user
- Create working directory under /run (hoogle insists on writing to cwd
  and otherwise returns "something went wrong" to every query)
Option tweaks
- Provide a default for the haskellPackage option
- Set text values for defaults
- Move hoogleEnv to the top-level & simplify it
This command was useful when NixOS was spread across multiple
repositories, but now it's pretty pointless (and obfuscates what
happens, i.e. "git clone git://github.com/NixOS/nixpkgs.git").
Note: I ignored the C++ libraries, but it appears we're not currently
using them. Once we do, we'll probably want to put them in a separate
output as well (to prevent non-C++ users from depending on Boost).
Two fixes:
* Not really sure why removing `--fail` from the curl calls is
  necessary, but with that option, curl erroneously reports 404 (which
  it shouldn't, per my interactive VM testing).
* Fix paths to the example files used for the printing test.
Together, these changes allow the test to run to completion on my
machine.
Need to pass `cups.out` to `systemd.packages`, lest we end up with an invalid
generated unit containing only directives set in the service module.
This patch gives us a valid cups.service unit but, vexingly, does not fix the
test failure at NixOS/nixpkgs#14748
`dbus-launch` is executed early in the script, before desktop managers
have had a chance to set up the environment. If DBus activation is
used, applications launched that way may therefore lack necessary
environment variables. This patch sends the complete environment to
DBus after launching the desktop manager.
With the merge of the closure-size branch, most packages now have
multiple outputs. One of these packages is gnutls, so previously all we
needed to do was reference "${gnutls}/bin/..." and now we need to use
"${gnutls.bin}/bin/...".
So it's not a very big issue to fix.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This adds a Taskserver module along with documentation and a small
helper tool which eases managing a custom CA along with Taskserver
organisations, users and groups.
Taskserver is the server component of Taskwarrior, a TODO list
application for the command line.
The work has been started by @matthiasbeyer back in mid 2015 and I have
continued to work on it recently, so this merge contains commits from
both of us.
Thanks particularly to @nbp and @matthiasbeyer for reviewing and
suggesting improvements.
I've tested this with the new test (nixos/tests/taskserver.nix) this
branch adds and it fails because of the changes introduced by the
closure-size branch, so we need to do additional work on top of this.
This reverts commit 1d77dcaed3.
It will be reintroduced along with #14700 as a separate branch, as
suggested by @nbp.
I added this to this branch because I thought it was a necessary
dependency, but it turns out that the build of the manual/manpages still
succeeds and merely prints a warning like this:
  warning: failed to load external entity "olinkdb.xml"
  Olink error: could not open target database 'olinkdb.xml'.
  Error: unresolved olink: targetdoc/targetptr = 'manual/module-taskserver'.
The olink itself will be replaced by "???", so users looking at the
description of the option in question will still see the reference to
the NixOS manual, like this:
More instructions about NixOS in conjunction with Taskserver can be
found in the NixOS manual at ???.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
My first attempt to do this was to just use a conditional <refsection/>
in order to not create exact references in the manpage but create the
reference in the HTML manual, as suggested by @edolstra on IRC.
Later I went on to use <olink/> to reference sections of the manual, but
in order to do that, we need to overhaul how we generate the manual and
manpages.
So, that's where we are now:
There is a new derivation called "manual-olinkdb", which is the olinkdb
for the HTML manual, which in turn creates the olinkdb.xml file and the
manual.db. The former contains the targetdoc references and the latter
the specific targetptr elements.
The reason why I included the olinkdb.xml verbatim is that first of all
the DTD is dependent on the Docbook XSL sources and the references
within the olinkdb.xml entities are relative to the current directory.
So using a store path for that would end up searching for the manual.db
directly in /nix/store/manual.db.
Unfortunately, the <olinks/> that end up in the output file are
relative, so for example if you're clicking on one of these within the
PDF, the URL is searched in the current directory.
However, the sections from the olink's text are still valid, so we could
use an alternative URL for that in the future.
The manual doesn't contain any links, so even referencing the relative
URL shouldn't do any harm.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Cc: @edolstra
Suggested by @nbp:
"Choose a better organization name in this example, such that it is less
confusing. Maybe something like my-company"
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
It was failing with a `Read-only file system` error due to the systemd
service option `ReadWriteDirectories` not being correctly configured.
Fixes #14132
Continuation of 79c3c16dcbb3b45c0f108550cb89ccd4fc855e3b. Systemd 229
sets the default RLIMIT_CORE to infinity, causing systems to be
littered with core dumps when systemd.coredump.enable is disabled.
This restores the 15.09 soft limit of 0 and hard limit of infinity.
This module adds support for defining a flexget service.
Due to flexget insisting on being able to write all over the place
where it finds its configuration file, we use an ExecStartPre hook to
copy the generated configuration file into place under the user's home.
It's fairly ugly and I'm very open to suggestions.
Coreutils is multi-output and the `info` output doesn't seem to be
included on the install disk, failing like this (because now nix-env
wants to build coreutils):
````
machine# these derivations will be built:
machine# /nix/store/0jk4wzg11sa6cqyw8g7w5lb35axji969-bison-3.0.4.tar.gz.drv
...
machine# /nix/store/ybjgqwxx63l8cj1s7b8axx09wz06kxbv-coreutils-8.25.drv
machine# building path(s) ‘/nix/store/4xvdi5740vq8vlsi48lik3saz0v5jsx0-coreutils-8.25.tar.xz’
machine# downloading ‘http://ftpmirror.gnu.org/coreutils/coreutils-8.25.tar.xz’...
machine# error: unable to download ‘http://ftpmirror.gnu.org/coreutils/coreutils-8.25.tar.xz’: Couldn't resolve host name (6)
machine# builder for ‘/nix/store/5j3bc5sjr6271fnjh9gk9hrid8kgbpx3-coreutils-8.25.tar.xz.drv’ failed with exit code 1
machine# cannot build derivation ‘/nix/store/ybjgqwxx63l8cj1s7b8axx09wz06kxbv-coreutils-8.25.drv’: 1 dependencies couldn't be built
machine# error: build of ‘/nix/store/ybjgqwxx63l8cj1s7b8axx09wz06kxbv-coreutils-8.25.drv’ failed
````
We have already revamped the CLI subcommands in commit
e2383b84f8.
This was just a leftover artifact from before that change.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
The options client.allow and client.deny have been gone since commit
8b793d1916, so let's fix that.
No feature changes, only fixes the descriptions of allowedClientIDs and
disallowedClientIDs.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This is the recommended way for long-running services and ensures that
Taskserver will keep running until it has been stopped manually.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Using requiredBy is a bad idea for the initialisation units, because
whenever the Taskserver service is restarted the initialisation units
get restarted as well.
Also, make sure taskserver-init.service will be ordered *before*
taskserver.service.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
The Taskserver doesn't need access to the full /dev nor does it need a
shared /tmp. In addition, the initialisation services don't need network
access, so let's constrain them to the loopback device.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Apart from the options manual, this should cover the basics for setting
up a Taskserver. I am not a native speaker so this can and (probably)
should be improved, especially the wording/grammar.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Try to make the subcommands act more like the subcommands of the taskd
binary and also add a subcommand to list groups.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
As suggested by @matthiasbeyer:
"We might add a short note that this port has to be opened in the
firewall, or is this done by the service automatically?"
This commit now adds the listenPort to
networking.firewall.allowedTCPPorts whenever the listenHost is not
"localhost".
In addition to that, this is now also documented in the listenHost
option declaration and I have removed disabling of the firewall from the
VM test.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
No changes in functionality but rather just restructuring the module
definitions to be one mkMerge, which now uses mkIf from the top-level
scope of the CA initialization service so we can better abstract
additional options we might need there.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We want to make sure that the helper tool won't work if the automatic CA
wasn't properly set up. This not only avoids race conditions if the tool
is started before the actual service is running but it also fails if
something during CA setup has failed so the user can investigate what
went wrong.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We need to explicitly make sure the CA is created before we actually
launch the main Taskserver service, in order to avoid race conditions
where the preStart phase of the main service could possibly corrupt
certificates if it were started in parallel.
This simply allows adding raw lines to the generated configuration
file. The reason why I didn't go for an attribute set is that the
taskdrc file format doesn't map very well onto Nix attribute sets; for
example, the following can be set in taskdrc:
  server = somestring
  server.key = anotherstring
Using a Nix attribute set for that would be way too complicated; for
example, to represent the snippet above we'd have to do something like
this:
  { server._top = somestring;
    server.key = anotherstring;
  }
Of course, this would work as well, but nothing is simpler than just
appending raw strings.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
At least this should allow for some customisation of how the
certificates and keys are created. We now have two sub-namespaces
within PKI, so it should be clearer which options you have to set if
you want to either manage your own CA or let the module create it
automatically.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Whenever the nixos-taskserver tool is invoked manually to create an
organisation/group/user, we now add an empty file called .imperative to
the data directory.
During the preStart of the Taskserver service, we use process-json,
which in turn now checks whether those .imperative files exist and, if
so, leaves the corresponding entries alone.
This should now ensure that a manually created user doesn't get killed
off by the declarative configuration just because it doesn't exist
within that configuration.
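Conceptually, the check in process-json boils down to something like
this (paths and variable names are made up for illustration):
  # Entries created imperatively via nixos-taskserver carry a marker
  # file and are skipped; only unmarked entries are managed
  # declaratively.
  for org in "$dataDir/orgs/"*; do
      [ -e "$org/.imperative" ] && continue
      # ... apply the declarative configuration for this organisation
  done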
In addition, we also add a small subtest to check whether this is
happening or not and fail if the imperatively created user got deleted
by process-json.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We only print the output whenever there is an error; otherwise we keep
it quiet, because it only shows information the user can gather through
other means, for example by invoking certtool manually or by just
looking at the private key files (the whole blurb it outputs is in
there as well).
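The pattern used is roughly the following (the certtool invocation and
the $key variable are just placeholders):
  # Capture certtool's output and only show it if the command fails.
  if ! output="$(certtool --generate-privkey --outfile "$key" 2>&1)"; then
      echo "$output" >&2
      exit 1
  fi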
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We were piping the whole output of "nixos-taskserver export-user" from
the server to the respective client, and on every such operation the
whole output was shown again in the test log.
Now we're *only* showing these details whenever a user import fails on
the client.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Now we can finally delete organisations, groups and users along with
revoking their certificates. The new subtests make sure that the client
certificate is also revoked (both when removing the whole organisation
and when removing just a single user).
If we use the imperative way to add and delete users, we have to restart
the Taskserver in order for the CRL to be effective.
However, by using the declarative configuration we now get this for
free, because removing a user will also restart the service and thus its
client certificate will end up in the CRL.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Unfortunately we don't have a better way to check whether the reload has
been done successfully, but at least we now *can* reload it without
figuring out the exact signal to send to the process.
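In practice that means a plain (assuming the unit name used by this
module):
  systemctl reload taskserver.service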
Note that on reload, Taskserver will not reload the CRL file. For that
to work, a full restart needs to be done.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
If we want to revoke client certificates and want the server to actually
notice the revocation, we need to have a valid certificate revocation
list.
Right now expiration_days is set to 10 years, but that's merely to get
certtool to actually generate the CRL without prompting for user input.
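For reference, generating such a CRL with certtool looks roughly like
this (file names are placeholders, not the paths used by the module):
  # The template only exists to keep certtool from prompting
  # interactively.
  printf 'expiration_days = 3650\n' > crl.template
  certtool --generate-crl \
      --load-ca-privkey ca.key.pem \
      --load-ca-certificate ca.cert.pem \
      --template crl.template \
      --outfile server.crl.pem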
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
It doesn't do much harm to make the server certificate world readable,
because even though it's not accessible anymore via the file system,
someone can still get it by simply doing a TLS handshake with the
server.
So this is solely for consistency.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>