From 67223ee205364afb203361b134f16b890c4d726c Mon Sep 17 00:00:00 2001 From: aszlig Date: Fri, 6 May 2016 12:36:58 +0200 Subject: nixos/stage-1: Don't kill kernel threads Unfortunately, pkill doesn't distinguish between kernel and user space processes, so we need to make sure we don't accidentally kill kernel threads. Normally, a kernel thread ignores all signals, but there are a few that do. A quick grep on the kernel source tree (as of kernel 4.6.0) shows the following source files which use allow_signal(): drivers/isdn/mISDN/l1oip_core.c drivers/md/md.c drivers/misc/mic/cosm/cosm_scif_server.c drivers/misc/mic/cosm_client/cosm_scif_client.c drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c drivers/staging/rtl8188eu/core/rtw_cmd.c drivers/staging/rtl8712/rtl8712_cmd.c drivers/target/iscsi/iscsi_target.c drivers/target/iscsi/iscsi_target_login.c drivers/target/iscsi/iscsi_target_nego.c drivers/usb/atm/usbatm.c drivers/usb/gadget/function/f_mass_storage.c fs/jffs2/background.c fs/lockd/clntlock.c fs/lockd/svc.c fs/nfs/nfs4state.c fs/nfsd/nfssvc.c While not all of these are necessarily kthreads and some functionality may still be unimpeded, it's still quite harmful and can cause unexpected side-effects, especially because some of these kthreads are storage-related (which we obviously don't want to kill during bootup). During discussion at #15226, @dezgeg suggested the following implementation: for pid in $(pgrep -v -f '@'); do if [ "$(cat /proc/$pid/cmdline)" != "" ]; then kill -9 "$pid" fi done This has a few downsides: * User space processes which use an empty string in their command line won't be killed. * It results in errors during bootup because some shell-related processes are already terminated (maybe it's pgrep itself, haven't checked). * The @ is searched within the full command line, not just at the beginning of the string. Of course, we already had this until now, so it's not a problem of his implementation. I posted an alternative implementation which doesn't suffer from the first point, but even that one wasn't sufficient: for pid in $(pgrep -v -f '^@'); do readlink "/proc/$pid/exe" &> /dev/null || continue echo "$pid" done | xargs kill -9 This one spawns a subshell, which would be included in the processes to kill and actually kills itself during the process. So what we have now is even checking whether the shell process itself is in the list to kill and avoids killing it just to be sure. Also, we don't spawn a subshell anymore and use /proc/$pid/exe to distinguish between user space and kernel processes like in the comments of the following StackOverflow answer: http://stackoverflow.com/a/12231039 We don't need to take care of terminating processes, because what we actually want IS to terminate the processes. The only point where this (and any previous) approach falls short if we have processes that act like fork bombs, because they might spawn additional processes between the pgrep and the killing. We can only address this with process/control groups and this still won't save us because the root user can escape from that as well. Signed-off-by: aszlig Fixes: #15226 --- nixos/modules/system/boot/stage-1-init.sh | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) (limited to 'nixos/modules/system') diff --git a/nixos/modules/system/boot/stage-1-init.sh b/nixos/modules/system/boot/stage-1-init.sh index 1f8779abf0c3..9bffcd31b9b4 100644 --- a/nixos/modules/system/boot/stage-1-init.sh +++ b/nixos/modules/system/boot/stage-1-init.sh @@ -439,8 +439,18 @@ eval "exec $logOutFd>&- $logErrFd>&-" # Kill any remaining processes, just to be sure we're not taking any # with us into stage 2. But keep storage daemons like unionfs-fuse. -pkill -9 -v -f '@' - +# +# Storage daemons are distinguished by an @ in front of their command line: +# https://www.freedesktop.org/wiki/Software/systemd/RootStorageDaemons/ +local pidsToKill="$(pgrep -v -f '^@')" +for pid in $pidsToKill; do + # Make sure we don't kill kernel processes, see #15226 and: + # http://stackoverflow.com/questions/12213445/identifying-kernel-threads + readlink "/proc/$pid/exe" &> /dev/null || continue + # Try to avoid killing ourselves. + [ $pid -eq $$ ] && continue + kill -9 "$pid" +done if test -n "$debug1mounts"; then fail; fi -- cgit 1.4.1