Bug 1307080

Summary: nspawn catches kill signal only when using jenkins
Product: Red Hat Enterprise Linux 7 Reporter: John Walter <jwalter>
Component: systemd Assignee: systemd-maint
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.2 CC: fsumsal, jscotka, lnykryn, matthew, systemd-maint-list, tfrazier
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: systemd-219-22.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 00:51:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description John Walter 2016-02-12 16:14:11 UTC
Description of problem:
When running systemd 217 on Arch Linux, everything works exactly as expected. After upgrading to 218, nspawn always reports "Container root terminated by signal KILL." when it is run by Jenkins, but when I run it with the exact same environment through ssh, it succeeds. I have spent all day trying to reproduce the issue outside of Jenkins, without success.

I've tried:
217: Works
218: Does not work
master: Does not work
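
The description contrasts a Jenkins job with an ssh session. One difference that is cheap to rule in or out is whether standard input is a terminal: the verification in comment 6 triggers the failure with stdin attached to a pipe. The following is a minimal sketch of such a check. describe_stdin is a hypothetical helper name, and the idea that stdin is the relevant difference is an assumption, not something this report states.

```shell
#!/bin/bash
# describe_stdin is a hypothetical helper, not part of systemd or Jenkins.
# It reports whether file descriptor 0 (stdin) is attached to a terminal.
describe_stdin() {
    if [ -t 0 ]; then
        echo "tty"
    else
        echo "not-a-tty"
    fi
}
```

Calling describe_stdin once at the start of the Jenkins job and once from the ssh session makes the two environments directly comparable.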

Version-Release number of selected component (if applicable):
RHEL 7.2

How reproducible:
Very

Steps to Reproduce:
1. sudo /usr/bin/linux32 /usr/bin/systemd-nspawn --capability=CAP_MKNOD -D /srv/jenkins/stage_x86 /tmp/make-exherbo-stages

nspawn prints:

Spawning container stage_x86 on /srv/jenkins/stage_x86.
Press ^] three times within 1s to kill container.

Actual results:

Container stage_x86 terminated by signal KILL.
This happens for *every* single job in Jenkins, which is a huge problem.


Expected results:
It should build packages for Arch Linux using the makechrootpkg command provided by the Arch devtools, which in turn uses systemd-nspawn to create a clean environment for building the package.


Additional info:
The bug already has an upstream fix: http://cgit.freedesktop.org/systemd/systemd/commit/?id=9c857b9d160c10b4454fc9f83442c1878343422f

External bug: https://bugs.freedesktop.org/show_bug.cgi?id=87732

Comment 1 Lukáš Nykrýn 2016-02-12 16:18:22 UTC
The patch is really easy to backport, so devel_ack+

Comment 2 Matthew Gyurgyik 2016-02-12 19:37:04 UTC
Since the original report doesn't make it clear, this issue impacts RHEL 7.2, running systemd-219-19.el7.x86_64.
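
This comment names systemd-219-19.el7 as affected, and the bug header lists systemd-219-22.el7 under "Fixed In Version". A rough way to tell on which side of the fix a host sits is sketched below. has_fix is a hypothetical helper, and sort -V from GNU coreutils only approximates RPM's own version comparison, so treat the result as a hint (z-stream release strings in particular may not order as expected).

```shell
#!/bin/bash
# has_fix is a hypothetical helper. It succeeds when the given
# VERSION-RELEASE string sorts at or after 219-22.el7, the "Fixed In
# Version" from this report. sort -V only approximates RPM ordering.
has_fix() {
    local installed="$1" fixed="219-22.el7" lowest
    # The lower of the two strings sorts first; the fix is assumed
    # present when that lower string is the fixed version itself.
    lowest="$(printf '%s\n%s\n' "$installed" "$fixed" | sort -V | head -n 1)"
    [ "$lowest" = "$fixed" ]
}
```

A possible invocation is has_fix "$(rpm -q --qf '%{VERSION}-%{RELEASE}' systemd)", which passes the installed package's version and release to the helper.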

Example:
[root@jenkins-slave2 ~]# cat /var/lib/jenkins/nspawn/run.sh
#!/bin/bash
lxc_root="$(pwd)/nspawn-lxc"
systemd-nspawn -q -D "$lxc_root" --bind /var/lib/jenkins "$@"


Before patch:

+ sudo /var/lib/jenkins/nspawn/run.sh yum repolist
Container nspawn-lxc-rhel7-x86_64 terminated by signal KILL.

After patch:

+ sudo /var/lib/jenkins/nspawn/run.sh yum repolist
repo id                              repo name                            status
optional                             optional                             4385
os                                   os                                   4656
supplementary                        supplementary                          16
repolist: 9057

Comment 4 Lukáš Nykrýn 2016-05-24 11:54:34 UTC
pushed to staging -> https://github.com/lnykryn/systemd-rhel/commit/98e5c02b1602eaaac5c63045fa7a06e40249445e -> post

Comment 6 Frantisek Sumsal 2016-09-02 13:34:58 UTC
Verified with systemd-219-27.el7.

Old package:
# echo test | systemd-nspawn -D 'cont.TbCGW' dmesg
Spawning container cont.TbCGW on /tmp/tmp.yYz3EA2QW3/cont.TbCGW.
Press ^] three times within 1s to kill container.
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
<...>
Container cont.TbCGW terminated by signal KILL.
# echo $?
1

New package:
# echo test | systemd-nspawn -D 'cont.CSnRn' dmesg
Spawning container cont.CSnRn on /tmp/tmp.CGEoINzReM/cont.CSnRn.
Press ^] three times within 1s to kill container.
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
<...>
Container cont.CSnRn exited successfully
# echo $?
0
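
The old/new comparison above can be wrapped into a repeatable check. The sketch below runs an arbitrary command with stdin attached to a pipe, as in "echo test | systemd-nspawn ...", and reports PASS or FAIL based on the exit status and on the "terminated by signal KILL" message shown in this report. check_piped_run is a hypothetical helper name, not an existing tool.

```shell
#!/bin/bash
# check_piped_run is a hypothetical helper. It runs the given command
# with stdin connected to a pipe and inspects the result.
check_piped_run() {
    local output status
    # Capture stdout and stderr together so the status message is seen
    # regardless of which stream it is written to.
    output="$(echo test | "$@" 2>&1)"
    status=$?
    if [ "$status" -ne 0 ] ||
       printf '%s\n' "$output" | grep -q 'terminated by signal KILL'; then
        echo "FAIL (exit status $status)"
        return 1
    fi
    echo "PASS"
}
```

For the case verified above, the call would be check_piped_run systemd-nspawn -D cont.CSnRn dmesg, with the container directory adjusted to the local setup.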

Comment 8 errata-xmlrpc 2016-11-04 00:51:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2216.html