Bug 1655542
Summary: | Unable to execute virt-customize | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Arie Bregman <abregman> |
Component: | openstack-selinux | Assignee: | Zoli Caplovic <zcaplovi> |
Status: | CLOSED WONTFIX | QA Contact: | Jon Schlueter <jschluet> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 12.0 (Pike) | CC: | abregman, berrange, dsariel, jschluet, kchamart, lhh, lvrabec, mburns, mgrepl, mpryc, ptoscano, rjones |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-01-14 21:29:59 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Attachments: |
Description
Arie Bregman
2018-12-03 12:11:49 UTC
I asked the SELinux team to review this BZ. It might indicate an issue with SELinux and may be related to the recent SELinux issues (like BZ 1640528).

(In reply to Arie Bregman from comment #0)
> Description of problem:
>
> Suddenly all of our processes, which are using virt-customize to customize
> OpenStack images, are failing while running the following command:
>
> virt-customize --verbose --selinux-relabel --update -a overcloud-full.qcow2
>
> The error is:
>
> ---------------------------------------
> Updating : selinux-policy-3.13.1-229.el7_6.6.noarch
> 33/392
> Updating : selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
> 34/392
> warning: %post(selinux-policy-targeted-3.13.1-229.el7_6.6.noarch) scriptlet
> failed, signal 9

That "signal 9" is SIGKILL, which is "used to cause immediate program termination. It cannot be handled or ignored, and is therefore always fatal. It is also not possible to block this signal".

Some questions:

- What, precisely, is causing the SIGKILL here?
- Is this consistently reproducible?
- Can we get a botched disk image, so that we can re-run the `yum` transaction to see what is causing the failure?

> Non-fatal POSTIN scriptlet failure in rpm package
> selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
> virt-customize: error: yum -y update: command exited with an error

A side note: instead of `yum -y update`, you want `yum -y update || true`. That way, if a command is permitted to fail, the script will proceed further instead of aborting.

[...]

You might also try replacing --update temporarily with a full command, e.g.:

    virt-customize [...] --run-command "yum -y -d 10 -v update" [...]

so you can see what's really going on with yum/rpm.

If you like, you can replace the --update flag with:

    --run-command 'yum -y update --skip-broken'

I'm unable to reproduce the problem. Here are my three attempts (after some trial-and-error) at reproducers. All of them succeed.

Method-A1
---------

Get a fresh copy of 'overcloud-full.qcow2', and now try to run (notice the "--skip-broken"):

    $ time virt-customize -v -x --selinux-relabel \
        --run-command 'yum -y update --skip-broken -d 10 -v' \
        -a v4-overcloud-full.qcow2 \
        |& tee method-A1.txt

The `yum update` succeeds. The full log is in the attachment.

Method-A2
---------

Again, get a fresh copy of 'overcloud-full.qcow2', and now try to run (here we run *without* the "--skip-broken"):

    $ time virt-customize -v -x --selinux-relabel \
        --run-command 'yum -y update -d 10 -v' \
        -a v4-overcloud-full.qcow2 \
        |& tee method-A2.txt

Here, too, the `yum update` succeeds. The full log is in the attachment.

Method-B
--------

Trying to run `yum` manually from the same OverCloud image in the environment Michal gave me.

(1) On the UnderCloud environment, navigate to the "backup-overcloud" directory, and copy the QCOW2 file, 'vmlinuz' and 'initrd' into a directory called "v3":

    $ mkdir /home/stack/v3
    $ cd /home/stack/backup-overcloud/
    $ cp overcloud-full.qcow2 ~/v3/v3-overcloud-full.qcow2
    $ cp overcloud-full.vmlinuz overcloud-full.initrd ~/v3/

(2) Reset the root password to "empty" on the 'v3-overcloud-full.qcow2':

    $ cd /home/stack/v3/
    $ virt-edit -a v3-overcloud-full.qcow2 /etc/passwd -e 's/^root:.*?:/root::/'

(3) Then, import the 'v3-overcloud-full.qcow2' image (with the given 'kernel' and 'initrd') into libvirt:

    $ sudo virt-install --name v3-overcloud-full-with-kernel \
        --ram 2048 --disk path=./v3-overcloud-full.qcow2,format=qcow2 \
        --machine q35 --os-variant fedora27 --cpu host-passthrough \
        --nographics --network default \
        --boot kernel=`pwd`/overcloud-full.vmlinuz,initrd=`pwd`/overcloud-full.initrd,kernel_args="panic=1 console=ttyS0 root=/dev/vda selinux=0" \
        --import

(4) Then, SSH into the guest, and run:

    $ yum update -y -d 10 -v

The `yum update` succeeds. The full log is in the attachment.

* * *

Created attachment 1512538 [details]
Log of method-A1 (`virt-customize` with "yum --skip-broken")
Created attachment 1512594 [details]
Log of method-A2 (`virt-customize` without "yum --skip-broken")
Created attachment 1512595 [details]
Log of method-B (interactive run of `yum update -y -d 10 -v`)
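For reference, the two shell behaviours discussed above (a scriptlet dying with "signal 9", and `|| true` letting a script continue past a failing command) can be reproduced without any disk image. This is only a minimal sketch: the child `sh` that kills itself stands in for the failing %post scriptlet, and the variable names `status`, `signame` and `masked` are made up for illustration.

```shell
#!/bin/sh
# A child shell sends SIGKILL to itself, standing in for the killed scriptlet.
status=0
sh -c 'kill -9 $$' || status=$?

# The shell reports a signal death as 128 + signal number.
echo "exit status: $status"

# Map the signal number back to its name.
signame=$(kill -l $((status - 128)))
echo "signal: SIG$signame"

# With `|| true` the same failure is masked and the script carries on.
sh -c 'kill -9 $$' || true
masked=$?
echo "exit status with '|| true': $masked"
```

Note that `|| true` hides every failure of the command, not only the scriptlet warning, so it is best kept to steps that really are allowed to fail.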
I've tried to apply the workaround of first performing the update with --skip-broken and then without; however, I am now seeing a different type of error:

    Cannot allocate memory
    Non-fatal POSTIN scriptlet failure in rpm package fence-agents-common-4.2.1-11.el7_6.1.x86_64
    --
    error: Couldn't fork %post(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
    Non-fatal POSTIN scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
    error: Couldn't fork %triggerin(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
    Non-fatal <unknown> scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
    Updating : cronie-anacron-1.4.11-20.el7_6.x86_64 38/286
    error: Couldn't fork %post(cronie-anacron-1.4.11-20.el7_6.x86_64): Cannot allocate memory
    Non-fatal POSTIN scriptlet failure in rpm package cronie-anacron-1.4.11-20.el7_6.x86_64

This looks similar to the RDO bug: https://bugs.launchpad.net/tripleo/+bug/1718965

This case was a virt-customize run in a CI environment using a method that is not recommended. It's a RHEL policy issue, not an openstack-selinux one. Given that it's CI, we can either run in permissive mode for that command, or adjust to use the supported backend.
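For anyone hitting the same "Cannot allocate memory" fork errors, the two-pass workaround could be combined with a larger libguestfs appliance, roughly as sketched below. This rests on an assumption that is not confirmed anywhere in this report, namely that the appliance is running short of memory while rpm forks its scriptlets. `--memsize` is an existing virt-customize option; the 4096 MB value, the `IMAGE`, `MEMSIZE_MB` and `DRY_RUN` variables and the `customize` helper are hypothetical. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: two-pass update with a larger libguestfs appliance.
# IMAGE, MEMSIZE_MB, DRY_RUN and customize() are illustrative names,
# not taken from the bug report.
IMAGE=${IMAGE:-overcloud-full.qcow2}
MEMSIZE_MB=${MEMSIZE_MB:-4096}   # assumed value; tune for the host
DRY_RUN=${DRY_RUN:-1}            # 1 = print the command, 0 = run it

customize() {
    if [ "$DRY_RUN" = 1 ]; then
        # Print the command line (argument quoting is not preserved here).
        echo "virt-customize --memsize $MEMSIZE_MB $* -a $IMAGE"
    else
        virt-customize --memsize "$MEMSIZE_MB" "$@" -a "$IMAGE"
    fi
}

# Pass 1 tolerates broken packages; pass 2 is a plain update plus relabel.
customize --run-command 'yum -y update --skip-broken'
customize --run-command 'yum -y update' --selinux-relabel
```

Run with `DRY_RUN=0` to execute the commands for real. If the failures persist with more memory, the SELinux explanation given above remains the more likely cause.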