Description of problem:

Suddenly all of our processes, which are using virt-customize to customize OpenStack images, are failing while running the following command:

virt-customize --verbose --selinux-relabel --update -a overcloud-full.qcow2

The error is:

---------------------------------------
  Updating   : selinux-policy-3.13.1-229.el7_6.6.noarch             33/392
  Updating   : selinux-policy-targeted-3.13.1-229.el7_6.6.noarch    34/392
warning: %post(selinux-policy-targeted-3.13.1-229.el7_6.6.noarch) scriptlet failed, signal 9
Non-fatal POSTIN scriptlet failure in rpm package selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
virt-customize: error: yum -y update: command exited with an error

If reporting bugs, run virt-customize with debugging enabled and include the complete output:

  virt-customize -v -x [...]
---------------------------------------

We also see these lines in audit.log:

----------------------------------------
type=AVC msg=audit(1543830076.355:1696): avc: denied { read } for pid=27422 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file permissive=0
type=AVC msg=audit(1543830079.407:1697): avc: denied { read } for pid=27607 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file permissive=0
type=AVC msg=audit(1543831459.775:3416): avc: denied { search } for pid=10398 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c135,c220 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831464.303:3442): avc: denied { search } for pid=10535 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c550,c710 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831468.370:3468): avc: denied { search } for pid=10664 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c617,c1000 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831512.549:3494): avc: denied { search } for pid=10785 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c45,c269 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831667.643:3520): avc: denied { search } for pid=11031 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c173,c282 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
-----------------------------------------------------------------

Version-Release number of selected component (if applicable):
12 (2018-11-28.1)

How reproducible:
100%

Steps to Reproduce:
1. Update packages in overcloud image with virt-customize
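To make the audit.log excerpt easier to scan, the denials can be grouped by command and denied permission. The following is only a minimal sketch: it works on three records copied verbatim from the excerpt above, and the temporary file and variable names are made up for illustration. On a real host the same pipeline would presumably be pointed at /var/log/audit/audit.log.

```shell
# Sample AVC records copied verbatim from the audit.log excerpt in this report.
audit_file=$(mktemp)
cat > "$audit_file" <<'EOF'
type=AVC msg=audit(1543830076.355:1696): avc: denied { read } for pid=27422 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file permissive=0
type=AVC msg=audit(1543831459.775:3416): avc: denied { search } for pid=10398 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c135,c220 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831464.303:3442): avc: denied { search } for pid=10535 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c550,c710 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
EOF

# Extract "<comm> <permission>" from each denial, then count identical pairs.
summary=$(sed -n 's/.*denied { \([a-z_]*\) } .* comm="\([^"]*\)" .*/\2 \1/p' "$audit_file" \
    | sort | uniq -c | sed 's/^ *//')
rm -f "$audit_file"

printf '%s\n' "$summary"
```

In this sample the qemu-kvm (svirt_t) "search" denials and the rabbitmq_t "read" denials come from different source contexts, which is worth keeping in mind when deciding which of them, if any, relate to the failing yum transaction.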
Asked the SELinux team to review this BZ. It might indicate an issue with SELinux and may be related to the recent SELinux issues (like BZ 1640528).
(In reply to Arie Bregman from comment #0)
> Description of problem:
>
> Suddenly all of our processes, which are using virt-customize to customize
> OpenStack images, are failing while running the following command:
>
> virt-customize --verbose --selinux-relabel --update -a overcloud-full.qcow2
>
> The error is:
>
> ---------------------------------------
> Updating : selinux-policy-3.13.1-229.el7_6.6.noarch
> 33/392
> Updating : selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
> 34/392
> warning: %post(selinux-policy-targeted-3.13.1-229.el7_6.6.noarch) scriptlet
> failed, signal 9

That "signal 9" is SIGKILL, which is "used to cause immediate program termination. It cannot be handled or ignored, and is therefore always fatal. It is also not possible to block this signal".

Some questions:

- What, precisely, is causing the SIGKILL here?
- Is this consistently reproducible?
- Can we get a botched disk image, so that we can re-run the `yum` transaction to see what is causing the failure?

> Non-fatal POSTIN scriptlet failure in rpm package
> selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
> virt-customize: error: yum -y update: command exited with an error

A side note: instead of `yum -y update`, you want `yum -y update || true`. That way, if a command is permitted to fail, the script will proceed further instead of aborting.

[...]
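For reference, the two points in this comment (what "signal 9" looks like from a calling shell, and what `|| true` does to it) can be sketched without touching the image at all. The child process below is a hypothetical stand-in for the failing scriptlet, not the real %post script, and the variable names are illustrative.

```shell
# Stand-in for the failing scriptlet: a child shell that sends SIGKILL to itself.
status=0
sh -c 'kill -9 $$' || status=$?
echo "raw exit status: $status"

# A shell reports a child killed by a signal as 128 + the signal number.
if [ "$status" -gt 128 ]; then
    signal=$((status - 128))
    echo "killed by signal $signal ($(kill -l "$signal"))"
fi

# With '|| true' the same failure is swallowed and the caller sees success.
sh -c 'kill -9 $$' || true
masked=$?
echo "exit status with '|| true': $masked"
```

Note that `|| true` only hides the non-zero status so the surrounding script continues; it does not address whatever sent the SIGKILL in the first place.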
You might also try replacing --update temporarily with a full command, e.g.:

  virt-customize [...] --run-command "yum -y -d 10 -v update" [...]

so you can see what's really going on with yum/rpm.
You can, if you like, replace the --update flag with:

  --run-command 'yum -y update --skip-broken'
I'm unable to reproduce the problem. Here are my three attempts (after some trial-and-error) at reproducers. All of them succeed.

Method-A1
---------

Get a fresh copy of 'overcloud-full.qcow2', and now try to run (notice the "--skip-broken"):

  $ time virt-customize -v -x --selinux-relabel \
      --run-command 'yum -y update --skip-broken -d 10 -v' \
      -a v4-overcloud-full.qcow2 \
      |& tee method-A1.txt

The `yum update` succeeds. The full log is in the attachment.

Method-A2
---------

Again, get a fresh copy of 'overcloud-full.qcow2', and now try to run (here we run *without* the "--skip-broken"):

  $ time virt-customize -v -x --selinux-relabel \
      --run-command 'yum -y update -d 10 -v' \
      -a v4-overcloud-full.qcow2 \
      |& tee method-A2.txt

Here, too, the `yum update` succeeds. The full log is in the attachment.

Method-B
--------

Run `yum` manually in a guest booted from the same OverCloud image, in the environment Michal gave me.

(1) On the UnderCloud environment, navigate to the "backup-overcloud" directory, and copy the QCOW2 file, 'vmlinuz' and 'initrd' into a directory called "v3":

  $ mkdir /home/stack/v3
  $ cd /home/stack/backup-overcloud/
  $ cp overcloud-full.qcow2 ~/v3/v3-overcloud-full.qcow2
  $ cp overcloud-full.vmlinuz overcloud-full.initrd ~/v3/

(2) Reset the root password to "empty" on the 'v3-overcloud-full.qcow2':

  $ cd /home/stack/v3/
  $ virt-edit -a v3-overcloud-full.qcow2 /etc/passwd -e 's/^root:.*?:/root::/'

(3) Then, import the 'v3-overcloud-full.qcow2' image (with the given 'kernel' and 'initrd') into libvirt:

  $ sudo virt-install --name v3-overcloud-full-with-kernel \
      --ram 2048 --disk path=./v3-overcloud-full.qcow2,format=qcow2 \
      --machine q35 --os-variant fedora27 --cpu host-passthrough \
      --nographics --network default \
      --boot kernel=`pwd`/overcloud-full.vmlinuz,initrd=`pwd`/overcloud-full.initrd,kernel_args="panic=1 console=ttyS0 root=/dev/vda selinux=0" \
      --import

(4) Then, SSH into the guest, and run:

  $ yum update -y -d 10 -v

The `yum update` succeeds. The full log is in the attachment.

* * *
Created attachment 1512538 [details] Log of method-A1 (`virt-customize` with "yum --skip-broken")
Created attachment 1512594 [details] Log of method-A2 (`virt-customize` without "yum --skip-broken")
Created attachment 1512595 [details] Log of method-B (interactive run of `yum update -y -d 10 -v`)
I've tried to apply the workaround of first running the update with --skip-broken and then without it; however, I am now seeing a different type of error:

Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package fence-agents-common-4.2.1-11.el7_6.1.x86_64
--
error: Couldn't fork %post(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
error: Couldn't fork %triggerin(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal <unknown> scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
  Updating   : cronie-anacron-1.4.11-20.el7_6.x86_64    38/286
error: Couldn't fork %post(cronie-anacron-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-anacron-1.4.11-20.el7_6.x86_64

This looks similar to RDO bug: https://bugs.launchpad.net/tripleo/+bug/1718965
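A quick way to gauge how widespread the memory failures are in a saved virt-customize log is to count the "Cannot allocate memory" lines and list the affected packages. This is a rough sketch only: the sample lines are copied from the output above, and the temporary file and variable names are hypothetical. If the count is non-zero, one thing that may be worth trying (an assumption, not something confirmed in this report) is giving the appliance more memory with virt-customize's -m/--memsize option.

```shell
# Sample lines copied from the failing yum output quoted in this comment.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
error: Couldn't fork %post(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
error: Couldn't fork %triggerin(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal <unknown> scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
error: Couldn't fork %post(cronie-anacron-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-anacron-1.4.11-20.el7_6.x86_64
EOF

# Number of lines reporting an allocation failure.
oom_count=$(grep -c 'Cannot allocate memory' "$log_file")

# Unique packages whose scriptlets failed, taken from the "Non-fatal ..." lines.
failed_pkgs=$(sed -n 's/^Non-fatal .* scriptlet failure in rpm package //p' "$log_file" | sort -u)
rm -f "$log_file"

echo "allocation failures: $oom_count"
printf 'affected packages:\n%s\n' "$failed_pkgs"
```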
This case involved running virt-customize in a CI environment using a method that is not recommended. It's a RHEL policy issue, not an openstack-selinux one. Given that it's CI, we can either run in permissive mode for that command or adjust to use the supported backend.