Bug 1655542

Summary:

Unable to execute virt-customize

Product:

Red Hat OpenStack

Reporter:

Arie Bregman <abregman>

Component:

openstack-selinux

Assignee:

Zoli Caplovic <zcaplovi>

Status:

CLOSED WONTFIX

QA Contact:

Jon Schlueter <jschluet>

Severity:

urgent

Docs Contact:

Priority:

urgent

Version:

12.0 (Pike)

CC:

abregman, berrange, dsariel, jschluet, kchamart, lhh, lvrabec, mburns, mgrepl, mpryc, ptoscano, rjones

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2019-01-14 21:29:59 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Log of method-A1 (`virt-customize` without "yum --skip-broken")	none
Log of method-A2 (`virt-customize` with "yum --skip-broken")	none
Log of method-B (interactive run of `yum update -y -d 10 -v`)	none

Description Arie Bregman 2018-12-03 12:11:49 UTC

Description of problem:

Suddenly, all of our processes that use virt-customize to customize OpenStack images are failing when running the following command:

virt-customize --verbose --selinux-relabel --update -a overcloud-full.qcow2

The error is:

---------------------------------------
  Updating   : selinux-policy-3.13.1-229.el7_6.6.noarch                  33/392 
  Updating   : selinux-policy-targeted-3.13.1-229.el7_6.6.noarch         34/392 
warning: %post(selinux-policy-targeted-3.13.1-229.el7_6.6.noarch) scriptlet failed, signal 9
Non-fatal POSTIN scriptlet failure in rpm package selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
virt-customize: error: yum -y update: command exited with an error

If reporting bugs, run virt-customize with debugging enabled and include 
the complete output:

  virt-customize -v -x [...]

---------------------------------------

We also see these lines in audit.log:

----------------------------------------
type=AVC msg=audit(1543830076.355:1696): avc:  denied  { read } for  pid=27422 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file permissive=0
type=AVC msg=audit(1543830079.407:1697): avc:  denied  { read } for  pid=27607 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file permissive=0
type=AVC msg=audit(1543831459.775:3416): avc:  denied  { search } for  pid=10398 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c135,c220 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831464.303:3442): avc:  denied  { search } for  pid=10535 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c550,c710 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831468.370:3468): avc:  denied  { search } for  pid=10664 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c617,c1000 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831512.549:3494): avc:  denied  { search } for  pid=10785 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c45,c269 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831667.643:3520): avc:  denied  { search } for  pid=11031 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c173,c282 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
-----------------------------------------------------------------
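
To get a quick overview of which domains are being denied, AVC records like the ones above can be grouped by command, source type and permission. This is only a minimal sketch: it uses three of the records quoted above as sample input, and the helper names `avc_sample` and `summarize_avc` are made up for illustration. On a real host the input would presumably be the audit log itself (commonly /var/log/audit/audit.log, which is an assumption about the setup).

```shell
#!/bin/sh
# Sample input: three AVC records copied from this report.
avc_sample() {
cat <<'EOF'
type=AVC msg=audit(1543830076.355:1696): avc:  denied  { read } for  pid=27422 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file permissive=0
type=AVC msg=audit(1543831459.775:3416): avc:  denied  { search } for  pid=10398 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c135,c220 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
type=AVC msg=audit(1543831464.303:3442): avc:  denied  { search } for  pid=10535 comm="qemu-kvm" name="10091" dev="proc" ino=241448 scontext=unconfined_u:unconfined_r:svirt_t:s0:c550,c710 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0
EOF
}

# Hypothetical helper: print "count comm source_type permission",
# most frequent first. Reads AVC records on stdin.
summarize_avc() {
    sed -n 's/.*denied  *{ \([a-z_]*\) }.* comm="\([^"]*\)".* scontext=[^:]*:[^:]*:\([^:]*\):.*/\2 \3 \1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2, $3, $4}'
}

avc_summary=$(avc_sample | summarize_avc)
printf '%s\n' "$avc_summary"
```

The MCS categories (for example c135,c220) are deliberately dropped, so that all qemu-kvm denials from different appliance runs collapse into one line.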

Version-Release number of selected component (if applicable): 12 (2018-11-28.1)


How reproducible: 100%


Steps to Reproduce:
1. Update packages in overcloud image with virt-customize

Comment 2 Zoli Caplovic 2018-12-03 16:50:00 UTC

Asked the SELinux team to review this BZ. It might indicate an issue with SELinux and may be related to the recent SELinux issues (like BZ 1640528).

Comment 7 Kashyap Chamarthy 2018-12-06 14:46:43 UTC

(In reply to Arie Bregman from comment #0)
> Description of problem:
> 
> Suddenly all of our processes, which are using virt-customize to customize
> OpenStack images, are failing while running the following command:
> 
> virt-customize --verbose --selinux-relabel --update -a overcloud-full.qcow2
> 
> The error is:
> 
> ---------------------------------------
>   Updating   : selinux-policy-3.13.1-229.el7_6.6.noarch                 
> 33/392 
>   Updating   : selinux-policy-targeted-3.13.1-229.el7_6.6.noarch        
> 34/392 
> warning: %post(selinux-policy-targeted-3.13.1-229.el7_6.6.noarch) scriptlet
> failed, signal 9

That "signal 9" is SIGKILL, which is "used to cause immediate program termination. It cannot be handled or ignored, and is therefore always fatal. It is also not possible to block this signal". 

Some questions:

  - What, precisely, is causing the SIGKILL here?

  - Is this consistently reproducible?

  - Can we get a botched disk image, so that we can re-run 
    the `yum` transaction to see what is causing the failure?
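
As a small illustration of what "signal 9" looks like from the calling side: when a child process is killed by SIGKILL, the shell reports an exit status of 128 + 9 = 137. This is generic shell behaviour, shown here with a throwaway child process; it says nothing about what sent the signal in the failing run.

```shell
#!/bin/sh
# Start a child shell that kills itself with SIGKILL (signal 9),
# and capture its exit status without aborting this script.
status=0
sh -c 'kill -9 $$' || status=$?

# For a child terminated by a signal, the shell reports 128 + signal number.
signum=$((status - 128))
echo "exit status: $status, signal number: $signum, name: $(kill -l "$signum")"
```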

> Non-fatal POSTIN scriptlet failure in rpm package
> selinux-policy-targeted-3.13.1-229.el7_6.6.noarch
> virt-customize: error: yum -y update: command exited with an error

A side note: instead of `yum -y update`, you may want `yum -y update || true`. That way, if a command is permitted to fail, the script will proceed instead of aborting.
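
A minimal sketch of the `|| true` idiom, using a hypothetical `flaky_update` function as a stand-in for `yum -y update` so that it can run anywhere. With virt-customize the same idea would presumably be written as `--run-command 'yum -y update || true'`; that form is an assumption based on the `--run-command` usage elsewhere in this bug and was not tested here.

```shell
#!/bin/sh
# Hypothetical stand-in for `yum -y update` that always fails.
flaky_update() {
    echo "update: scriptlet failed" >&2
    return 1
}

# Without the guard, the failure status reaches the caller.
plain_status=0
flaky_update || plain_status=$?

# With `|| true`, the compound command always exits 0.
guarded_status=0
{ flaky_update || true; } || guarded_status=$?

echo "plain=$plain_status guarded=$guarded_status"
```

Note that the guard also hides genuine failures, so the log still needs to be checked afterwards.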

[...]

Comment 8 Richard W.M. Jones 2018-12-06 14:57:38 UTC

You might also try replacing --update temporarily with a full command, eg:

  virt-customize [...] --run-command "yum -y -d 10 -v update" [...]

so you can see what's really going on with yum/rpm.

Comment 12 Richard W.M. Jones 2018-12-06 22:40:46 UTC

You can if you like replace the --update flag with:

  --run-command 'yum -y update --skip-broken'

Comment 13 Kashyap Chamarthy 2018-12-07 13:55:39 UTC

I'm unable to reproduce the problem.  Here are my three attempts (after
some trial-and-error) at reproducers.  All of them succeed.

Method-A1
---------

Get a fresh copy of 'overcloud-full.qcow2', and now try to run (notice
the "--skip-broken"):

    $ time virt-customize -v -x --selinux-relabel \
        --run-command 'yum -y update --skip-broken -d 10 -v' \
        -a v4-overcloud-full.qcow2 \
        |& tee method-A1.txt

The `yum update` succeeds.  The full log is in the attachment.


Method-A2
---------

Again, get a fresh copy of 'overcloud-full.qcow2', and now try to run
(here we run *without* the "--skip-broken"):


    $ time virt-customize -v -x --selinux-relabel \
        --run-command 'yum -y update -d 10 -v' \
        -a v4-overcloud-full.qcow2 \
        |& tee method-A2.txt

Here, too, the `yum update` succeeds.  The full log is in the
attachment.


Method-B
--------

Trying to run `yum` manually from the same OverCloud image in the
environment Michal gave me.

(1) On the UnderCloud environment, navigate to the "backup-overcloud"
    directory, and copy the QCOW2 file, 'vmlinuz' and 'initrd' into a
    directory called "v3"

    $ mkdir /home/stack/v3
    $ cd /home/stack/backup-overcloud/
    $ cp overcloud-full.qcow2 ~/v3/v3-overcloud-full.qcow2
    $ cp overcloud-full.vmlinuz overcloud-full.initrd ~/v3/

(2) Reset the root password to "empty" on the 'v3-overcloud-full.qcow2'

    $ cd /home/stack/v3/
    $ virt-edit -a v3-overcloud-full.qcow2 /etc/passwd -e 's/^root:.*?:/root::/'

(3) Then, import the 'v3-overcloud-full.qcow2' image (with the given
    'kernel' and 'initrd') into libvirt:

    $ sudo virt-install --name v3-overcloud-full-with-kernel \
    --ram 2048 --disk path=./v3-overcloud-full.qcow2,format=qcow2  \
    --machine q35 --os-variant fedora27  --cpu host-passthrough \
    --nographics --network default  \
    --boot kernel=`pwd`/overcloud-full.vmlinuz,initrd=`pwd`/overcloud-full.initrd,kernel_args="panic=1 console=ttyS0  root=/dev/vda selinux=0" \
    --import 

(4) Then, SSH into the guest, and run:

    $ yum update -y -d 10 -v


The `yum update` succeeds.  The full log is in the attachment.

        * * *

Comment 14 Kashyap Chamarthy 2018-12-07 14:00:33 UTC

Created attachment 1512538
Log of method-A1  (`virt-customize` without "yum --skip-broken")

Comment 15 Kashyap Chamarthy 2018-12-07 15:05:44 UTC

Created attachment 1512594
Log of method-A2 (`virt-customize` with "yum --skip-broken")

Comment 16 Kashyap Chamarthy 2018-12-07 15:07:16 UTC

Created attachment 1512595
Log of method-B (interactive run of `yum update  -y -d 10 -v`)

Comment 17 Michal Pryc 2018-12-11 13:35:33 UTC

I tried to apply the workaround of first performing the update with --skip-broken and then without it; however, I am now seeing a different type of error:

Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package fence-agents-common-4.2.1-11.el7_6.1.x86_64

-- 
error: Couldn't fork %post(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
error: Couldn't fork %triggerin(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal <unknown> scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
  Updating   : cronie-anacron-1.4.11-20.el7_6.x86_64                     38/286 
error: Couldn't fork %post(cronie-anacron-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-anacron-1.4.11-20.el7_6.x86_64


This looks similar to RDO bug:
  https://bugs.launchpad.net/tripleo/+bug/1718965
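
To see at a glance which packages were affected in output like the above, the scriptlet failures can be pulled out with standard tools. A minimal sketch, using lines quoted in this comment as sample input; on a real run the input would be the saved virt-customize output (for example a file written with `tee`, as in comment 13). The helper name `log_sample` is made up for illustration.

```shell
#!/bin/sh
# Sample input: log lines copied from this comment.
log_sample() {
cat <<'EOF'
Non-fatal POSTIN scriptlet failure in rpm package fence-agents-common-4.2.1-11.el7_6.1.x86_64
error: Couldn't fork %post(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
error: Couldn't fork %triggerin(cronie-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal <unknown> scriptlet failure in rpm package cronie-1.4.11-20.el7_6.x86_64
  Updating   : cronie-anacron-1.4.11-20.el7_6.x86_64                     38/286
error: Couldn't fork %post(cronie-anacron-1.4.11-20.el7_6.x86_64): Cannot allocate memory
Non-fatal POSTIN scriptlet failure in rpm package cronie-anacron-1.4.11-20.el7_6.x86_64
EOF
}

# Unique package names that had any non-fatal scriptlet failure.
failed_packages=$(log_sample \
    | sed -n 's/^Non-fatal .* scriptlet failure in rpm package \(.*\)$/\1/p' \
    | sort -u)

# Number of lines reporting an out-of-memory condition.
enomem_count=$(log_sample | grep -c 'Cannot allocate memory')

printf 'packages with scriptlet failures:\n%s\n' "$failed_packages"
echo "lines mentioning 'Cannot allocate memory': $enomem_count"
```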

Comment 26 Mike Burns 2019-01-14 21:29:59 UTC

This case was a virt-customize run in a CI environment, using a method that is not recommended.  It's a RHEL policy issue, not an openstack-selinux one.

Given that it's CI, we can either run in permissive mode for that command, or adjust to use the supported backend.
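
A rough sketch of the first option: switch SELinux to permissive only for the duration of one command and restore the previous mode afterwards. It assumes the standard `getenforce` and `setenforce` utilities and root privileges. The helper name `run_permissive` is made up for illustration, and whether this is acceptable on a given CI host is a policy decision.

```shell
#!/bin/sh
# Hypothetical helper: run_permissive CMD [ARGS...]
# Runs CMD with SELinux in permissive mode, then restores enforcing mode
# if that was the mode before. Returns the exit status of CMD.
run_permissive() {
    previous_mode=$(getenforce)
    if [ "$previous_mode" = "Enforcing" ]; then
        setenforce 0
    fi

    rc=0
    "$@" || rc=$?

    if [ "$previous_mode" = "Enforcing" ]; then
        setenforce 1
    fi
    return "$rc"
}

# Example invocation (not executed here), using the command from this report:
#   run_permissive virt-customize --verbose --selinux-relabel --update -a overcloud-full.qcow2
```

The mode is restored even when the wrapped command fails, and the command's own exit status is passed through so that CI still sees the failure.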