| Summary: | SELinux context problem on pre generated overcloud images | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Raoul Scarazzini <rscarazz> | |
| Component: | rhosp-director | Assignee: | Ben Nemec <bnemec> | |
| Status: | CLOSED NOTABUG | QA Contact: | yeylon <yeylon> | |
| Severity: | high | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 8.0 (Liberty) | CC: | ahirshbe, bnemec, hbrock, jslagle, mburns, michele, oblaut, rhel-osp-director-maint, roxenham, rscarazz, srevivo, ukalifon | |
| Target Milestone: | ga | Keywords: | Reopened, Triaged | |
| Target Release: | 8.0 (Liberty) | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1301050 (view as bug list) | Environment: | ||
| Last Closed: | 2016-03-22 07:46:06 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1301050 | |||
|
Description
Raoul Scarazzini
2015-12-23 15:50:34 UTC
This does not appear to be an image problem. I just deployed the 12-3 images (which is what the latest link in the OP takes me to) with the latest OSPd 8 puddle, and /etc/machine-id on the deployed nodes is fine:

    [root@overcloud-controller-0 etc]# ls -lZ machine-id
    -r--r--r--. root root unconfined_u:object_r:machineid_t:s0 machine-id

So I don't know what is causing this, but it isn't anything in the image build or I should be seeing the problem in my deployment too. What exactly is being done to enable instance HA? I would look to that at this point.

Not sure it's an image problem either. I just deployed an OSP7 environment with Packstack and had the same issue. Issuing a 'restorecon /etc/machine-id' allowed systemd-journal to start, and then I was able to start systemd-journald as well as run httpd/openstack-dashboard, which was originally failing as per https://bugzilla.redhat.com/show_bug.cgi?id=1300800

*** Bug 1301050 has been marked as a duplicate of this bug. ***

(In reply to Ben Nemec from comment #2)
[...]
> So I don't know what is causing this, but it isn't anything in the image
> build or I should be seeing the problem in my deployment too. What exactly
> is being done to enable instance ha? I would look to that at this point.

Enabling instance HA (as described at https://access.redhat.com/articles/1544823) could not affect the context of that file, since all the steps only change configuration. No new package is installed and no SELinux context is changed anywhere. The only thing introduced is fencing, which means that nodes can be rebooted, and if a node reboots with the wrong context you hit the problem.

As reported on the clone of this bug, this problem is also present in a Mitaka environment with images generated from scratch:

    [root@overcloud-controller-1 ~]# ls -lZ /etc/machine-id
    -rw-r--r--. root root system_u:object_r:unlabeled_t:s0 /etc/machine-id

Note that on this environment I wasn't able to complete an overcloud deploy (it fails for other reasons), but once the controllers and computes come up the problem is there. So I don't think it's caused by the overcloud deployment; the image already has this problem when it is built.

(In reply to Raoul Scarazzini from comment #6)
> As reported on the clone of this bug this problem is also present in a
> Mitaka environment with images generated from scratch:
>
> [root@overcloud-controller-1 ~]# ls -lZ /etc/machine-id
> -rw-r--r--. root root system_u:object_r:unlabeled_t:s0 /etc/machine-id
>
> Note that on this environment I wasn't able to complete an overcloud deploy
> (it fails for other reasons), but once the controllers and computes comes up
> the problem is there. So I don't think it's something due to overcloud
> deployment, the image born with this problem already.

I'm not seeing that in my upstream images. Is there any chance you could upload your exact overcloud-full.qcow2 somewhere so I can pull it down and try it myself? FWIW, when I mount a locally built upstream image I see the following:

    [bnemec@RedHat ~]$ ls -lZ /mnt/temp/etc/machine-id
    -r--r--r--. 1 root root unconfined_u:object_r:machineid_t:s0 33 Mar 31 2015 /mnt/temp/etc/machine-id

I see the same thing after the nodes are deployed, so I don't think anything bad is happening during deployment either.

Hey Ben, this is the image (CentOS, since it's taken from a Mitaka deployment) with the wrong /etc/machine-id context: http://file.rdu.redhat.com/rscarazz/overcloud-full.qcow2 I've tried this image locally by creating a VM (just for information: how can you mount a qcow image and view the SELinux context?) and the context is unlabeled_t, so it is wrong.

*** Bug 1305486 has been marked as a duplicate of this bug. ***

Okay, that's very strange. I can also see that the SELinux context is broken in that image. Where did you get this file?
Is it built locally or did you download it from somewhere? FWIW, when I mount the image this is what I see:

    [bnemec@RedHat ~]$ ls -lZ /mnt/temp/etc/machine-id
    -rw-r--r-- 1 root root ? 0 Feb 4 03:45 /mnt/temp/etc/machine-id

Which makes me think the SELinux context got dropped entirely when the image was built. I don't see that in either my current upstream images or the latest OSP 8 puddle images. I use a process like http://blog.loftninjas.org/2008/10/27/mounting-kvm-qcow2-qemu-disk-images/ to mount qcows.

Hi Ben, the image is generated on the fly after installing the undercloud, using the standard overcloud procedure:

    export NODE_DIST=centos7
    export USE_DELOREAN_TRUNK=1
    export DELOREAN_TRUNK_REPO="http://trunk.rdoproject.org/centos7/current-tripleo/"
    export DELOREAN_REPO_FILE="delorean.repo"
    openstack overcloud image build --all

Now it's also clear how you verified the context on the image. To be sure on my side, as I wrote, I created a VM with this image and saw the wrong context (unlabeled_t) on it. Let me add that before using the image I do one additional thing to it, resetting the root password:

    virt-sysprep --root-password password:redhat -a overcloud-full.qcow2

Do you think this could in some way affect that specific SELinux context?

    [ 68.4] Performing "machine-id" ...

Ding, ding, ding, we have a winner. :-) virt-sysprep wasn't a problem on my F23 laptop, but when I ran it from CentOS it broke the SELinux context exactly as described here. I'm guessing there is a bug in the CentOS version of virt-sysprep that causes this.

That's great, so do I need to create a bug against libguestfs-tools for this? This command caused problems on both CentOS and RHEL.

I would think so, yes. Like I said, the version on F23 is already fixed, so it's probably a question of backporting the fix to the EL7 version.

Hi Ben, here it is: https://bugzilla.redhat.com/show_bug.cgi?id=1308997 I think we can close this bug as not a bug, at least not a bug of the director.
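The check used in this thread (mount the image, then look at the label on etc/machine-id) can be sketched as a small script. This is only a sketch under stated assumptions: the function names are hypothetical, the image is assumed to be already mounted (for example at /mnt/temp, as in the comments above), and it relies on GNU stat's `%C` format to print the SELinux context.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any tool.

# Print the SELinux type (third colon-separated field) of a context string
# such as "system_u:object_r:unlabeled_t:s0".
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Succeed only if the context carries the type seen on the healthy images
# in this thread (machineid_t).
machine_id_label_ok() {
    [ "$(selinux_type "$1")" = "machineid_t" ]
}

# Check etc/machine-id under an already mounted image root.
# Assumption: the mount point defaults to /mnt/temp, as used in the thread.
check_image_root() {
    root=${1:-/mnt/temp}
    # GNU stat prints the SELinux context with %C; fall back to "?" on error.
    ctx=$(stat -c %C "$root/etc/machine-id" 2>/dev/null) || ctx="?"
    if machine_id_label_ok "$ctx"; then
        echo "OK: $ctx"
    else
        echo "BAD: $ctx (expected type machineid_t)"
        return 1
    fi
}
```

With the image mounted, `check_image_root /mnt/temp` would report the broken image from this bug as BAD, since its context is either missing or unlabeled_t.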
I just encountered this bug on RHEL-OSP director 8.0, puddle 2016-03-11.1. The deployment failed at the end.

    [root@overcloud-controller-2 ~]# cd /var/log
    [root@overcloud-controller-2 log]# less messages
    Mar 16 09:38:15 localhost rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="712" x-info="http://www.rsyslog.com"] start
    Mar 16 09:38:15 localhost rsyslogd-2307: warning: ~ action is deprecated, consider using the 'stop' statement instead [try http://www.rsyslog.com/e/2307 ]
    Mar 16 09:38:15 localhost rsyslogd-2307: warning: ~ action is deprecated, consider using the 'stop' statement instead [try http://www.rsyslog.com/e/2307 ]
    [root@overcloud-controller-2 log]# ls -ltrZ /etc/machine-id
    -rw-r--r--. root root system_u:object_r:unlabeled_t:s0 /etc/machine-id
    [root@overcloud-controller-2 log]# restorecon -v /etc/machine-id
    restorecon reset /etc/machine-id context system_u:object_r:unlabeled_t:s0->system_u:object_r:machineid_t:s0
    [root@overcloud-controller-2 log]#

Asaf, are you using virt-sysprep on the overcloud images for some reason? If so, you need to use virt-customize instead. See https://bugzilla.redhat.com/show_bug.cgi?id=1308997 for the full explanation.

Raoul, yes:

    virt-sysprep --root-password password:12345678 -a overcloud-full.qcow2
    virt-customize -a overcloud-full.qcow2 --run-command "rpm -ivh http://rhos-release.virt.bos.redhat.com/repos/rhos-release/rhos-release-latest.noarch.rpm"
    virt-customize -a overcloud-full.qcow2 --run-command "rhos-release 8-director"

(*** This sets the target node repo files, so you can choose which version/puddle to work with; it should probably be the same as the repo version you set for the undercloud.)

OK, as explained here [1], you need to use virt-customize for changing the password as well or, if you still need virt-sysprep for some reason, you need to pass --selinux-relabel to the command. Then your context will be fine. I'm closing this bug again since... it's not a bug.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1308997 |
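
The workaround from the closing comment (set the root password with virt-customize rather than virt-sysprep, and relabel) could be wrapped roughly as follows. This is a sketch, not a verified procedure: `set_root_password` and the `DRY_RUN` variable are hypothetical names, and combining `--root-password` with `--selinux-relabel` on virt-customize is an assumption to confirm against `virt-customize --help` for your libguestfs version.

```shell
#!/bin/sh
# Sketch only: set_root_password and DRY_RUN are hypothetical names.

# Set the root password on an image with virt-customize and ask libguestfs
# to relabel the guest so files such as /etc/machine-id keep a valid context.
# Usage: set_root_password IMAGE PASSWORD
set_root_password() {
    image=$1
    password=$2
    # Build the command as positional parameters so arguments stay quoted.
    set -- virt-customize -a "$image" \
        --root-password "password:$password" \
        --selinux-relabel
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command; useful for review before touching an image.
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1; set_root_password overcloud-full.qcow2 redhat` prints the command without modifying the image. On a node that was already deployed from a broken image, the thread shows `restorecon -v /etc/machine-id` repairing the label in place.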