Description of problem:
I tried to deploy a minimal (1 controller + 1 compute) OSP12 overcloud with containerized OpenStack services and the deployment failed.
http://pastebin.test.redhat.com/481605

/bin/bash: /var/lib/docker-puppet/docker-puppet-gnocchi.sh: Permission denied
Failed running docker-puppet.py for gnocchi
Removing container: docker-puppet-gnocchi
docker-puppet-gnocchi

Version-Release number of selected component (if applicable):
OSP12

How reproducible:
always

Steps to Reproduce:
1. Execute steps from http://etherpad.corp.redhat.com/testing-osp12-containers
2.
3.

Actual results:
Heat Stack create failed.

Expected results:
Heat Stack created

Additional info:
http://pastebin.test.redhat.com/481605
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
We found it is caused by SELinux being enabled on the docker hosts:

[heat-admin@overcloud-controller-0 ~]$ sudo getenforce
Enforcing
[heat-admin@overcloud-controller-0 ~]$ sudo /usr/bin/docker run --user root --name test --rm -ti --volume /tmp/:/tmp/:rw centos bash
[root@5d0ccad4557a /]# ls /tmp/
ls: cannot open directory /tmp/: Permission denied

[heat-admin@overcloud-controller-0 ~]$ sudo setenforce permissive
[heat-admin@overcloud-controller-0 ~]$ sudo getenforce
Permissive
[heat-admin@overcloud-controller-0 ~]$ sudo /usr/bin/docker run --user root --name test --rm -ti --volume /tmp/:/tmp/:rw centos bash
[root@aeb001d9df2e /]# ls /tmp/
builder.log
systemd-private-d16d448352f541c5b3222efa04f443d6-ntpd.service-wM9ape
systemd-private-d16d448352f541c5b3222efa04f443d6-systemd-machined.service-l66NlN
tmp.DL5p9bjVsn
tmpM8xnqx
(In reply to Martin André from comment #2)
> We found it is caused by SElinux being enabled on the docker hosts:
>
> [heat-admin@overcloud-controller-0 ~]$ sudo getenforce
> Enforcing
> [heat-admin@overcloud-controller-0 ~]$ sudo /usr/bin/docker run --user root
> --name test --rm -ti --volume /tmp/:/tmp/:rw centos bash
>
> [root@5d0ccad4557a /]# ls /tmp/
> ls: cannot open directory /tmp/: Permission denied
>
> [heat-admin@overcloud-controller-0 ~]$ sudo setenforce permissive
> [heat-admin@overcloud-controller-0 ~]$ sudo getenforce
>
> Permissive
> [heat-admin@overcloud-controller-0 ~]$ sudo /usr/bin/docker run --user root
> --name test --rm -ti --volume /tmp/:/tmp/:rw centos bash
> [root@aeb001d9df2e /]# ls /tmp/
> builder.log
> systemd-private-d16d448352f541c5b3222efa04f443d6-ntpd.service-wM9ape
> systemd-private-d16d448352f541c5b3222efa04f443d6-systemd-machined.service-
> l66NlN tmp.DL5p9bjVsn tmpM8xnqx

Hi Martin, what are the next steps for this bug? It blocks the QE team from further deployments. Thx
(In reply to Martin André from comment #2)
> We found it is caused by SElinux being enabled on the docker hosts:
>
> [heat-admin@overcloud-controller-0 ~]$ sudo getenforce
> Enforcing
> [heat-admin@overcloud-controller-0 ~]$ sudo /usr/bin/docker run --user root
> --name test --rm -ti --volume /tmp/:/tmp/:rw centos bash
>
> [root@5d0ccad4557a /]# ls /tmp/
> ls: cannot open directory /tmp/: Permission denied
>
> [heat-admin@overcloud-controller-0 ~]$ sudo setenforce permissive
> [heat-admin@overcloud-controller-0 ~]$ sudo getenforce
>
> Permissive
> [heat-admin@overcloud-controller-0 ~]$ sudo /usr/bin/docker run --user root
> --name test --rm -ti --volume /tmp/:/tmp/:rw centos bash
> [root@aeb001d9df2e /]# ls /tmp/
> builder.log
> systemd-private-d16d448352f541c5b3222efa04f443d6-ntpd.service-wM9ape
> systemd-private-d16d448352f541c5b3222efa04f443d6-systemd-machined.service-
> l66NlN tmp.DL5p9bjVsn tmpM8xnqx

Can you grab the audit.log from the container after you set SELinux to permissive and attach it to this bz, please?
Created attachment 1277630 [details] audit.log

Audit logs from after setting enforce to permissive. Please note I've run the docker-puppet.py script from the home directory of the heat-admin user, while it is usually run by heat.
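For anyone triaging the attached audit.log, a small helper along these lines can pull out the AVC denials. This is only a sketch: the function name parse_avc_denials and the SAMPLE record are hypothetical, and the sample is illustrative rather than copied from the attachment.

```python
import re

# Matches the key fields of an SELinux AVC denial record in audit.log.
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}'
    r'.*?\bcomm="(?P<comm>[^"]*)"'
    r'.*?\bscontext=(?P<scontext>\S+)'
    r'.*?\btcontext=(?P<tcontext>\S+)'
    r'.*?\btclass=(?P<tclass>\S+)'
)


def parse_avc_denials(lines):
    """Return one dict per AVC denial found in the given audit.log lines."""
    denials = []
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # not an AVC denial record
        record = match.groupdict()
        record["perms"] = record["perms"].split()
        denials.append(record)
    return denials


# Illustrative record only; NOT taken from attachment 1277630.
SAMPLE = (
    'type=AVC msg=audit(1496135503.141:512): avc:  denied  { read } for  '
    'pid=4242 comm="ls" name="/" dev="dm-0" ino=64 '
    'scontext=system_u:system_r:svirt_lxc_net_t:s0:c10,c20 '
    'tcontext=system_u:object_r:tmp_t:s0 tclass=dir'
)

if __name__ == "__main__":
    for denial in parse_avc_denials([SAMPLE]):
        print(denial)
```

Grouping the resulting records by tcontext and tclass should show which host paths the container domain is being denied access to.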
What RPM does /var/lib/docker-puppet come from (rpm -qf /var/lib/docker-puppet) ?
tripleo-heat-templates. Note also that docker-puppet.py creates scripts that configure the system, then presumably executes them. Once the deployment is complete, is the docker-puppet.py script removed, or does it remain?
https://github.com/openstack/tripleo-heat-templates/blob/master/docker/docker-puppet.py#L158
http://logs.openstack.org/26/460126/4/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/a8a8110/logs/oooq/overcloud-controller-0/var/log/audit/audit.log.txt.gz
With the workaround provided by Martin - http://paste.openstack.org/show/609370/ - the deployment failed in another place - https://bugzilla.redhat.com/show_bug.cgi?id=1450370#c0
I cannot reproduce this behavior with CentOS 7.3.1611, where I get a successful deployment even when setting SELinux to enforcing. I do get SELinux-related issues later on when trying to use the provisioned overcloud, though. Can it be that we're missing a package on RHEL? For the sake of completeness, here is some more info about the environment: http://paste.openstack.org/show/610892/
Reproduced today on --image-url http://download-node-02.eng.bos.redhat.com/brewroot/packages/rhel-guest-image/7.4/135/images/rhel-guest-image-7.4-135.x86_64.qcow2

overcloud.AllNodesDeploySteps.ControllerDockerConfigJsonStartupDataDeployment:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: ed74ac22-5de2-43ba-a2ab-d25301861983
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.ControllerHostPrepDeployment:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: 1145db4d-6189-415a-a318-11177d35451f
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.ComputeGenerateConfigDeployment.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 8117ac87-e268-4fdd-9e90-926be6ca6562
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    ...
    2017-05-30 09:11:42,727 DEBUG: NET_HOST enabled
    2017-05-30 09:11:42,727 DEBUG: Running docker command: /usr/bin/docker run --user root --name docker-puppet-nova_libvirt --env PUPPET_TAGS=file,file_line,concat,augeas,nova_config,nova_paste_api_ini,nova_config --env NAME=nova_libvirt --env HOSTNAME=overcloud-compute-0 --env NO_ARCHIVE= --env STEP=6 --volume /tmp/tmp7HMxuA:/etc/config.pp:ro --volume /etc/puppet/:/tmp/puppet-etc/:ro --volume /usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro --volume /var/lib/config-data/:/var/lib/config-data/:rw --volume tripleo_logs:/var/log/tripleo/ --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume /var/lib/docker-puppet/docker-puppet.sh:/var/lib/docker-puppet/docker-puppet.sh:rw --entrypoint /var/lib/docker-puppet/docker-puppet.sh --net host --volume /etc/hosts:/etc/hosts:ro 192.168.24.1:8787/rhosp12/openstack-nova-compute-docker:2017-05-16.6
    2017-05-30 09:11:43,141 DEBUG: /bin/bash: /var/lib/docker-puppet/docker-puppet.sh: Permission denied
    2017-05-30 09:11:43,141 ERROR: Failed running docker-puppet.py for nova_libvirt
    2017-05-30 09:11:43,141 INFO: Removing container: docker-puppet-nova_libvirt
    2017-05-30 09:11:43,722 DEBUG: docker-puppet-nova_libvirt
    2017-05-30 09:11:43,723 ERROR: ERROR configuring neutron
    2017-05-30 09:11:43,723 ERROR: ERROR configuring nova_libvirt
    (truncated, view all with --long)
  deploy_stderr: |

Heat Stack create failed.
Heat Stack create failed.
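When comparing several failed runs, it may help to reduce deploy_stdout to just the failing pieces. The following is a rough sketch, assuming the log line formats shown in the output above; summarize_failures and the three regex names are hypothetical helpers, not part of docker-puppet.py.

```python
import re

# Patterns based on the log lines quoted in this comment.
FAILED_RE = re.compile(r"ERROR: Failed running docker-puppet\.py for (\S+)")
DENIED_RE = re.compile(r"DEBUG: /bin/bash: (\S+): Permission denied")
CONFIG_RE = re.compile(r"ERROR: ERROR configuring (\S+)")


def summarize_failures(deploy_stdout):
    """Collect failed config names and denied script paths from deploy_stdout."""
    return {
        "failed_configs": FAILED_RE.findall(deploy_stdout),
        "denied_paths": DENIED_RE.findall(deploy_stdout),
        "configuring_errors": CONFIG_RE.findall(deploy_stdout),
    }


# Lines copied from the deploy_stdout in this comment.
SAMPLE_STDOUT = """\
2017-05-30 09:11:43,141 DEBUG: /bin/bash: /var/lib/docker-puppet/docker-puppet.sh: Permission denied
2017-05-30 09:11:43,141 ERROR: Failed running docker-puppet.py for nova_libvirt
2017-05-30 09:11:43,723 ERROR: ERROR configuring neutron
2017-05-30 09:11:43,723 ERROR: ERROR configuring nova_libvirt
"""

if __name__ == "__main__":
    print(summarize_failures(SAMPLE_STDOUT))
```

A "Permission denied" on the entrypoint script itself, rather than on something the script does, points at the bind mount rather than at puppet.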
Adding the content of fpaste from comment #10 to ensure it's available:

diff --git i/docker/firstboot/setup_docker_host.sh w/docker/firstboot/setup_docker_host.sh
index 8b4c6a0..7a7bc3b 100755
--- i/docker/firstboot/setup_docker_host.sh
+++ w/docker/firstboot/setup_docker_host.sh
@@ -6,3 +6,5 @@ set -eux
 # Disable libvirtd since it conflicts with nova_libvirt container
 /usr/bin/systemctl disable libvirtd.service
 /usr/bin/systemctl stop libvirtd.service
+
+setenforce permissive
diff --git i/environments/docker.yaml w/environments/docker.yaml
index 0c6028d..a073565 100644
--- i/environments/docker.yaml
+++ w/environments/docker.yaml
@@ -4,7 +4,7 @@ resource_registry:
   # OS::TripleO::NodeUserData: ../docker/firstboot/setup_docker_host.yaml
   OS::TripleO::Services::Docker: ../puppet/services/docker.yaml
   # The compute node still needs extra initialization steps
-  OS::TripleO::Compute::NodeUserData: ../docker/firstboot/setup_docker_host.yaml
+  OS::TripleO::NodeUserData: ../docker/firstboot/setup_docker_host.yaml
   #NOTE (dprince) add roles to be docker enabled as we support them
   OS::TripleO::Services::NovaLibvirt: ../docker/services/nova-libvirt.yaml
Lon, can you take a look and see whether this is an SELinux issue, and whether there is enough info here for you to address it?
I would suspect that the function that calls docker:

https://github.com/openstack/tripleo-heat-templates/blob/master/docker/docker-puppet.py

... needs this:

--security-opt="label=disable" : Turn off label confinement for the container

... from:

https://docs.docker.com/engine/reference/run/

But, only during the initial docker run to configure things. After that, I would hope it works.
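To make the suggestion concrete, here is a minimal sketch of how a docker run argument list could carry that option. It is not the actual docker-puppet.py code: build_docker_run and its parameters are hypothetical, and the example values are taken from the failing command logged earlier in this bug.

```python
def build_docker_run(image, name, entrypoint, volumes,
                     disable_selinux_labels=True):
    """Build an argv list for 'docker run' (hypothetical helper)."""
    cmd = ["/usr/bin/docker", "run", "--user", "root", "--name", name]
    if disable_selinux_labels:
        # Turn off SELinux label confinement for this one container run.
        cmd += ["--security-opt", "label=disable"]
    for volume in volumes:
        cmd += ["--volume", volume]
    cmd += ["--entrypoint", entrypoint, image]
    return cmd


if __name__ == "__main__":
    script = "/var/lib/docker-puppet/docker-puppet.sh"
    argv = build_docker_run(
        image="192.168.24.1:8787/rhosp12/openstack-nova-compute-docker:2017-05-16.6",
        name="docker-puppet-nova_libvirt",
        entrypoint=script,
        volumes=[script + ":" + script + ":rw"],
    )
    print(" ".join(argv))
```

Keeping the option behind a flag would limit the unconfined run to the configuration step, so long-running service containers could still be started with labels enabled.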
Basically, Dan said that using that option causes things to run unconfined during that docker run. My initial assumption is that we'd want to disable SELinux while doing the deployment due to the fact that we were generating config scripts and then configuring containers using them. I assume we'd also want to run 'restorecon -Rv /' inside the container at the end of the generated script to ensure anything we did during that Docker run leaves things with the right label after we were done.
Created attachment 1292059 [details] Test patch
Artem - can you please check the test patch by Lon?
Created attachment 1296154 [details] openstack stack failures list --long after deployment with patch
Lon, the provided patch fails because restorecon is not available in the containers. Do you think it makes sense to add it to the containers? Artem, can you retest the patch without the restorecon part?
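One possible alternative to dropping the relabel entirely would be to have the generated script skip it when the tool is missing. The sketch below is an assumption about how that could look, not what the test patch does; append_relabel_step and RESTORECON_GUARD are hypothetical names.

```python
# Shell fragment appended to a generated script (hypothetical).
RESTORECON_GUARD = """\
# Relabel only when restorecon exists in this image.
if command -v restorecon >/dev/null 2>&1; then
    restorecon -Rv /
else
    echo "restorecon not found; skipping relabel" >&2
fi
"""


def append_relabel_step(script_text):
    """Return script_text with a guarded 'restorecon -Rv /' step at the end."""
    if not script_text.endswith("\n"):
        script_text += "\n"
    return script_text + RESTORECON_GUARD


if __name__ == "__main__":
    print(append_relabel_step("#!/bin/bash\nset -e"))
```

Because the check sits in an if condition, a missing restorecon would not abort a script running under set -e.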
The stack deployed with the patch once the restorecon part was removed: http://pastebin.test.redhat.com/502951
If you want SELinux operation in containers and do post-build tasks that include software configuration and deployment, I would suspect the things that fix file contexts and open ports: semanage restorecon ... need to exist or at least be available. I'd think some base container layer that we depend on should have it.
Hmm, or we're doing it wrong. More research required. Stay tuned.
Yurii - Can you confirm that you had a successful OSP12 overcloud deployment with SELinux enabled? The bug is still in "New" status, so we need to know if we should attempt a re-test.
(In reply to Omri Hochman from comment #25)
> Yurii - Can you confirm that you had a successful osp12 overcloud
> deployment with SElinux Enabled ?
>
> bug is Still "New" status , so we need to know if we should attempt a
> re-test ..

Artem: can you please recheck whether this bug is still relevant?
sure, I will check it
So, I successfully deployed the overcloud with SELinux enabled on the overcloud nodes, without any workaround:

[heat-admin@overcloud-compute-0 ~]$ sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28
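For a re-test across many nodes, the sestatus check above could be automated roughly as follows. This is a sketch only: parse_sestatus and is_enforcing are hypothetical helper names, and the sample text is the output pasted in this comment.

```python
def parse_sestatus(output):
    """Parse 'sestatus' output into a dict of field name -> value."""
    status = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a 'key: value' shape
            status[key.strip()] = value.strip()
    return status


def is_enforcing(output):
    """True when SELinux is enabled and the current mode is enforcing."""
    status = parse_sestatus(output)
    return (status.get("SELinux status") == "enabled"
            and status.get("Current mode") == "enforcing")


# Output as reported from overcloud-compute-0 in this comment.
SAMPLE = """\
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28
"""

if __name__ == "__main__":
    print(is_enforcing(SAMPLE))
```

Checking "Current mode" rather than "Mode from config file" matters here, since the earlier workaround used setenforce, which changes only the running mode.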
Deployment with selinux enabled completes successfully. Subsequent issue with selinux on deployed OSP12 is reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1477720
Doesn't seem to reproduce any more after talking with QE.