Bug 1751300
Summary: | neutron-dhcp fail to spawn DHCP process for ctlplane network on the undercloud - container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"write /proc/self/attr/keycreate: permission denied\"" | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Harald Jensås <hjensas> | ||||||
Component: | openstack-selinux | Assignee: | Julie Pichon <jpichon> | ||||||
Status: | CLOSED ERRATA | QA Contact: | nlevinki <nlevinki> | ||||||
Severity: | urgent | Docs Contact: | |||||||
Priority: | urgent | ||||||||
Version: | 15.0 (Stein) | CC: | bcafarel, bfournie, dwalsh, elicohen, fiezzi, lhh, lvrabec, m.andre, mlammon, ohochman, pkomarov, sasha, sathlang, scohen, scorcora, zcaplovi | ||||||
Target Milestone: | rc | Keywords: | Regression, Triaged | ||||||
Target Release: | 15.0 (Stein) | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | openstack-selinux-0.8.20-0.20190912133707.089066f.el8ost | Doc Type: | If docs needed, set a value | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2019-09-21 11:24:31 UTC | Type: | Bug | ||||||
Bug Blocks: | 1751559 | ||||||||
Attachments: |
|
Description
Harald Jensås
2019-09-11 16:39:19 UTC
*** Bug 1750966 has been marked as a duplicate of this bug. ***

So node provisioning and cleaning will fail due to this issue. To reproduce, simply try to deploy the overcloud or to clean nodes on the undercloud. It seems to reproduce consistently.

Not sure if this libpod error is specific to the neutron-dhcp-agent or to podman. Using: podman-1.0.5-1.gitf604175.module+el8.0.0+4017+bbba319f.x86_64

Including networking DFG in case they have seen this.

Created attachment 1614258
neutron dhcp-agent.log
This may be SELinux related. In the logs captured on Bz1750966 I see this in /undercloud-0/var/log/audit/audit.log:

type=AVC msg=audit(1568145299.277:2945): avc: denied { create } for pid=77464 comm="runc:[2:INIT]" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=key permissive=0

$ sudo tail -f /var/log/audit/audit.log | grep 'avc: denied'
<< quiet no errors >>

Then run:

$ systemctl restart tripleo_neutron_dhcp.service

A few seconds later these show up:

type=AVC msg=audit(1568239901.789:19443): avc: denied { create } for pid=186828 comm="runc:[2:INIT]" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=key permissive=0
type=AVC msg=audit(1568239903.598:19446): avc: denied { create } for pid=186993 comm="runc:[2:INIT]" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=key permissive=0
type=AVC msg=audit(1568239905.370:19451): avc: denied { create } for pid=187192 comm="runc:[2:INIT]" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=key permissive=0
type=AVC msg=audit(1568239907.174:19452): avc: denied { create } for pid=187343 comm="runc:[2:INIT]" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=key permissive=0

Workaround:

$ setenforce permissive && systemctl restart tripleo_neutron_dhcp.service

The neutron DHCP server is up:

[stack@undercloud-0 ~]$ ps aux | grep dnsmasq | grep neutron
root 200345 0.0 0.0 85976 1816 ? Ssl 22:15 0:00 /usr/libexec/podman/conmon -s -c aa908de07fd88b7c37e4d2d9715e94f1ff51706ab62a93631029cffe673e53da -u aa908de07fd88b7c37e4d2d9715e94f1ff51706ab62a93631029cffe673e53da -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/aa908de07fd88b7c37e4d2d9715e94f1ff51706ab62a93631029cffe673e53da/userdata -p /var/run/containers/storage/overlay-containers/aa908de07fd88b7c37e4d2d9715e94f1ff51706ab62a93631029cffe673e53da/userdata/pidfile -l /var/log/containers/stdouts/neutron-dnsmasq-qdhcp-ad57e457-9aaf-4aed-8136-2f9e16583958.log --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg aa908de07fd88b7c37e4d2d9715e94f1ff51706ab62a93631029cffe673e53da --socket-dir-path /var/run/libpod/socket --log-level error
root 200358 0.0 0.0 4208 820 ? Ss 22:15 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/host --addn-hosts=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/opts --dhcp-leasefile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=256 --conf-file= --domain=localdomain
insights 200373 0.6 0.0 56868 4532 ? S 22:15 0:00 /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/host --addn-hosts=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/opts --dhcp-leasefile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=256 --conf-file= --domain=localdomain

It sounds like some labelling issue, and this rule would help:

allow spc_t unlabeled_t:key create;

Note that the rule is allowed on my laptop, I wonder if we need a new selinux or something.
Can you please run:

grep AVC /var/log/audit/audit.log | audit2allow -m spc_t > spc_t.te

On my laptop, it produces:

cat spc_t.te:
module spc_t 1.0;

require {
type spc_t;
type unlabeled_t;
class key create;
}

#============= spc_t ==============

#!!!! This avc is allowed in the current policy
allow spc_t unlabeled_t:key create;

Please run:
grep AVC /var/log/audit/audit.log | audit2allow -m spc_t

And reboot again, see if it helped.

Changing component as this looks like an SELinux issue.

(In reply to Emilien Macchi from comment #9)
> it sounds like some labelling issue, and this rule would help:
>
> allow spc_t unlabeled_t:key create;
>
> Note that the rule is allowed on my laptop, I wonder if we need a new
> selinux or something.
> Can you please run:
>
> grep AVC /var/log/audit/audit.log | audit2allow -m spc_t > spc_t.te
>

[root@undercloud-0 ~]# cat spc_t.te
module spc_t 1.0;

require {
type spc_t;
type system_dbusd_t;
type container_t;
type unlabeled_t;
class dbus send_msg;
class key create;
}

#============= container_t ==============
allow container_t system_dbusd_t:dbus send_msg;

#============= spc_t ==============
allow spc_t unlabeled_t:key create;

>
> Please run:
> grep AVC /var/log/audit/audit.log | audit2allow -m spc_t
>
> And reboot again, see if it helped.

[root@undercloud-0 ~]# grep AVC /var/log/audit/audit.log | audit2allow -M spc_t
******************** IMPORTANT ***********************
To make this policy package active, execute:

semodule -i spc_t.pp

[root@undercloud-0 ~]# semodule -i spc_t.pp
[root@undercloud-0 ~]# systemctl restart tripleo_neutron_dhcp.service
[root@undercloud-0 ~]# ps aux | grep dnsmasq | grep neutron
root 45416 0.0 0.0 85976 1828 ? Ssl 23:43 0:00 /usr/libexec/podman/conmon -s -c de74cc776c21899c7226f25e746c8cd93c7685bbb0b1e42d07e514eceb7aef27 -u de74cc776c21899c7226f25e746c8cd93c7685bbb0b1e42d07e514eceb7aef27 -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/de74cc776c21899c7226f25e746c8cd93c7685bbb0b1e42d07e514eceb7aef27/userdata -p /var/run/containers/storage/overlay-containers/de74cc776c21899c7226f25e746c8cd93c7685bbb0b1e42d07e514eceb7aef27/userdata/pidfile -l /var/log/containers/stdouts/neutron-dnsmasq-qdhcp-ad57e457-9aaf-4aed-8136-2f9e16583958.log --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg de74cc776c21899c7226f25e746c8cd93c7685bbb0b1e42d07e514eceb7aef27 --socket-dir-path /var/run/libpod/socket --log-level error
root 45428 0.1 0.0 4208 840 ? Ss 23:43 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/host --addn-hosts=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/opts --dhcp-leasefile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=256 --conf-file= --domain=localdomain
insights 45443 5.0 0.0 56868 4640 ? S 23:43 0:00 /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/host --addn-hosts=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/opts --dhcp-leasefile=/var/lib/neutron/dhcp/ad57e457-9aaf-4aed-8136-2f9e16583958/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=256 --conf-file= --domain=localdomain

It looks like the upstream fix was in container-selinux 2.100 (or maybe 2.109, looking at the tag): https://github.com/containers/container-selinux/commit/3b7818. I'll add the rule from comment 9 to openstack-selinux in the meantime.

Would it be possible to get a copy of the audit.log file when the command was run in permissive mode, to make sure we're not missing anything? Thank you.

Created attachment 1614449
Audit logs from reproducer - was running in permissive mode -> then rebooted -> then rules added to policy with audit2allow

The attached file should have it all. I hope it's easy to see when things switch between permissive=0 and permissive=1 in the logs.
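As a rough aid for reading a log like the attached one, here is a minimal sketch that condenses AVC denials to one short line each and then prints only the records where the permissive flag changes. The helper names `avc_summary` and `permissive_switches` are made up for illustration, and the parsing assumes the record layout quoted in this report (space-separated `key=value` fields, `user:role:type:level` contexts); other audit record types are ignored.

```shell
#!/bin/sh
# Hypothetical helper: summarize SELinux AVC denials read from stdin.
# Prints: <epoch> <permission(s)> <source_type> <target_type> <class> permissive=<0|1>
avc_summary() {
    awk '
    /type=AVC/ && /avc: +denied/ {
        ts = perm = src = tgt = cls = pm = "?"
        # The denied permission list sits between "{ " and " }".
        b = index($0, "{ "); e = index($0, " }")
        if (b > 0 && e > b) perm = substr($0, b + 2, e - b - 2)
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^msg=audit\(/) {
                # Keep only the whole-second part of the timestamp.
                ts = $i; sub(/^msg=audit\(/, "", ts); sub(/[.:].*$/, "", ts)
            }
            else if ($i ~ /^scontext=/)   { split($i, a, ":"); src = a[3] }
            else if ($i ~ /^tcontext=/)   { split($i, a, ":"); tgt = a[3] }
            else if ($i ~ /^tclass=/)     { cls = substr($i, 8) }
            else if ($i ~ /^permissive=/) { pm = substr($i, 12) }
        }
        print ts, perm, src, tgt, cls, "permissive=" pm
    }'
}

# Hypothetical helper: print the first summarized denial and every later one
# whose permissive flag differs from the previous denial.
permissive_switches() {
    avc_summary | awk '$NF != prev { print; prev = $NF }'
}
```

Usage would be along the lines of `avc_summary < /var/log/audit/audit.log` (as root), or `permissive_switches < audit.log` on a copy of the attachment.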
Hi, so I've applied that https://bugzilla.redhat.com/show_bug.cgi?id=1751559#c2 on an osp15 env during a "stuck" deployment and then it went on. I still have the env up and running if needed.

*** Bug 1751559 has been marked as a duplicate of this bug. ***

Thank you for the logs. I can only see the one spc_t / key create denial on the permissive run. The dbus denials appear unrelated, and didn't show up at all during the permissive run. I think PR #43 linked above will be enough to resolve the issue; Sofer is deploying another environment to confirm.

Hi, so Julie and I deployed an osp15 undercloud RC-0.9 (with podman 1.0.5 in it) and saw the overcloud deployment failure. We deleted the overcloud and applied this:

[root@undercloud-0 policy]# cat local.te
policy_module(local,1.1)

gen_require(`
type unlabeled_t;
type spc_t;
')

allow spc_t unlabeled_t:key manage_key_perms;

then

make -f /usr/share/selinux/devel/Makefile local.pp
semodule -i local.pp

We didn't restart any service and then retriggered an osp15 overcloud deployment, which went past the deployment of the server.
(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| 7dc5e609-6072-4b07-93d0-347f8fa8fef6 | compute-0    | 41d9d5c9-4d56-4278-9862-51a3b5f9a1de | power on    | active             | False       |
| 2c398bfb-faf3-4781-8957-c4c9c1d91c4e | controller-0 | 9a694ec5-d31a-4553-97e4-f36e51f188fa | power on    | active             | False       |
| 213558e2-1c9e-47ec-ba1a-0c2e3f301932 | controller-1 | 81bf05a0-9589-485c-a2f3-d20928cf39f0 | power on    | active             | False       |
| 42445b7b-f53e-4bf4-9cb8-45bdaf963302 | controller-2 | fd3ec193-e863-4419-b899-d94e82fa224a | power on    | active             | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+

Thanks,

This is fixed by an update to container-selinux.

Env: openstack-selinux-0.8.20-0.20190912133707.089066f.el8ost.noarch

I re-tested this with latest compose (RHOS_TRUNK-15.0-RHEL-8-20190913.n.3) and no longer see the issue. All nodes deployed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811
|
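For reference, the interim local policy module quoted earlier in this report could be staged with something like the sketch below. `write_local_te` and `print_install_steps` are hypothetical helper names; the policy text and the two commands are copied from the comment above. The steps are printed rather than executed, since running them needs root and the policy development Makefile (usually shipped in selinux-policy-devel, which is an assumption here).

```shell
#!/bin/sh
# Hypothetical helper: write the local policy source from this report into
# the given directory (default: current directory).
# The quoted heredoc delimiter keeps the m4 backtick and quote literal.
write_local_te() {
    dir=${1:-.}
    cat > "$dir/local.te" <<'EOF'
policy_module(local,1.1)

gen_require(`
    type unlabeled_t;
    type spc_t;
')

allow spc_t unlabeled_t:key manage_key_perms;
EOF
}

# Hypothetical helper: print (do not run) the build and install steps
# quoted in the report, to be run from the directory holding local.te.
print_install_steps() {
    echo "make -f /usr/share/selinux/devel/Makefile local.pp"
    echo "semodule -i local.pp"
}
```

Printing the steps instead of running them keeps the sketch safe to try on a machine without SELinux tooling. Once the fixed openstack-selinux package is installed, a local module like this should no longer be needed.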