During an OSP15 deployment, the import of instackenv.json hangs, even for hours. Looking at the failed systemd services, mistral-executor is down:

> (undercloud) [stack@undercloud-0 ~]$ systemctl --state=failed
>   UNIT                                         LOAD   ACTIVE SUB    DESCRIPTION
> ● NetworkManager-wait-online.service           loaded failed failed Network Manager Wait Online
> ● tripleo_mistral_executor.service             loaded failed failed mistral_executor container
> ● tripleo_mistral_executor_healthcheck.service loaded failed failed mistral_executor healthcheck
>
> LOAD   = Reflects whether the unit definition was properly loaded.
> ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
> SUB    = The low-level unit activation state, values depend on unit type.
>
> 3 loaded units listed. Pass --all to see loaded but inactive units, too.
> To show all installed unit files use 'systemctl list-unit-files'.

The output of journalctl -u tripleo_mistral_executor.service shows:

> Apr 10 12:39:50 undercloud-0.redhat.local podman[52332]: INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/mistral/mistral.conf to /etc/mistral/mistral.conf
> Apr 10 12:39:50 undercloud-0.redhat.local podman[52332]: INFO:__main__:Deleting /etc/my.cnf.d/tripleo.cnf
> Apr 10 12:39:50 undercloud-0.redhat.local podman[52332]: INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/tripleo.cnf to /etc/my.cnf.d/tripleo.cnf
> Apr 10 12:39:50 undercloud-0.redhat.local podman[52332]: INFO:__main__:Deleting /var/www/cgi-bin/mistral/app
> Apr 10 12:39:50 undercloud-0.redhat.local podman[52332]: INFO:__main__:Copying /var/lib/kolla/config_files/src/var/www/cgi-bin/mistral/app to /var/www/cgi-bin/mistral/app
> Apr 10 12:39:50 undercloud-0.redhat.local podman[52332]: ERROR:__main__:MissingRequiredSource: /var/lib/undercloud.conf file is not found
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: tripleo_mistral_executor.service: Main process exited, code=exited, status=1/FAILURE
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: tripleo_mistral_executor.service: Failed with result 'exit-code'.
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: tripleo_mistral_executor.service: Service RestartSec=100ms expired, scheduling restart.
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: tripleo_mistral_executor.service: Scheduled restart job, restart counter is at 29.
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: Stopped mistral_executor container.
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: tripleo_mistral_executor.service: Start request repeated too quickly.
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: tripleo_mistral_executor.service: Failed with result 'exit-code'.
> Apr 10 12:39:50 undercloud-0.redhat.local systemd[1]: Failed to start mistral_executor container.
> Apr 10 12:43:20 undercloud-0.redhat.local systemd[1]: mistral_executor container is not active.
> Apr 10 12:44:50 undercloud-0.redhat.local systemd[1]: mistral_executor container is not active.

In the output of podman image inspect mistral_executor, under "config_data": "volumes", I see \"/home/stack/undercloud.conf:/var/lib/undercloud.conf:ro\".
Switching SELinux to permissive mode allows the service to start successfully:

> [root@undercloud-0 ~]# systemctl start tripleo_mistral_executor
> [root@undercloud-0 ~]# systemctl status tripleo_mistral_executor
> ● tripleo_mistral_executor.service - mistral_executor container
>    Loaded: loaded (/etc/systemd/system/tripleo_mistral_executor.service; enabled; vendor preset: disabled)
>    Active: active (running) since Wed 2019-04-10 15:13:48 UTC; 3s ago
>   Process: 227926 ExecStop=/usr/bin/podman stop -t 10 mistral_executor (code=exited, status=0/SUCCESS)
>  Main PID: 237380 (podman)
>     Tasks: 14 (limit: 26213)
>    Memory: 21.5M
>    CGroup: /system.slice/tripleo_mistral_executor.service
>            └─237380 /usr/bin/podman start -a mistral_executor

audit.log then shows:

> type=AVC msg=audit(1554909228.545:11915): avc: denied { read } for pid=237729 comm="python" name="undercloud.conf" dev="vda1" ino=75552388 scontext=system_u:system_r:container_t:s0:c349,c825 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> type=AVC msg=audit(1554909228.545:11915): avc: denied { open } for pid=237729 comm="python" path="/var/lib/undercloud.conf" dev="vda1" ino=75552388 scontext=system_u:system_r:container_t:s0:c349,c825 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> type=AVC msg=audit(1554909228.545:11916): avc: denied { ioctl } for pid=237729 comm="python" path="/var/lib/undercloud.conf" dev="vda1" ino=75552388 ioctlcmd=0x5401 scontext=system_u:system_r:container_t:s0:c349,c825 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> type=AVC msg=audit(1554909228.545:11917): avc: denied { relabelto } for pid=237729 comm="python" name="undercloud.conf" dev="vda1" ino=10041677 scontext=system_u:system_r:container_t:s0:c349,c825 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> type=AVC msg=audit(1554909228.545:11918): avc: denied { setattr } for pid=237729 comm="python" name="undercloud.conf" dev="vda1" ino=10041677 scontext=system_u:system_r:container_t:s0:c349,c825 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1

Some of the packages and images involved:

> container-selinux-2.75-1.git99e2cfd.module+el8+2769+577ad176.noarch
> libselinux-2.8-6.el8.x86_64
> libselinux-ruby-2.8-6.el8.x86_64
> libselinux-utils-2.8-6.el8.x86_64
> openstack-selinux-0.8.18-0.20190329040328.4c5ed0f.el8ost.noarch
> openvswitch-selinux-extra-policy-1.0-10.el8fdb.noarch
> python3-libselinux-2.8-6.el8.x86_64
> rpm-plugin-selinux-4.14.2-9.el8.x86_64
> selinux-policy-3.14.1-61.el8.noarch
> selinux-policy-targeted-3.14.1-61.el8.noarch
> ansible-role-tripleo-modify-image-1.0.1-0.20190402220346.012209a.el8ost.noarch
> ansible-tripleo-ipsec-9.0.1-0.20190220162047.f60ad6c.el8ost.noarch
> openstack-tripleo-common-10.6.2-0.20190408160359.d8fded9.el8ost.noarch
> openstack-tripleo-common-containers-10.6.2-0.20190408160359.d8fded9.el8ost.noarch
> openstack-tripleo-heat-templates-10.4.1-0.20190409050352.58ff7df.el8ost.noarch
> openstack-tripleo-image-elements-10.3.1-0.20190325204940.253fe88.el8ost.noarch
> openstack-tripleo-puppet-elements-10.2.1-0.20190408131411.a72c6b3.el8ost.noarch
> openstack-tripleo-validations-10.3.1-0.20190404130349.6ecfb48.el8ost.noarch
> puppet-mistral-14.4.1-0.20190328231250.f9e938d.el8ost.noarch
> puppet-tripleo-10.3.1-0.20190405000342.566703d.el8ost.noarch
> python3-mistral-lib-1.1.0-0.20190312192103.bac92db.el8ost.noarch
> python3-mistralclient-3.8.1-0.20190318115402.0cd6b28.el8ost.noarch
> python3-tripleo-common-10.6.2-0.20190408160359.d8fded9.el8ost.noarch
> python3-tripleoclient-11.3.1-0.20190409084327.be9b7ef.el8ost.noarch
> python3-tripleoclient-heat-installer-11.3.1-0.20190409084327.be9b7ef.el8ost.noarch
>
> [root@undercloud-0 ~]# podman images|grep -i mistr
> 192.168.24.1:8787/rhosp15/openstack-mistral-event-engine  20190409.1  250f01a40e46  18 hours ago  989 MB
> 192.168.24.1:8787/rhosp15/openstack-mistral-api           20190409.1  d063f718d5e5  18 hours ago  1.01 GB
> 192.168.24.1:8787/rhosp15/openstack-mistral-engine        20190409.1  a78d46f6594b  19 hours ago  989 MB
> 192.168.24.1:8787/rhosp15/openstack-mistral-executor      20190409.1  68c1f09c2bfa  19 hours ago  1.23 GB
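When triaging a long audit.log, it can help to collapse the AVC records into one line per distinct denial. The following is only a minimal sketch: summarize_avc is a hypothetical helper name, not an existing tool, and it assumes records in the same format as the ones quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: reduce AVC denial records read on stdin to
# "permission source_type target_type class", one line per distinct tuple.
# Lines that are not AVC denials are silently ignored.
summarize_avc() {
    sed -n 's/.*avc: *denied *{ *\([^}]*[^ }]\) *}.*scontext=[^:]*:[^:]*:\([^:]*\):.*tcontext=[^:]*:[^:]*:\([^: ]*\)[: ].*tclass=\([^ ]*\).*/\1 \2 \3 \4/p' | sort -u
}
```

It could be used as `summarize_avc < /var/log/audit/audit.log`, which makes it easier to see that every denial here has container_t as the source type and user_home_t as the target type.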
I thought perhaps the undercloud.conf file keeps its original home directory SELinux context when mounted, when it should switch to a mistral or container-specific context. However, AlistairT ran ls -Z in an environment where this works in enforcing mode, and it appears that this should work even with the unconfined user_home_t context:

podman exec -u root mistral_executor bash
ls -lZ /var/lib/undercloud.conf
-rwxr-xr-x. 1 1001 1001 unconfined_u:object_r:user_home_t:s0 891 Apr 4 10:54 /var/lib/undercloud.conf

Investigating further. I doubt we want to give containers read access to the home directory in general.
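To compare labels between a working and a failing environment, the type field can be pulled out of the context string. This is a rough sketch: selinux_type and has_container_label are hypothetical helper names, and the check assumes container_file_t is the type that container-selinux uses for content shared with containers. It reports only whether the label matches that type; it does not prove that access with another label is denied.

```shell
#!/bin/sh
# Print the type field (third colon-separated field) of an SELinux context.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Return 0 when the context carries the container_file_t type
# (assumed here to be the label expected for container-shared files).
has_container_label() {
    [ "$(selinux_type "$1")" = "container_file_t" ]
}
```

For example, `has_container_label "$(stat -c %C /home/stack/undercloud.conf)"` would return non-zero for the user_home_t label shown above.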
A) Enforcing on

# 68c1f09c2bfa is the mistral image
podman run -it --rm --user=root --net=host -e KOLLA_INSTALL_METATYPE=rhos -e KOLLA_INSTALL_TYPE=binary \
-e KOLLA_BASE_DISTRO=rhel -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS -e KOLLA_DISTRO_PYTHON_VERSION=3.6 \
-v /home/stack/undercloud.conf:/var/lib/undercloud.conf \
-v /var/lib/kolla/config_files/mistral_executor.json:/var/lib/kolla/config_files/config.json \
-v /var/lib/config-data/puppet-generated/mistral/:/var/lib/kolla/config_files/src 68c1f09c2bfa sh

[root@undercloud-0 ~]# sh x.sh
()[root@undercloud-0 /]$ kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
....snip....
INFO:__main__:Copying /var/lib/kolla/config_files/src/var/www/cgi-bin/mistral/app to /var/www/cgi-bin/mistral/app
ERROR:__main__:MissingRequiredSource: /var/lib/undercloud.conf file is not found

The error is a bit misleading because the file is actually there:

()[root@undercloud-0 /]$ ls -1 /var/lib/ |grep -i undercloud.conf
undercloud.conf

The problem is that we cannot access it:

()[root@undercloud-0 /]$ ls -lZ /var/lib/undercloud.conf
ls: cannot access '/var/lib/undercloud.conf': Permission denied

[root@undercloud-0 ~]# ls -ldZ /home/stack/ ; ls -lZ /home/stack/undercloud.conf
drwx------. 9 stack stack unconfined_u:object_r:user_home_dir_t:s0 4096 Apr 10 11:06 /home/stack/
-rwxr-xr-x. 1 stack stack unconfined_u:object_r:user_home_t:s0 891 Apr 10 10:23 /home/stack/undercloud.conf

The denials I see around this are all about dbus and sudo (which is likely https://bugs.launchpad.net/tripleo/+bug/1819461), so I am not sure they are relevant (?):

type=AVC msg=audit(1554910473.290:10321): avc: denied { connectto } for pid=130205 comm="sudo" path="/run/dbus/system_bus_socket" scontext=system_u:system_r:container_t:s0:c363,c968 tcontext=system_u:system_r:system_dbusd_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0
type=AVC msg=audit(1554910473.293:10325): avc: denied { connectto } for pid=130205 comm="sudo" path="/run/dbus/system_bus_socket" scontext=system_u:system_r:container_t:s0:c363,c968 tcontext=system_u:system_r:system_dbusd_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0

B) Enforcing off

podman run -it --user=root --rm --net=host -e KOLLA_INSTALL_METATYPE=rhos -e KOLLA_INSTALL_TYPE=binary \
-e KOLLA_BASE_DISTRO=rhel -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS -e KOLLA_DISTRO_PYTHON_VERSION=3.6 \
-v /home/stack/undercloud.conf:/var/lib/undercloud.conf \
-v /var/lib/kolla/config_files/mistral_executor.json:/var/lib/kolla/config_files/config.json \
-v /var/lib/config-data/puppet-generated/mistral/:/var/lib/kolla/config_files/src 68c1f09c2bfa sh

()[root@undercloud-0 /]$ ls -lZ /var/lib/undercloud.conf
-rwxr-xr-x. 1 1001 1001 unconfined_u:object_r:user_home_t:s0 891 Apr 10 14:23 /var/lib/undercloud.conf

Seems to work correctly.

What I do not fully understand is why we do not see any denials specific to these ls -l commands run as root user (without the sudo)?
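The misleading "file is not found" message comes from the difference between a name being present in its directory and the file being accessible. A small sketch of how the two cases could be told apart from inside the container follows; classify_path is a hypothetical helper, not part of kolla.

```shell
#!/bin/sh
# Hypothetical helper: classify a path as
#   accessible   - stat on the path succeeds
#   inaccessible - the name is listed in the parent directory,
#                  but the path itself cannot be stat'ed
#   missing      - the name is not listed in the parent directory
classify_path() {
    path=$1
    if [ -e "$path" ]; then
        echo accessible
    elif ls -1a "$(dirname "$path")" 2>/dev/null | grep -qxF -- "$(basename "$path")"; then
        echo inaccessible
    else
        echo missing
    fi
}
```

Under the enforcing-mode conditions shown in A), `classify_path /var/lib/undercloud.conf` would be expected to print "inaccessible", since the directory listing shows the name while ls on the file itself gets Permission denied.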
Julie++'s suggestion seems to have worked:

1) Edited /var/lib/tripleo-config/hashed-container-startup-config-step_4.json and added ',z' to the mistral_executor stanza for the undercloud.conf file only:

"/home/stack/undercloud.conf:/var/lib/undercloud.conf:ro,z",

2) Ran paunch again for step4

paunch --debug apply --default-runtime podman --file /var/lib/tripleo-config/hashed-container-startup-config-step_4.json --config-id tripleo_step4 --managed-by tripleo-Undercloud 2>&1 | tee /tmp/paunch.log

3) mistral_executor is up and running:

[root@undercloud-0 ~]# podman ps |grep mistral_ex
e220c54e027d 192.168.24.1:8787/rhosp15/openstack-mistral-executor:20190409.1 dumb-init --singl... About a minute ago Up About a minute ago mistral_executor
[root@undercloud-0 ~]# podman logs mistral_executor 2>&1|tail -n10
+ . kolla_extend_start
++ [[ ! -d /var/log/kolla/mistral ]]
++ mkdir -p /var/log/kolla/mistral
+++ stat -c %a /var/log/kolla/mistral
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/mistral
Running command: '/usr/bin/mistral-server --config-file=/etc/mistral/mistral.conf --log-file=/var/log/mistral/executor.log --server=executor'
++ . /usr/local/bin/kolla_mistral_extend_start
+ echo 'Running command: '\''/usr/bin/mistral-server --config-file=/etc/mistral/mistral.conf --log-file=/var/log/mistral/executor.log --server=executor'\'''
+ exec /usr/bin/mistral-server --config-file=/etc/mistral/mistral.conf --log-file=/var/log/mistral/executor.log --server=executor
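The manual edit in step 1 amounts to appending the z relabel option to one volume spec. As an illustration only, a hypothetical helper (add_relabel_option, not part of paunch or tripleo) could perform that transformation on a single "src:dest[:opts]" string. It assumes neither path contains a colon.

```shell
#!/bin/sh
# Hypothetical helper: append the "z" relabel option to a
# "src:dest[:opts]" volume spec unless z or Z is already present.
# Assumes the source and destination paths contain no colons.
add_relabel_option() {
    spec=$1
    case "$spec" in
        *:*:*)
            # An options field exists; inspect the text after the last colon.
            opts=${spec##*:}
            case ",$opts," in
                *,z,*|*,Z,*) printf '%s\n' "$spec" ;;
                *)           printf '%s,z\n' "$spec" ;;
            esac
            ;;
        *)
            # No options field yet; add one.
            printf '%s:z\n' "$spec"
            ;;
    esac
}
```

Calling `add_relabel_option /home/stack/undercloud.conf:/var/lib/undercloud.conf:ro` yields the same string that was added by hand above. With podman, the lowercase z option asks for the bind-mounted content to be relabeled so that it can be shared between containers.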
*** Bug 1698540 has been marked as a duplicate of this bug. ***
Node import passed successfully, and no mistral units appear in the list of failed services.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811