Description of problem:

A paunch container has three systemd units associated with it:

1. tripleo_*.service - the regular systemd service generated by paunch.
2. libpod-conmon*.scope - created dynamically by podman. It runs a conmon process that creates a pidfile for tripleo_*.service and monitors it.
3. libpod-*.scope - created dynamically by runc, for cgroups accounting.

The liveness of the scopes is directly tied to that of the podman container started by tripleo_*.service. Moreover, paunch can only set start/stop dependencies on 1., not on 2. and 3.

On reboot, systemd is allowed to stop 2. or 3. at any time, so it can stop the container's scopes _before_ the tripleo_*.service itself. When such an unexpected stop sequence happens, the paunch service can be stopped before all the services it depends on (e.g. nova-compute can be stopped before nova-libvirt), and this can cause restart issues after reboot.

There is no option in podman to configure a scope so that it is not stopped before the paunch service is stopped. The only workaround so far is to inject an additional drop-in file for each scope, with extra dependencies that prevent systemd from stopping the scope before the paunch service is stopped.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-10.6.1-0.20190729150510.74ae8ba.el8ost.noarch

How reproducible:
Depends on systemd shutdown ordering

Steps to Reproduce:
1. Deploy an overcloud
2. Shut down a compute node

Actual results:
systemd may stop nova-compute before nova-libvirt while the former depends on the latter

Expected results:
The ordering should always be respected during shutdown

Additional info:
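To illustrate the drop-in workaround mentioned in the description, here is a minimal sketch and not the actual paunch fix. It assumes that the scopes are named libpod-conmon-<container id>.scope and libpod-<container id>.scope, that drop-ins placed under /etc/systemd/system/<unit>.d/ are honored for these scopes, and that a single Before= ordering on the paunch service is the "extra dependency" needed. The helper names and the drop-in file name are hypothetical.

```python
from typing import Dict, List

# Hypothetical drop-in file name; chosen for illustration only.
DROPIN_NAME = "10-stop-after-service.conf"


def scope_units(container_id: str) -> List[str]:
    """Return the two scope unit names assumed for a podman container."""
    return [
        f"libpod-conmon-{container_id}.scope",
        f"libpod-{container_id}.scope",
    ]


def build_dropins(container_id: str, service_unit: str) -> Dict[str, str]:
    """Map each drop-in path (relative to /etc/systemd/system) to its content.

    Before= is an ordering dependency. systemd applies the inverse order
    when both units are stopped, so the scope is stopped only after the
    service has been stopped.
    """
    content = f"[Unit]\nBefore={service_unit}\n"
    return {
        f"{scope}.d/{DROPIN_NAME}": content
        for scope in scope_units(container_id)
    }


if __name__ == "__main__":
    # "0123abcd" is a placeholder container ID, not a real one.
    dropins = build_dropins("0123abcd", "tripleo_nova_compute.service")
    for path, content in dropins.items():
        print(f"# /etc/systemd/system/{path}")
        print(content)
```

The sketch only renders the file contents. Writing them to disk would have to be followed by `systemctl daemon-reload`, and because the scope names are assumed to embed the container ID, the drop-ins would likely need to be regenerated whenever a container is recreated.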
Verified, correct ordering is seen during a shutdown:

[root@compute-0 ~]# journalctl -b -1 |grep 'libvirt\|nova'|tail -n 2
Aug 14 11:40:08 compute-0 systemd[1]: Stopped nova_libvirt container.
Aug 14 11:40:18 compute-0 systemd[1]: Stopped nova_compute container.
[root@compute-0 ~]# rpm -qa|grep paunch
paunch-services-4.5.1-0.20190802160541.d105c6e.el8ost.noarch
python3-paunch-4.5.1-0.20190802160541.d105c6e.el8ost.noarch
[root@compute-0 ~]# logout
[heat-admin@compute-0 ~]$ logout
Connection to 192.168.24.15 closed.
[stack@undercloud-0 ~]$ rpm -qa|grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-10.6.1-0.20190806190500.bdcffcd.el8ost.noarch
*** Bug 1710871 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811