Bug 1726442
| Summary: | SIGTERM from systemd to containers/conmon on shutdown causes unexpected results | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Damien Ciabrini <dciabrin> |
| Component: | podman | Assignee: | Jindrich Novy <jnovy> |
| Status: | CLOSED ERRATA | QA Contact: | Alex Jia <ajia> |
| Severity: | low | Priority: | unspecified |
| Version: | 8.3 | CC: | bbaude, dciabrin, dornelas, dwalsh, emacchi, gscrivan, jligon, jnovy, lsm5, mheon, michele, msekleta, pthomas, tsweeney, vrothber, ypu |
| Target Milestone: | rc | Keywords: | Reopened, Triaged |
| Target Release: | 8.4 | Hardware: | Unspecified |
| OS: | Unspecified | Type: | Bug |
| Fixed In Version: | podman-3.0.0-2.el8 or newer | Last Closed: | 2021-05-18 15:32:02 UTC |
| Bug Blocks: | 1727325 | | |
Comment 1
Damien Ciabrini
2019-07-02 21:04:01 UTC
Potential partial solution at https://github.com/containers/libpod/pull/3474 (unverified at present).

So I'm trying to come up with a small reproducer that mimics the way we set up our containers in OpenStack. I'm not convinced I have a valid reproducer yet, but this is what I have right now:

```
### create a container that mimics how our cluster manager manages podman
### containers (i.e. without systemd)
podman create --name=service_a -d --net=host fedora sleep infinity

### create another container that mimics how the other containers (the
### majority of OpenStack containers) are spawned and monitored by systemd
podman create --name=service_b --conmon-pidfile=/var/run/service_b.pid -d --net=host fedora sleep infinity

### create the systemd services to mimic what is done in OpenStack
### - service_a is our cluster manager, which creates podman containers
###   without systemd or service files
### - service_b is a regular OpenStack podman container, managed by systemd
### - mid_service is a dummy service that serves as a synchronization point to
###   ensure that OpenStack services (here service_b) always stop before
###   containers managed by our cluster manager (here service_a)
cd /etc/systemd/system

cat >service_a.service <<'EOF'
[Unit]
Description=service A

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/podman start service_a
ExecStop=/usr/bin/podman stop -t 10 service_a

[Install]
WantedBy=multi-user.target
EOF

cat >mid_service.service <<'EOF'
[Unit]
Description=Mid-service time checkpoint
After=service_a.service
Before=shutdown.target
RefuseManualStop=yes

[Service]
Type=oneshot
ExecStart=/bin/true
RemainAfterExit=yes
ExecStop=/bin/true

[Install]
WantedBy=multi-user.target
EOF

cat >service_a.service <<'EOF'
[Unit]
Description=service B
After=mid_service.service

[Service]
Restart=always
ExecStart=/usr/bin/podman start service_b
ExecStop=/bin/sh -c "sleep 10 && echo 'arbitrary long sleep before stop' && /usr/bin/podman stop -t 10 service_b"
KillMode=none
Type=forking
PIDFile=/var/run/service_b.pid

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload

### observe that service B starts after service A
systemctl enable service_a service_b mid_service --now
```
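Before exercising the stop path, the intended ordering can be sanity-checked by asking systemd directly. This check is illustrative and not part of the original reproducer; it assumes the corrected file names, since as written above the service B unit is accidentally saved to service_a.service (see the QA comment near the end of this bug).

```
# list the units that systemd orders *before* service_b; mid_service (and,
# transitively, service_a) should appear in the output
systemctl list-dependencies --after service_b.service
```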
```
### observe that when stopping service B, there's a configured 10s delay
### before the container effectively stops
systemctl stop service_b

Jul 03 13:44:42 controller-0 systemd[1]: Stopping service B...
Jul 03 13:44:52 controller-0 sh[101465]: arbitrary long sleep before stop
Jul 03 13:45:03 controller-0 systemd[1]: libpod-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope: Consumed 24ms CPU time
Jul 03 13:45:03 controller-0 sh[101465]: 5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da
Jul 03 13:45:03 controller-0 systemd[1]: Stopped service B.

### restart service_b before last test
systemctl restart service_b

### observe that when rebooting, the conmon scope for service A can be stopped
### by systemd even though service B hasn't fully stopped yet
[root@controller-0 ~]# podman ps
CONTAINER ID  IMAGE                            COMMAND         CREATED      STATUS                PORTS  NAMES
[...a few containers in my env...]
5dcf0c985fca  docker.io/library/fedora:latest  sleep infinity  2 hours ago  Up 59 seconds ago            service_b
16090b176250  docker.io/library/fedora:latest  sleep infinity  2 hours ago  Up About an hour ago         service_a
[...other containers in my env...]

[root@controller-0 ~]# systemctl -a | grep -e 5dcf0c985fca -e 16090b176250
var-lib-containers-storage-overlay\x2dcontainers-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9-userdata-shm.mount loaded active mounted /var/lib/containers/storage/overlay-containers/16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9/userdata/shm
var-lib-containers-storage-overlay\x2dcontainers-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da-userdata-shm.mount loaded active mounted /var/lib/containers/storage/overlay-containers/5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da/userdata/shm
libpod-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.scope loaded active running libcontainer container 16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9
libpod-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope loaded active running libcontainer container 5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da
libpod-conmon-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.scope loaded active running libpod-conmon-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.scope
libpod-conmon-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope loaded active running libpod-conmon-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope

reboot

### after reboot, show the sequence and observe that container 16090b176250,
### spawned by service A, got stopped by systemd before service B finished
### stopping
[root@controller-0 ~]# journalctl --since 12:59:56 -t systemd -t sh | grep -e service -e 5dcf0c985fca -e 16090b176250 -e eboot
Jul 03 12:59:56 controller-0 systemd[1]: local-fs.target: Found dependency on systemd-tmpfiles-setup.service/stop
Jul 03 12:59:56 controller-0 systemd[1]: Stopping libpod-conmon-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.scope.
Jul 03 12:59:56 controller-0 systemd[1]: Stopping libcontainer container 16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.
Jul 03 12:59:56 controller-0 systemd[1]: Stopping libpod-conmon-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope.
Jul 03 12:59:56 controller-0 systemd[1]: Stopping service B...
Jul 03 12:59:56 controller-0 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping.
Jul 03 12:59:57 controller-0 systemd[1]: Stopping libcontainer container 5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.
Jul 03 12:59:57 controller-0 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping.
Jul 03 12:59:57 controller-0 systemd[1]: dnf-makecache.service: Main process exited, code=killed, status=15/TERM
Jul 03 12:59:57 controller-0 systemd[1]: dnf-makecache.service: Failed with result 'signal'.
Jul 03 12:59:57 controller-0 systemd[1]: Starting Show Plymouth Reboot Screen...
Jul 03 12:59:57 controller-0 systemd[1]: Started Show Plymouth Reboot Screen.
Jul 03 12:59:57 controller-0 systemd[1]: Stopped target NFS client services.
Jul 03 13:00:17 controller-0 systemd[1]: Stopped libcontainer container 5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.
Jul 03 13:00:17 controller-0 systemd[1]: libpod-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope: Consumed 27ms CPU time
Jul 03 13:00:17 controller-0 systemd[1]: Unmounted /var/lib/containers/storage/overlay-containers/5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da/userdata/shm.
Jul 03 13:00:17 controller-0 sh[27825]: 5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da
Jul 03 13:00:17 controller-0 systemd[1]: Stopped service B.
Jul 03 13:00:17 controller-0 systemd[1]: Stopping Mid-service time checkpoint...
Jul 03 13:00:17 controller-0 systemd[1]: Stopped libpod-conmon-5dcf0c985fcac4558bd6290f3ef6fecbd0e40a0e2ca67a53839ce57c5069b3da.scope.
Jul 03 13:00:17 controller-0 systemd[1]: Stopped Mid-service time checkpoint.
Jul 03 13:00:17 controller-0 systemd[1]: Stopping service A...
Jul 03 13:00:27 controller-0 systemd[1]: Stopped libcontainer container 16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.
Jul 03 13:00:27 controller-0 systemd[1]: libpod-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.scope: Consumed 32ms CPU time
Jul 03 13:00:27 controller-0 systemd[1]: Unmounted /var/lib/containers/storage/overlay-containers/16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9/userdata/shm.
Jul 03 13:00:27 controller-0 systemd[1]: Stopped service A.
Jul 03 13:00:27 controller-0 systemd[1]: Stopped libpod-conmon-16090b176250e8e9d12d41ac92773d17a3011d1d5da5a7b4466d651b705c30d9.scope.
Jul 03 13:00:27 controller-0 systemd[1]: Starting Reboot...
-- Reboot --
```

It looks like systemd doesn't honor the ordering and begins stopping conmon for the podman containers of service_a and service_b even though the systemd service for service_b hasn't stopped yet. I don't know how valid that reproducer is, because ultimately I still see the log "Stopped libcontainer container 16090b1..." after the log "Stopped service B."

I'd still like to confirm or disprove the theory that conmon sends SIGTERM to our containers at an unexpected time during reboot, by getting an unambiguous reboot log sequence. But this is essentially the sequence of events that is happening in our OpenStack env:

```
# before reboot, rabbitmq monitor is ok
Jul 03 14:01:43 controller-0 rabbitmq-cluster(rabbitmq)[138382]: DEBUG: rabbitmq monitor :

# a reboot is started, and conmon for the rabbitmq container begins to stop
Jul 03 14:01:51 controller-0 systemd[1]: Stopping libpod-conmon-85b5a880b211a9fd0346166d30383c6fd8f2ad5b5ee0a20970f0d1158be26e43.scope.

# the main pid in the rabbitmq container detects that it got requested to terminate
Jul 03 14:01:51 controller-0 pacemaker-remoted[3954]: notice: Caught 'Terminated' signal

# only now do the regular systemd-managed openstack services stop
# (i.e. rabbitmq shouldn't stop before horizon per systemd dependencies)
Jul 03 14:01:51 controller-0 systemd[1]: Stopping horizon container...
Jul 03 14:01:51 controller-0 pacemaker-controld[2623]: notice: rabbitmq-bundle-0 requested shutdown of its remote connection
Jul 03 14:02:01 controller-0 systemd[1]: Stopped horizon container.
Jul 03 14:02:01 controller-0 systemd[1]: Stopping Paunch Container Shutdown...
Jul 03 14:02:01 controller-0 systemd[1]: Stopped Paunch Container Shutdown.
Jul 03 14:02:01 controller-0 pacemakerd[2596]: notice: Caught 'Terminated' signal

# only now does our cluster manager begin to stop
Jul 03 14:02:01 controller-0 systemd[1]: Stopping Pacemaker High Availability Cluster Manager...
Jul 03 14:02:01 controller-0 pacemakerd[2596]: notice: Shutting down Pacemaker
Jul 03 14:02:01 controller-0 pacemakerd[2596]: notice: Stopping pacemaker-controld
Jul 03 14:02:01 controller-0 pacemaker-controld[2623]: notice: Caught 'Terminated' signal
[...]
# and apparently the rabbitmq process in the rabbitmq container already got stopped
Jul 03 14:02:16 controller-0 rabbitmq-cluster(rabbitmq)[140011]: INFO: RabbitMQ server is not running
Jul 03 14:02:16 controller-0 rabbitmq-cluster(rabbitmq)[140016]: DEBUG: rabbitmq stop : 0
Jul 03 14:02:16 controller-0 pacemaker-remoted[3954]: notice: rabbitmq_stop_0:87492:stderr [ Error: unable to perform an operation on node 'rabbit@controller-0'. Please see diagnostics information and suggestions below. ]

# only then does the rabbitmq container get stopped
Jul 03 14:02:16 controller-0 systemd[1]: Stopped libcontainer container 85b5a880b211a9fd0346166d30383c6fd8f2ad5b5ee0a20970f0d1158be26e43.
Jul 03 14:02:16 controller-0 systemd[1]: libpod-85b5a880b211a9fd0346166d30383c6fd8f2ad5b5ee0a20970f0d1158be26e43.scope: Consumed 9min 41.625s CPU time
Jul 03 14:02:16 controller-0 systemd[1]: Unmounted /var/lib/containers/storage/overlay-containers/85b5a880b211a9fd0346166d30383c6fd8f2ad5b5ee0a20970f0d1158be26e43/userdata/shm.
Jul 03 14:02:16 controller-0 systemd[1]: Unmounted /var/lib/containers/storage/overlay/41beba6aded9ddfd4d89570f2831d2349b784483904fc54983756e2f7627389a/merged.
Jul 03 14:02:16 controller-0 systemd[1]: Stopped libpod-conmon-85b5a880b211a9fd0346166d30383c6fd8f2ad5b5ee0a20970f0d1158be26e43.scope.
```

I've tested Damien's solution, did a reboot, and here are the logs: http://ix.io/1OKd

We can see that both rabbitmq & galera are stopped *after* the non-HA containers. I think this is a viable option, given that we have no other alternative at this time.

FTR, until we have a programmatic way to configure those dependencies in podman, here are the two workarounds that we've implemented for OpenStack:

1. cluster-managed podman containers: https://bugzilla.redhat.com/show_bug.cgi?id=1738303
2. systemd-managed podman containers: https://bugzilla.redhat.com/show_bug.cgi?id=1737036

Moving this out to RHEL 8.2 since it has a lower priority and will not be fixed for the 8.1 release.

Let's push this to the 8.3 release.

Matt and Valentin, has this been fixed in podman 3.0?

I believe Giuseppe's cgroups=split patch may have provided a resolution to systemd shutdown ordering, but I'm insufficiently familiar with systemd's shutdown process to be sure, and I've never tested this. I think we need some expertise from the systemd team. Pulling in, Michal.

@Michal: I will summarize the issue quickly and describe what I am seeing in my local reproducers. Assume we have two units (A and B). Both are generated via `podman generate systemd`. Unit B is set "After=A.service". When I stop the units, B is stopped before A; the journal clearly indicates that `Stopping A` happens after `Stopped B`. Now when I reboot the machine (systemctl reboot), the stop order changes: A and B are stopped simultaneously. As suggested above, adding a `sleep` before `podman stop` makes that easier to see. Can you give guidance on how to enforce the ordering at shutdown/reboot?
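To make the setup concrete, here is a minimal sketch of what the two units described above could look like. The names, PID-file paths, and unit details are illustrative rather than the exact `podman generate systemd` output; the only detail that matters for the bug is B's `After=A.service` ordering.

```
# A.service (sketch)
[Unit]
Description=Podman container A

[Service]
Type=forking
Restart=on-failure
PIDFile=/run/A-conmon.pid
ExecStart=/usr/bin/podman start A
ExecStop=/usr/bin/podman stop -t 10 A
KillMode=none

[Install]
WantedBy=multi-user.target

# B.service (sketch) -- identical except for the name and the explicit ordering
[Unit]
Description=Podman container B
After=A.service

[Service]
Type=forking
Restart=on-failure
PIDFile=/run/B-conmon.pid
ExecStart=/usr/bin/podman start B
ExecStop=/usr/bin/podman stop -t 10 B
KillMode=none

[Install]
WantedBy=multi-user.target
```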
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

I am going to reopen this, since the bug is serious.

Michal, Giuseppe, Dan and I had a debugging session on Friday. We looked specifically at the shutdown scenario where the stop order among the Podman services seemingly changed. The problem lies in the systemd scope that Podman creates and runs the container in. Systemd does not know that the scope relates to the unit Podman runs in, so as soon as the shutdown starts, such "orphaned" scopes are cleaned up by systemd. This means the container is killed, but note that conmon is not involved. Once the services are about to stop, conmon fails immediately since the container has already been killed.

We want to explore two solutions:

1) Create a split-mode for Cgroups v1.
2) Let Podman send dbus messages to inform systemd about the scope.

Another simple workaround is to use `podman create/run --cgroups=disabled`. This way, the container runs in the unit's Cgroup.
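For illustration, a minimal sketch of a unit using the `--cgroups=disabled` workaround just mentioned; the unit and container names are hypothetical, and this runs the container in the foreground rather than via conmon pidfile tracking:

```
# myapp.service (sketch): with --cgroups=disabled the container processes stay
# in this unit's own cgroup rather than in a detached libpod-*.scope, so
# systemd tracks them as part of the service and applies its normal stop
# ordering at shutdown.
[Unit]
Description=myapp container

[Service]
ExecStart=/usr/bin/podman run --rm --cgroups=disabled --name myapp registry.fedoraproject.org/fedora sleep infinity
ExecStop=/usr/bin/podman stop -t 10 myapp

[Install]
WantedBy=multi-user.target
```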
(In reply to Damien Ciabrini from comment #6)
> So I'm trying to come up with a small reproducer that mimics the way we set
> up our containers in OpenStack.
> I'm not convinced I have a valid reproducer yet, but this is what I have
> right now:

Hi Damien, I tried to use your reproducer to verify this bug on podman-3.0.1-1.module+el8.4.0+10073+30e5ea69 w/ crun-0.18-1.module+el8.4.0+10073+30e5ea69. The following is my test output; please help confirm whether it is enough for you, thanks!

```
[root@ibm-x3650m4-01-vm-02 system]# systemctl daemon-reload
[root@ibm-x3650m4-01-vm-02 system]# systemctl enable service_a service_b mid_service --now
Created symlink /etc/systemd/system/multi-user.target.wants/service_a.service → /etc/systemd/system/service_a.service.
Created symlink /etc/systemd/system/multi-user.target.wants/service_b.service → /etc/systemd/system/service_b.service.
Created symlink /etc/systemd/system/multi-user.target.wants/mid_service.service → /etc/systemd/system/mid_service.service.

[root@ibm-x3650m4-01-vm-02 system]# systemctl stop service_b
NOTE: there's a configured 10s delay in here.

[root@ibm-x3650m4-01-vm-02 system]# systemctl restart service_b
[root@ibm-x3650m4-01-vm-02 system]# podman ps
CONTAINER ID  IMAGE                                     COMMAND         CREATED        STATUS             PORTS  NAMES
f9ef0d2a2255  registry.fedoraproject.org/fedora:latest  sleep infinity  6 minutes ago  Up 58 seconds ago         service_a
c13ade2a0a18  registry.fedoraproject.org/fedora:latest  sleep infinity  6 minutes ago  Up 10 seconds ago         service_b

[root@ibm-x3650m4-01-vm-02 system]# systemctl -a | grep -e f9ef0d2a2255 -e c13ade2a0a18
var-lib-containers-storage-overlay\x2dcontainers-c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6-userdata-shm.mount loaded active mounted /var/lib/containers/storage/overlay-containers/c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6/userdata/shm
var-lib-containers-storage-overlay\x2dcontainers-f9ef0d2a22558b1d8d667043ab648f6e8dc22396d513fd72d34c8f92f8303327-userdata-shm.mount loaded active mounted /var/lib/containers/storage/overlay-containers/f9ef0d2a22558b1d8d667043ab648f6e8dc22396d513fd72d34c8f92f8303327/userdata/shm
libpod-c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6.scope loaded active running libcrun container
libpod-f9ef0d2a22558b1d8d667043ab648f6e8dc22396d513fd72d34c8f92f8303327.scope loaded active running libcrun container
NOTE: no libpod-conmon-xxx is found in here.

[root@ibm-x3650m4-01-vm-02 system]# journalctl --since 10:40:00 -t systemd -t sh | grep -e service -e f9ef0d2a2255 -e c13ade2a0a18
Feb 23 10:46:42 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Starting service A...
Feb 23 10:46:42 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Starting service B...
Feb 23 10:46:42 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Started service A.
Feb 23 10:46:42 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Starting Mid-service time checkpoint...
Feb 23 10:46:42 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Started Mid-service time checkpoint.
Feb 23 10:46:42 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Started service B.
Feb 23 10:46:45 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Stopping service B...
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: libpod-c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6.scope: Succeeded.
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[49951]: var-lib-containers-storage-overlay\x2dcontainers-c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6-userdata-shm.mount: Succeeded.
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: var-lib-containers-storage-overlay\x2dcontainers-c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6-userdata-shm.mount: Succeeded.
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[5100]: var-lib-containers-storage-overlay\x2dcontainers-c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6-userdata-shm.mount: Succeeded.
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com sh[67534]: c13ade2a0a1889869826dec21a221daacf2058acc2a32cd2729f2bdd465427e6
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: service_b.service: Succeeded.
Feb 23 10:47:06 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Stopped service B.
Feb 23 10:47:30 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Starting service B...
Feb 23 10:47:30 ibm-x3650m4-01-vm-02.ibm2.lab.eng.bos.redhat.com systemd[1]: Started service B.
```

NOTE: I ran 'systemctl daemon-reload' and 'systemctl enable service_a service_b mid_service --now' two times, because there is a typo in your reproducer (there are two 'service_a.service' heredocs and no 'service_b.service').

I also ran a test of the '--cgroups=split' command option:

```
[root@ibm-x3650m4-01-vm-02 ~]# podman run --rm --cgroups=split quay.io/libpod/alpine cat /proc/self/cgroup
Trying to pull quay.io/libpod/alpine:latest...
Getting image source signatures
Copying blob 9d16cba9fb96 done
Copying config 9617696764 done
Writing manifest to image destination
Storing signatures
12:hugetlb:/user.slice/user-0.slice/session-11.scope/container
11:rdma:/
10:devices:/user.slice/user-0.slice/session-11.scope/container
9:memory:/user.slice/user-0.slice/session-11.scope/container
8:freezer:/user.slice/user-0.slice/session-11.scope/container
7:cpu,cpuacct:/user.slice/user-0.slice/session-11.scope/container
6:net_cls,net_prio:/user.slice/user-0.slice/session-11.scope/container
5:cpuset:/user.slice/user-0.slice/session-11.scope/container
4:perf_event:/user.slice/user-0.slice/session-11.scope/container
3:pids:/user.slice/user-0.slice/session-11.scope/container
2:blkio:/user.slice/user-0.slice/session-11.scope/container
1:name=systemd:/user.slice/user-0.slice/session-11.scope/supervisor
```

Hey Alex, comment #6 was a bit unclear, so let me restate what we want to verify. Given the three order-dependent services A, B and mid from comment #6, we want the following to always work:

1. During a shutdown - when _all_ services are stopped at the same time - the stopping of service A should always take place after service B has fully stopped. So the test in comment #39 is not enough: you should perform a `systemctl reboot` of the node and verify that the logs show the services stopped in the right order.

2. Valentin added another good point in comment #27: once the node has been restarted, we want to ensure that the start/stop dependencies are still enforced by systemd. So doing a second `systemctl reboot` should also stop service A only once service B has fully stopped.
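As a hint for the check in step 1, the stop order across a reboot can be read back from the previous boot's journal, assuming journald is configured with persistent storage so that the log survives the reboot:

```
# -b -1 selects the previous boot; "Stopped service B." should appear
# before "Stopping service A..."
journalctl -b -1 -u service_a.service -u service_b.service | grep -E 'Stopp(ing|ed) service'
```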
Thank you Damien! Moving this bug to VERIFIED state according to Damien's testing.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1796