Description of problem:

The neutron-keepalived-qrouter and neutron-haproxy-qrouter containers are not removed even after the corresponding router is deleted.

~~~
(test) [stack@undercloud ~]$ openstack router delete a2dc8d5e-2806-4b94-8483-1515377c78b7

[root@overcloud-controller-2 ~]# podman ps -a |grep a2dc8d5e-2806-4b94-8483-1515377c78b7
dc212cfbf236  undercloud.ctlplane.yamato.example.com:8787/rhosp-rhel8/openstack-neutron-l3-agent:16.2  /usr/sbin/keepali...  2 minutes ago  Exited (0) About a minute ago    neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7
a9b7500be6fd  undercloud.ctlplane.yamato.example.com:8787/rhosp-rhel8/openstack-neutron-l3-agent:16.2  /bin/bash -c HAPR...  2 minutes ago  Exited (143) About a minute ago  neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7
~~~

Only SIGTERM (signal 15) was sent when the router was deleted. The containers were not removed.

~~~
[root@overcloud-controller-2 ~]# grep a2dc8d5e-2806-4b94-8483-1515377c78b7 /var/log/containers/neutron/kill-script.log
Mon May 2 03:42:16 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May 2 03:42:26 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May 2 03:42:36 UTC 2022 Sending signal '15' to neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (a9b7500be6fdf6f433b8c34d05fa63b04fac5539f5cf0fb46ac46026668a3723)
Mon May 2 03:42:38 UTC 2022 Sending signal '15' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
~~~

The containers are either removed or sent a signal by the following script:
https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/neutron/kill-script

That script is invoked from the following code:
- https://github.com/openstack/neutron/blob/050273ca210e5d4a08d39bf7012b15e929844cf6/neutron/agent/linux/external_process.py#L128
- https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/external_process.py#L102
- https://github.com/openstack/neutron/blob/04e345f2d5788ccde1567084ca8d6a6e35e080fa/neutron/agent/metadata/driver.py#L250-L268
- https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/keepalived.py#L457-L470

If I understand correctly, only SIGTERM is sent in the normal case, so the containers are not removed.

https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/keepalived.py#L457-L470
~~~
    def disable(self):
        self.process_monitor.unregister(uuid=self.resource_id,
                                        service_name=KEEPALIVED_SERVICE_NAME)

        pm = self.get_process()
        pm.disable(sig=str(int(signal.SIGTERM)))  # <== (*) sends SIGTERM
        try:
            utils.wait_until_true(lambda: not pm.active,
                                  timeout=SIGTERM_TIMEOUT)
        except utils.WaitTimeout:
            LOG.warning('Keepalived process %s did not finish after SIGTERM '
                        'signal in %s seconds, sending SIGKILL signal',
                        pm.pid, SIGTERM_TIMEOUT)
            # <== (*) removes the container, but this line is not reached
            #     when the process exits after SIGTERM
            pm.disable(sig=str(int(signal.SIGKILL)))
~~~

Version-Release number of selected component (if applicable):
RHOSP 16.2.1 (my customer's env)
RHOSP 16.2.2 (my lab env)

How reproducible:

Steps to Reproduce:
1. Create a router
   $ openstack router create router1
2. Add a subnet
   $ openstack router add subnet router1 subnet1
3. Remove the subnet
   $ openstack router remove subnet router1 subnet1
4. Delete the router
   $ openstack router delete router1
5. Log in to a Controller node
6. The containers corresponding to the deleted router still remain
   $ sudo podman ps -a |grep a2dc8d5e-2806-4b94-8483-1515377c78b7

Actual results:
The containers remain.

Expected results:
The containers are removed.

Additional info:
The following bugs are similar to this one, but they were reported against a different version:
- https://bugzilla.redhat.com/show_bug.cgi?id=1839071
- https://bugzilla.redhat.com/show_bug.cgi?id=1816657
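
For reference, the control flow of the disable() method quoted in the description can be reduced to a small standalone sketch. It is not the neutron code: FakeProcess is a hypothetical stand-in for the process manager, and the `if pm.active` check stands in for utils.wait_until_true() raising WaitTimeout. It only illustrates that the SIGKILL branch, which is the one that removes the container, is skipped whenever the process exits on SIGTERM. It assumes a Linux host, where SIGTERM is 15 and SIGKILL is 9.

~~~python
import signal


class FakeProcess:
    """Hypothetical stand-in for the process manager; records signals sent."""

    def __init__(self, exits_on_sigterm):
        self.exits_on_sigterm = exits_on_sigterm
        self.active = True
        self.sent = []

    def disable(self, sig):
        # Record the signal the same way kill-script.log would show it.
        self.sent.append(sig)
        if sig == str(int(signal.SIGKILL)) or self.exits_on_sigterm:
            self.active = False


def disable(pm):
    """Send SIGTERM first; fall back to SIGKILL only if the process survives."""
    pm.disable(sig=str(int(signal.SIGTERM)))
    if pm.active:  # stands in for wait_until_true() timing out
        pm.disable(sig=str(int(signal.SIGKILL)))
    return pm.sent


well_behaved = disable(FakeProcess(exits_on_sigterm=True))
stuck = disable(FakeProcess(exits_on_sigterm=False))
~~~

In this sketch the well-behaved process only ever receives signal '15', which matches what kill-script.log shows for the deleted router.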
IIUC, this is expected behavior: these orphaned containers are deleted when a new sidecar container is started.

https://github.com/openstack/puppet-tripleo/blob/stable/train/templates/neutron/keepalived.epp#L36-L45
https://github.com/openstack/puppet-tripleo/blob/stable/train/templates/neutron/haproxy.epp#L37-L46
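
To illustrate what "deleting orphaned containers" amounts to, here is a rough sketch of the selection step only. It is an assumption about the general idea, not a transcription of the linked templates: find_orphans and the (name, status) tuples are hypothetical, and the "still-running-router" entry is made up for contrast. The two exited entries reuse the names and statuses from the podman output in the description.

~~~python
def find_orphans(containers, prefix):
    """Return names of exited containers whose name starts with the prefix."""
    return [
        name
        for name, status in containers
        if name.startswith(prefix) and status.startswith("Exited")
    ]


# (name, status) pairs as they might be read from `podman ps -a`.
containers = [
    ("neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7",
     "Exited (0) About a minute ago"),
    ("neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7",
     "Exited (143) About a minute ago"),
    # Hypothetical entry for a router that still exists.
    ("neutron-keepalived-qrouter-still-running-router", "Up 2 hours ago"),
]

keepalived_orphans = find_orphans(containers, "neutron-keepalived-")
haproxy_orphans = find_orphans(containers, "neutron-haproxy-")
~~~

With this kind of cleanup, the exited containers stay visible in `podman ps -a` until the next sidecar of the same type is launched on that node.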
According to our records, this should be resolved by openstack-tripleo-heat-templates-11.6.1-2.20221010235135.el8ost. This build is available now.