Description of problem:
The neutron-keepalived-qrouter and neutron-haproxy-qrouter containers are not removed even after the corresponding router is deleted.
~~~
(test) [stack@undercloud ~]$ openstack router delete a2dc8d5e-2806-4b94-8483-1515377c78b7
[root@overcloud-controller-2 ~]# podman ps -a |grep a2dc8d5e-2806-4b94-8483-1515377c78b7
dc212cfbf236 undercloud.ctlplane.yamato.example.com:8787/rhosp-rhel8/openstack-neutron-l3-agent:16.2 /usr/sbin/keepali... 2 minutes ago Exited (0) About a minute ago neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7
a9b7500be6fd undercloud.ctlplane.yamato.example.com:8787/rhosp-rhel8/openstack-neutron-l3-agent:16.2 /bin/bash -c HAPR... 2 minutes ago Exited (143) About a minute ago neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7
~~~
The kill-script log shows that only HUP and SIGTERM (15) were sent. SIGKILL (9) was never sent, so the containers were not removed.
~~~
[root@overcloud-controller-2 ~]# grep a2dc8d5e-2806-4b94-8483-1515377c78b7 /var/log/containers/neutron/kill-script.log
Mon May 2 03:42:16 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May 2 03:42:26 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May 2 03:42:36 UTC 2022 Sending signal '15' to neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (a9b7500be6fdf6f433b8c34d05fa63b04fac5539f5cf0fb46ac46026668a3723)
Mon May 2 03:42:38 UTC 2022 Sending signal '15' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
~~~
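To check the same thing on other routers, the log lines above can be summarized per container. This is a minimal sketch that assumes the line format shown in this report ("Sending signal '<sig>' to <name> (<container id>)"). The function names are hypothetical and are not part of neutron or tripleo-heat-templates.

```python
import re
from collections import defaultdict

# Matches the kill-script.log lines quoted in this report, for example:
#   ... Sending signal '15' to neutron-haproxy-qrouter-<uuid> (<container id>)
LINE_RE = re.compile(
    r"Sending signal '(?P<sig>[^']+)' to (?P<name>\S+) \((?P<cid>[0-9a-f]+)\)"
)

# The four log lines from this report.
SAMPLE = """\
Mon May 2 03:42:16 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May 2 03:42:26 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May 2 03:42:36 UTC 2022 Sending signal '15' to neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (a9b7500be6fdf6f433b8c34d05fa63b04fac5539f5cf0fb46ac46026668a3723)
Mon May 2 03:42:38 UTC 2022 Sending signal '15' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
""".splitlines()


def signals_by_container(lines):
    """Return {container name: [signals, in log order]}."""
    result = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            result[match.group("name")].append(match.group("sig"))
    return dict(result)


def never_killed(lines):
    """Return sorted names of containers that never received signal 9 / KILL."""
    return sorted(
        name
        for name, sigs in signals_by_container(lines).items()
        if "9" not in sigs and "KILL" not in sigs
    )


if __name__ == "__main__":
    for name in never_killed(SAMPLE):
        print(name)
```

Run against the sample, both containers for router a2dc8d5e-2806-4b94-8483-1515377c78b7 are listed as never having received signal 9.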
The following script either removes the containers or sends them a signal:
https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/neutron/kill-script
The above script is invoked by the following code:
- https://github.com/openstack/neutron/blob/050273ca210e5d4a08d39bf7012b15e929844cf6/neutron/agent/linux/external_process.py#L128
- https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/external_process.py#L102
- https://github.com/openstack/neutron/blob/04e345f2d5788ccde1567084ca8d6a6e35e080fa/neutron/agent/metadata/driver.py#L250-L268
- https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/keepalived.py#L457-L470
If I understand correctly, only SIGTERM will be sent and the containers are not removed.
https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/keepalived.py#L457-L470
~~~
    def disable(self):
        self.process_monitor.unregister(uuid=self.resource_id,
                                        service_name=KEEPALIVED_SERVICE_NAME)
        pm = self.get_process()
        pm.disable(sig=str(int(signal.SIGTERM)))  # <== (*) send SIGTERM
        try:
            utils.wait_until_true(lambda: not pm.active,
                                  timeout=SIGTERM_TIMEOUT)
        except utils.WaitTimeout:
            LOG.warning('Keepalived process %s did not finish after SIGTERM '
                        'signal in %s seconds, sending SIGKILL signal',
                        pm.pid, SIGTERM_TIMEOUT)
            # <== (*) remove container. But this is not called if the
            #     SIGTERM was sent correctly.
            pm.disable(sig=str(int(signal.SIGKILL)))
~~~
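The control flow above can be reduced to a small simulation. This is only a sketch of the reporter's reading, not neutron code: `FakeProcessManager` and `disable_sketch` are hypothetical stand-ins, the timeout wait is replaced by a simple check of `active`, and it assumes a Linux host where SIGTERM is 15 and SIGKILL is 9.

```python
import signal


class FakeProcessManager:
    """Hypothetical stand-in for the process manager; records signals only."""

    def __init__(self, exits_on_sigterm):
        self.exits_on_sigterm = exits_on_sigterm
        self.active = True
        self.sent = []

    def disable(self, sig):
        # Record the signal instead of invoking the kill-script.
        self.sent.append(sig)
        if sig == str(int(signal.SIGKILL)) or self.exits_on_sigterm:
            self.active = False


def disable_sketch(pm):
    """Mirror the SIGTERM-then-SIGKILL sequence of disable() shown above."""
    pm.disable(sig=str(int(signal.SIGTERM)))
    if pm.active:  # stands in for wait_until_true() raising WaitTimeout
        pm.disable(sig=str(int(signal.SIGKILL)))
    return pm.sent


if __name__ == "__main__":
    # Process exits promptly on SIGTERM: the SIGKILL branch is never reached.
    print(disable_sketch(FakeProcessManager(exits_on_sigterm=True)))
    # Process ignores SIGTERM: SIGKILL is sent as a fallback.
    print(disable_sketch(FakeProcessManager(exits_on_sigterm=False)))
```

If container removal is tied to signal 9, as described in this report, then a keepalived process that exits cleanly on SIGTERM never reaches the branch that would remove its container.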
Version-Release number of selected component (if applicable):
RHOSP 16.2.1 (my customer's env)
RHOSP 16.2.2 (my lab env)
How reproducible:
Steps to Reproduce:
1. Create a router
$ openstack router create router1
2. Add a subnet
$ openstack router add subnet router1 subnet1
3. Remove the subnet
$ openstack router remove subnet router1 subnet1
4. Delete the router
$ openstack router delete router1
5. Log in to a Controller node
6. Confirm that the containers corresponding to the deleted router still remain
$ sudo podman ps -a |grep a2dc8d5e-2806-4b94-8483-1515377c78b7
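Step 6 can also be checked for every router at once rather than by grepping for one ID. The sketch below is a hypothetical helper: it assumes the container naming seen in this report (neutron-keepalived-qrouter-<router id> and neutron-haproxy-qrouter-<router id>), and that the caller supplies the container names (for example from `podman ps -a --format '{{.Names}}'`) and the set of router IDs that still exist.

```python
import re

# Container naming as seen in this report; other deployments may differ.
SIDECAR_RE = re.compile(
    r"neutron-(?:keepalived|haproxy)-qrouter-"
    r"(?P<router_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def leftover_containers(container_names, live_router_ids):
    """Return sidecar container names whose router ID is not in live_router_ids."""
    leftovers = []
    for name in container_names:
        match = SIDECAR_RE.fullmatch(name.strip())
        if match and match.group("router_id") not in live_router_ids:
            leftovers.append(name.strip())
    return sorted(leftovers)


if __name__ == "__main__":
    names = [
        "neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7",
        "neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7",
    ]
    # The router was deleted, so its ID is absent from the live set.
    for name in leftover_containers(names, live_router_ids=set()):
        print(name)
```

Containers whose names do not match the sidecar pattern are ignored, so the agent's own container is never reported.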
Actual results:
The containers remain
Expected results:
The containers are removed
Additional info:
The following bugs are similar to this one, but they are for a different version.
- https://bugzilla.redhat.com/show_bug.cgi?id=1839071
- https://bugzilla.redhat.com/show_bug.cgi?id=1816657
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2023:6307