
Bug 2080811

Summary: [ML2/OVS] The neutron-keepalived-qrouter and neutron-haproxy-qrouter containers are not removed even when the corresponding router is deleted.
Product: Red Hat OpenStack
Reporter: yatanaka
Component: openstack-tripleo-heat-templates
Assignee: Slawek Kaplonski <skaplons>
Status: CLOSED ERRATA
QA Contact: Maor <mblue>
Severity: high
Docs Contact:
Priority: high
Version: 16.2 (Train)
CC: averdagu, beagles, chrisw, ekuris, gthiemon, mblue, mburns, scohen, skaplons, tvignaud, ykarel
Target Milestone: z5
Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Flags: skaplons: needinfo-
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20230717085025.1608f56.el8ost
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-08 19:18:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description yatanaka 2022-05-02 03:52:19 UTC
Description of problem:

The neutron-keepalived-qrouter and neutron-haproxy-qrouter containers are not removed even when the corresponding router is deleted.

~~~
(test) [stack@undercloud ~]$ openstack router delete a2dc8d5e-2806-4b94-8483-1515377c78b7

[root@overcloud-controller-2 ~]# podman ps -a  |grep a2dc8d5e-2806-4b94-8483-1515377c78b7
dc212cfbf236  undercloud.ctlplane.yamato.example.com:8787/rhosp-rhel8/openstack-neutron-l3-agent:16.2           /usr/sbin/keepali...  2 minutes ago      Exited (0) About a minute ago            neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7
a9b7500be6fd  undercloud.ctlplane.yamato.example.com:8787/rhosp-rhel8/openstack-neutron-l3-agent:16.2           /bin/bash -c HAPR...  2 minutes ago      Exited (143) About a minute ago          neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7
~~~

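Only SIGTERM (signal 15) was sent to stop the containers; SIGKILL was never sent.
As a result, the containers stay in the Exited state and are not removed.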

~~~
[root@overcloud-controller-2 ~]# grep a2dc8d5e-2806-4b94-8483-1515377c78b7 /var/log/containers/neutron/kill-script.log 
Mon May  2 03:42:16 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May  2 03:42:26 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
Mon May  2 03:42:36 UTC 2022 Sending signal '15' to neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (a9b7500be6fdf6f433b8c34d05fa63b04fac5539f5cf0fb46ac46026668a3723)
Mon May  2 03:42:38 UTC 2022 Sending signal '15' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)
~~~
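To make it easier to see which signals each container received, the log can be summarized per container. The following is only a minimal sketch: the regular expression is inferred from the four log lines quoted above, and the helper name `signals_by_container` is hypothetical. The real kill-script may write other message formats that this sketch simply ignores.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this report (an assumption,
# not taken from the kill-script source).
LINE_RE = re.compile(
    r"Sending signal '(?P<sig>[^']+)' to (?P<name>\S+) \((?P<cid>[0-9a-f]+)\)"
)


def signals_by_container(lines):
    """Return {container_name: [signal, ...]} in the order the lines appear."""
    result = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            result[match.group("name")].append(match.group("sig"))
    return dict(result)


# The four lines from /var/log/containers/neutron/kill-script.log shown above.
SAMPLE = [
    "Mon May  2 03:42:16 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)",
    "Mon May  2 03:42:26 UTC 2022 Sending signal 'HUP' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)",
    "Mon May  2 03:42:36 UTC 2022 Sending signal '15' to neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (a9b7500be6fdf6f433b8c34d05fa63b04fac5539f5cf0fb46ac46026668a3723)",
    "Mon May  2 03:42:38 UTC 2022 Sending signal '15' to neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7 (dc212cfbf2360fa852c8e7d1bc634764fd1be2053e2163f21f2e95625584de71)",
]

for name, sigs in signals_by_container(SAMPLE).items():
    print(name, sigs)
```

For the sample lines, neither container ever receives signal 9, which matches the observation that both are left behind in the Exited state.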

The following script either removes the containers or sends them a signal:
https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/neutron/kill-script

The above script is invoked from the following code:
  - https://github.com/openstack/neutron/blob/050273ca210e5d4a08d39bf7012b15e929844cf6/neutron/agent/linux/external_process.py#L128
  - https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/external_process.py#L102
  - https://github.com/openstack/neutron/blob/04e345f2d5788ccde1567084ca8d6a6e35e080fa/neutron/agent/metadata/driver.py#L250-L268
  - https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/keepalived.py#L457-L470

If I understand correctly, only SIGTERM is sent in the normal case, so the containers are never removed.

https://github.com/openstack/neutron/blob/stable/train/neutron/agent/linux/keepalived.py#L457-L470
~~~
    def disable(self):
        self.process_monitor.unregister(uuid=self.resource_id,
                                        service_name=KEEPALIVED_SERVICE_NAME)

        pm = self.get_process()
        pm.disable(sig=str(int(signal.SIGTERM)))  # (*) sends SIGTERM
        try:
            utils.wait_until_true(lambda: not pm.active,
                                  timeout=SIGTERM_TIMEOUT)
        except utils.WaitTimeout:
            LOG.warning('Keepalived process %s did not finish after SIGTERM '
                        'signal in %s seconds, sending SIGKILL signal',
                        pm.pid, SIGTERM_TIMEOUT)
            # (*) SIGKILL removes the container, but this line is not reached
            # when the process exits after SIGTERM.
            pm.disable(sig=str(int(signal.SIGKILL)))
~~~
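The control flow above can be illustrated with a small simulation. This is a sketch under stated assumptions, not neutron code: `FakeProcessManager` and `disable_keepalived` are hypothetical stand-ins, and the `wait_until_true` timeout is replaced by a simple check of `pm.active`.

```python
import signal


class FakeProcessManager:
    """Hypothetical stand-in for neutron's process manager; records signals."""

    def __init__(self, exits_on_sigterm):
        self.exits_on_sigterm = exits_on_sigterm
        self.active = True
        self.sent = []

    def disable(self, sig):
        self.sent.append(sig)
        # SIGKILL always stops the process; SIGTERM only if it cooperates.
        if sig == str(int(signal.SIGKILL)) or self.exits_on_sigterm:
            self.active = False


def disable_keepalived(pm):
    """Mirror the excerpt's flow: SIGTERM first, SIGKILL only as a fallback."""
    pm.disable(sig=str(int(signal.SIGTERM)))
    if pm.active:  # stands in for the wait_until_true() timeout
        pm.disable(sig=str(int(signal.SIGKILL)))
    return pm.sent


print(disable_keepalived(FakeProcessManager(exits_on_sigterm=True)))   # ['15']
print(disable_keepalived(FakeProcessManager(exits_on_sigterm=False)))  # ['15', '9']
```

When the process exits cleanly after SIGTERM, signal 9 is never issued, so the path that removes the container is skipped.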





Version-Release number of selected component (if applicable):

RHOSP 16.2.1 (my customer's env)
RHOSP 16.2.2 (my lab env)


How reproducible:

Steps to Reproduce:
1. Create a router
  $ openstack router create router1
2. Add a subnet
  $ openstack router add subnet router1 subnet1
3. Remove the subnet
  $ openstack router remove subnet router1 subnet1
4. Delete the router
  $ openstack router delete router1
5. Log in to a Controller node
6. Check that the containers corresponding to the deleted router still remain
  $ sudo podman ps -a  |grep a2dc8d5e-2806-4b94-8483-1515377c78b7
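The check in step 6 can also be scripted. The sketch below assumes the container names have already been collected (for example from the Names column of `podman ps -a`) and that sidecar names follow the two prefixes seen in this report. The helper name `leftover_sidecars` is hypothetical.

```python
# Name prefixes taken from the container names shown in this report.
SIDECAR_PREFIXES = ("neutron-keepalived-qrouter-", "neutron-haproxy-qrouter-")


def leftover_sidecars(container_names, router_id):
    """Return the sidecar containers that still exist for the given router."""
    expected = {prefix + router_id for prefix in SIDECAR_PREFIXES}
    return [name for name in container_names if name in expected]


names = [
    "neutron-keepalived-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7",
    "neutron-haproxy-qrouter-a2dc8d5e-2806-4b94-8483-1515377c78b7",
    "some-other-container",  # hypothetical unrelated container
]
print(leftover_sidecars(names, "a2dc8d5e-2806-4b94-8483-1515377c78b7"))
```

An empty result after the router is deleted would mean that both sidecar containers were cleaned up.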


Actual results:

The containers remain (in the Exited state)


Expected results:

The containers are removed


Additional info:

The following bugs are similar to this one, but they were reported against different versions.

  - https://bugzilla.redhat.com/show_bug.cgi?id=1839071
  - https://bugzilla.redhat.com/show_bug.cgi?id=1816657

Comment 1 Takashi Kajinami 2022-05-03 06:19:38 UTC
IIUC, this is expected behavior: these orphaned containers are deleted when a new sidecar container is started.
 https://github.com/openstack/puppet-tripleo/blob/stable/train/templates/neutron/keepalived.epp#L36-L45
 https://github.com/openstack/puppet-tripleo/blob/stable/train/templates/neutron/haproxy.epp#L37-L46

Comment 6 Lon Hohberger 2023-01-10 11:33:03 UTC
According to our records, this should be resolved by openstack-tripleo-heat-templates-11.6.1-2.20221010235135.el8ost.  This build is available now.

Comment 24 errata-xmlrpc 2023-11-08 19:18:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6307