Bug 1832720
| Summary: | Stale neutron-haproxy-qdhcp containers after network delete in OVS-deployed RHOSP-16 | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Srinivas Atmakuri <satmakur> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Bernard Cafarelli <bcafarel> |
| Status: | CLOSED ERRATA | QA Contact: | Alex Katz <akatz> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 16.0 (Train) | CC: | amcleod, amoralej, amuller, bcafarel, bperkins, chrisw, mburns, njohnston, scohen |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 16.1 (Train on RHEL 8.2) | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-tripleo-heat-templates-11.3.2-0.20200603154815.1e9ce95.el8ost | Doc Type: | Bug Fix |
| Doc Text: | Before this update, stale `neutron-haproxy-qdhcp-*` containers remained after you deleted the related network. With this update, all related containers are cleaned up correctly when you delete a network. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-07-29 07:52:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (Srinivas Atmakuri, 2020-05-07 07:49:37 UTC)
From a quick check, this looks similar to bug #1816657 (for neutron-keepalived-qrouter sidecar containers), but I cannot reproduce it on 16.0.2 (the version used to test bug #1816657):

```
(overcloud) [stack@undercloud-0 ~]$ openstack network create test && openstack subnet create testsub --network test --subnet-range 172.24.4.0/24 --gateway 172.24.4.1
(overcloud) [stack@undercloud-0 ~]$ openstack subnet list --network
+--------------------------------------+---------+--------------------------------------+---------------+
| ID                                   | Name    | Network                              | Subnet        |
+--------------------------------------+---------+--------------------------------------+---------------+
| 19f5a241-245a-4935-bcf3-d1149b41ee18 | testsub | 35556fee-bb67-41f3-84b5-edca9a88d7b7 | 172.24.4.0/24 |
+--------------------------------------+---------+--------------------------------------+---------------+
```

On one controller:

```
[root@controller-0 neutron]# podman ps | grep qdhcp-35556fee-bb67-41f3-84b5-edca9a88d7b7
0c519ee9f606  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-dhcp-agent:20200416.1  /usr/sbin/dnsmasq...  5 minutes ago  Up 5 minutes ago  neutron-dnsmasq-qdhcp-35556fee-bb67-41f3-84b5-edca9a88d7b7
```

```
(overcloud) [stack@undercloud-0 ~]$ openstack network delete test
(overcloud) [stack@undercloud-0 ~]$ openstack subnet show 19f5a241-245a-4935-bcf3-d1149b41ee18
No Subnet found for 19f5a241-245a-4935-bcf3-d1149b41ee18
```

And on the controller (same on the two other nodes):

```
[root@controller-0 neutron]# podman ps | grep qdhcp-35556fee-bb67-41f3-84b5-edca9a88d7b7
[root@controller-0 neutron]#
```

So the subnet and the dnsmasq sidecar container are properly deleted. Deleting just the subnet also properly stops the sidecar container.

Which specific 16.0 version is this? The kill-script log would also be useful; it can be found in /var/log/containers/neutron/kill-script.log.

My bad, I just noticed I did not have haproxy sidecar containers in my test env.
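For context on the kill-script log referenced above: the script maps the signalled PID to its sidecar container by running awk over `/proc/<pid>/cgroup`. A minimal sketch of that extraction, using a hypothetical sample cgroup line (the field layout assumes podman's `libpod-<id>.scope` naming; the real file only exists on the controller node):

```shell
# Hypothetical example of the name=systemd line in /proc/<pid>/cgroup for a
# podman-managed process:
cgroup_line='1:name=systemd:/machine.slice/libpod-402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9.scope'

# The kill-script's awk splits the line on "-" and "." and takes field 3,
# which lands on the full container ID under the libpod-<id>.scope naming:
ct_id=$(printf '%s\n' "$cgroup_line" | awk 'BEGIN {FS="[-.]"} /name=/{print $3}')
echo "$ct_id"
```

The extracted ID is then passed to `podman inspect`, `podman stop`, and `podman rm`, as seen in the log traces.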
After enabling isolated metadata I can now see the issue, and it is indeed different from the keepalived sidecars of bug #1816657.

```
(overcloud) [stack@undercloud-0 ~]$ openstack network create test && openstack subnet create testsub --network test --subnet-range 172.24.4.0/24 --gateway 172.24.4.1
```

On a controller:

```
[root@controller-2 neutron]# podman ps | grep 87f55e6c-506d-4bb1-806b-3ede5b737be4
effcf247d9b6  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-dhcp-agent:20200416.1  /bin/bash -c HAPR...   24 seconds ago  Up 24 seconds ago  neutron-haproxy-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4
402df34a80d9  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-dhcp-agent:20200416.1  /usr/sbin/dnsmasq...  26 seconds ago  Up 26 seconds ago  neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4
```

And kill-script.log when running `openstack network delete test`:

```
+ SIG=9
+ PID=90994
++ ip netns identify 90994
+ NETNS=qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4
+ '[' xqdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 == x ']'
+ CLI='nsenter --net=/run/netns/qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 --preserve-credentials -m -t 1 podman'
+ '[' -f /proc/90994/cgroup ']'
++ awk 'BEGIN {FS="[-.]"} /name=/{print $3}' /proc/90994/cgroup
+ CT_ID=402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
++ nsenter --net=/run/netns/qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 --preserve-credentials -m -t 1 podman inspect -f '{{.Name}}' 402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
+ CT_NAME=neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4
+ case $SIG in
+ kill_container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
+ add_date 'Stopping container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 (402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9)'
++ date
+ echo 'Thu May 28 09:20:57 UTC 2020 Stopping container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 (402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9)'
Thu May 28 09:20:57 UTC 2020 Stopping container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 (402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9)
+ nsenter --net=/run/netns/qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 --preserve-credentials -m -t 1 podman stop 402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
+ add_date 'Deleting container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 (402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9)'
++ date
+ echo 'Thu May 28 09:20:58 UTC 2020 Deleting container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 (402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9)'
Thu May 28 09:20:58 UTC 2020 Deleting container neutron-dnsmasq-qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 (402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9)
+ nsenter --net=/run/netns/qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4 --preserve-credentials -m -t 1 podman rm 402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
402df34a80d9eb142806ee2c999544c4cd8d02fb44aa6b377dbe61eba4910db9
+ exec
+ SIG=9
+ PID=91173
++ ip netns identify 91173
+ NETNS=
+ '[' x == x ']'
+ add_date 'No network namespace detected, exiting'
++ date
+ echo 'Thu May 28 09:21:02 UTC 2020 No network namespace detected, exiting'
Thu May 28 09:21:02 UTC 2020 No network namespace detected, exiting
+ exit 1
+ exec
+ SIG=9
+ PID=91173
++ ip netns identify 91173
+ NETNS=
+ '[' x == x ']'
+ add_date 'No network namespace detected, exiting'
++ date
+ echo 'Thu May 28 09:21:02 UTC 2020 No network namespace detected, exiting'
Thu May 28 09:21:02 UTC 2020 No network namespace detected, exiting
+ exit 1
+ exec
```

The haproxy process is still running:

```
[root@controller-2 neutron]# ps 91173
  PID TTY      STAT   TIME COMMAND
91173 ?        S      0:00 /usr/sbin/haproxy -Ws -f /var/lib/neutron/ns-metadata-proxy/87f55e6c-506d-4bb1-806b-3ede5b737be4.conf
```

So the dnsmasq sidecar is stopped and the namespace is deleted, but the namespace can then no longer be used to delete the metadata proxy container.

I think this will be fixed by https://review.opendev.org/#/c/730831, which I am testing locally (https://review.opendev.org/#/c/730744 being the parent fix for bug #1816657).

With the patch applied:

```
+ SIG=9
+ PID=632420
++ ip netns identify 632420
+ NETNS=
+ '[' x == x ']'
+ CLI='nsenter --all --preserve-credentials -t 1 podman'
+ SIG=9
+ '[' -f /proc/632420/cgroup ']'
++ awk 'BEGIN {FS="[-.]"} /name=/{print $3}' /proc/632420/cgroup
+ CT_ID=2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
++ nsenter --all --preserve-credentials -t 1 podman inspect -f '{{.Name}}' 2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
+ CT_NAME=neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304
+ case $SIG in
+ kill_container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
+ add_date 'Stopping container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 (2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d)'
++ date
+ echo 'Thu May 28 11:11:38 UTC 2020 Stopping container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 (2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d)'
Thu May 28 11:11:38 UTC 2020 Stopping container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 (2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d)
+ nsenter --all --preserve-credentials -t 1 podman stop 2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
+ add_date 'Deleting container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 (2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d)'
++ date
+ echo 'Thu May 28 11:11:39 UTC 2020 Deleting container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 (2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d)'
Thu May 28 11:11:39 UTC 2020 Deleting container neutron-haproxy-qdhcp-5452b2e9-d6f7-4ce0-8430-6b48c4fb8304 (2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d)
+ nsenter --all --preserve-credentials -t 1 podman rm 2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
2f52e12e2f0b741e82da3ce15f0a978d04635bf55ea214c1772cbb402fe3173d
+ exec
```

And no container linked to the network ID remains on the controller node.

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text. If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148
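For reference, the behavioral change visible in the patched kill-script traces can be sketched as a branch on whether `ip netns identify` still returns a namespace. This is an illustrative rewrite of that logic, not the exact upstream patch (see review 730831); the `build_cli` helper name is hypothetical:

```shell
# Illustrative sketch (not the upstream patch): choose the nsenter prefix
# used to reach podman on the host, depending on whether the process's
# network namespace still exists.
build_cli() {
    local netns="$1"   # output of: ip netns identify "$PID"
    if [ -n "$netns" ]; then
        # Namespace still present: enter it (works for the dnsmasq
        # sidecar, whose netns is deleted last).
        echo "nsenter --net=/run/netns/$netns --preserve-credentials -m -t 1 podman"
    else
        # Namespace already gone (the haproxy sidecar case): instead of
        # logging "No network namespace detected, exiting" and leaving a
        # stale container, fall back to entering all of PID 1's
        # namespaces so podman is still reachable.
        echo "nsenter --all --preserve-credentials -t 1 podman"
    fi
}

build_cli ""
build_cli "qdhcp-87f55e6c-506d-4bb1-806b-3ede5b737be4"
```

With the fallback in place, the script can still resolve the container via `/proc/<pid>/cgroup` and run `podman stop`/`podman rm` even after the qdhcp namespace has been torn down.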