2020664 – DOWN subports are not cleaned up

Summary: DOWN subports are not cleaned up

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	4.10.0
Assignee:	Maysa Macedo
QA Contact:	Itay Matza
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	2023731

Reported:	2021-11-05 14:40 UTC by Maysa Macedo
Modified:	2022-03-10 16:25 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Clones:	2023731
Environment:
Last Closed:	2022-03-10 16:25:33 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift kuryr-kubernetes pull 594	0	None	open	Bug 2020664: Ensure DOWN subports are cleaned up	2021-11-05 14:44:28 UTC
Red Hat Product Errata	RHSA-2022:0056	0	None	None	None	2022-03-10 16:25:52 UTC

Description Maysa Macedo 2021-11-05 14:40:21 UTC

Description of problem:

Due to an OVN bug on OSP 16.1.6, subports can end up in DOWN status. Kuryr does not reuse these subports, even after a re-population has happened upon controller restart. We need to make sure those ports are cleaned up by detaching them from their trunk and deleting them.
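
The intended cleanup can be sketched roughly as follows. This is only an illustration of the detach-then-delete idea, not the code from the linked pull request: the `Subport` record, the `NetworkClient` interface, and its `detach_subport` and `delete_port` methods are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass
class Subport:
    """Hypothetical record for a trunk subport (not a Kuryr or Neutron type)."""
    id: str
    status: str
    trunk_id: str


class NetworkClient(Protocol):
    """Hypothetical client interface; a real one would wrap the Neutron API."""

    def detach_subport(self, trunk_id: str, port_id: str) -> None: ...

    def delete_port(self, port_id: str) -> None: ...


def cleanup_down_subports(client: NetworkClient,
                          subports: Iterable[Subport]) -> List[str]:
    """Detach and delete every subport whose status is DOWN.

    Returns the IDs of the ports that were removed, in processing order.
    """
    removed = []
    for port in subports:
        if port.status != "DOWN":
            # ACTIVE subports stay available for reuse.
            continue
        # Detach from the trunk first, then delete the port itself.
        client.detach_subport(port.trunk_id, port.id)
        client.delete_port(port.id)
        removed.append(port.id)
    return removed
```

Detaching before deleting matches the order named in the description above; whether a given deployment requires that order is not something this sketch asserts.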

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 8 Itay Matza 2021-11-15 14:32:34 UTC

Verified in OCP 4.10.0-0.nightly-2021-11-09-181140 on top of OSP 16.1.6 (RHOS-16.1-RHEL-8-20210506.n.1) with Kuryr:

- Confirm kuryr pods are up and running:

	(shiftstack) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr                                                                                                                                         
	NAME                               READY   STATUS    RESTARTS     AGE
	kuryr-cni-gpznq                    1/1     Running   0            4d
	kuryr-cni-k4nht                    1/1     Running   0            4d
	kuryr-cni-kdbqr                    1/1     Running   1 (4d ago)   4d
	kuryr-cni-t744c                    1/1     Running   0            4d
	kuryr-cni-ts7dv                    1/1     Running   0            4d
	kuryr-controller-dd978fd98-f69mf   1/1     Running   0            3d22h


- Pick one subport:

	(shiftstack) [stack@undercloud-0 ~]$ openstack port list --device-owner "trunk:subport" -f value | tail -1
	fd6b8679-0dd9-470b-a2d1-71574a16c7f6  fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] ACTIVE


- Connect to the database and move the port to DOWN status:

	> update ports set status='DOWN' where id='fd6b8679-0dd9-470b-a2d1-71574a16c7f6';                                                                                                            


- Confirm that the port appears as DOWN:

	(shiftstack) [stack@undercloud-0 ~]$ openstack port list --device-owner "trunk:subport" -f value | tail -1
	fd6b8679-0dd9-470b-a2d1-71574a16c7f6  fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] DOWN


- Restart kuryr-controller:

	(shiftstack) [stack@undercloud-0 ~]$ oc delete pod -n openshift-kuryr -l app=kuryr-controller
	pod "kuryr-controller-dd978fd98-f69mf" deleted


- Wait until the pod is up again:

	(shiftstack) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr
	NAME                               READY   STATUS    RESTARTS     AGE
	kuryr-cni-gpznq                    1/1     Running   0            4d1h
	kuryr-cni-k4nht                    1/1     Running   0            4d
	kuryr-cni-kdbqr                    1/1     Running   1 (4d ago)   4d1h
	kuryr-cni-t744c                    1/1     Running   0            4d1h
	kuryr-cni-ts7dv                    1/1     Running   0            4d
	kuryr-controller-dd978fd98-f8xdz   0/1     Running   0            55s


- After ~15 minutes, the port has been deleted and no subport remains in DOWN status:

	(shiftstack) [stack@undercloud-0 ~]$ date; openstack port list --device-owner "trunk:subport" -f value | grep DOWN
	Mon Nov 15 04:06:14 EST 2021
	fd6b8679-0dd9-470b-a2d1-71574a16c7f6  fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] DOWN
	

	(shiftstack) [stack@undercloud-0 ~]$ date; openstack port list --device-owner "trunk:subport" -f value | grep DOWN
	Mon Nov 15 04:06:48 EST 2021

	(shiftstack) [stack@undercloud-0 ~]$ openstack port show fd6b8679-0dd9-470b-a2d1-71574a16c7f6
	No Port found for fd6b8679-0dd9-470b-a2d1-71574a16c7f6
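
The `grep DOWN` check above could also be scripted. A minimal sketch, assuming the listing is the text produced by `openstack port list --device-owner "trunk:subport" -f value`, with the port ID as the first whitespace-separated field and the status as the last, as in the output shown in this comment. The function name `down_port_ids` is made up for this example.

```python
from typing import List


def down_port_ids(listing: str) -> List[str]:
    """Return the IDs of ports whose last column is DOWN.

    Assumes one port per line, ID first and status last.
    """
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a DOWN port row.
        if len(fields) >= 2 and fields[-1] == "DOWN":
            ids.append(fields[0])
    return ids
```

An empty result corresponds to the second `grep DOWN` run above, which printed nothing after the date.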

Comment 11 errata-xmlrpc 2022-03-10 16:25:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
