Bug 2020664

Summary: DOWN subports are not cleaned up
Product: OpenShift Container Platform
Component: Networking
Networking sub component: kuryr
Reporter: Maysa Macedo <mdemaced>
Assignee: Maysa Macedo <mdemaced>
QA Contact: Itay Matza <imatza>
CC: imatza, itbrown, rlobillo
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Version: 4.9
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Type: Bug
Last Closed: 2022-03-10 16:25:33 UTC
Bug Blocks: 2023731

Description Maysa Macedo 2021-11-05 14:40:21 UTC
Description of problem:

Due to an OVN bug on OSP 16.1.6, subports can end up in DOWN status. Kuryr will not reuse these subports, even after a re-population has happened upon controller restart. We need to make sure those ports are cleaned up by detaching and deleting them.
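
For illustration only, the detach-and-delete cleanup described above could be done by hand roughly as follows. This is a minimal sketch, not the Kuryr fix itself: the helper names down_port_ids and cleanup_subports are hypothetical, the trunk ID has to be supplied by the caller, and the parsing assumes the `-f value` column layout (ID first, status last) used elsewhere in this report.

```shell
# Sketch only: manually detach and delete subports stuck in DOWN.
# Assumes the `openstack` CLI with the network trunk commands is installed.

# Read `openstack port list -f value` rows on stdin and print the ID of
# every row whose last column is DOWN (ID is assumed to be the first column).
down_port_ids() {
    awk '$NF == "DOWN" { print $1 }'
}

# Read port IDs on stdin; detach each one from the given trunk, then delete it.
# The trunk ID argument is an assumption: the caller must know the owning trunk.
cleanup_subports() {
    trunk_id="$1"
    while read -r port_id; do
        [ -n "$port_id" ] || continue
        # </dev/null keeps the CLI from consuming the ID list on stdin.
        openstack network trunk unset --subport "$port_id" "$trunk_id" </dev/null &&
            openstack port delete "$port_id" </dev/null
    done
}
```

A possible invocation, with TRUNK_ID standing in for the trunk that owns the subports:

	openstack port list --device-owner "trunk:subport" -f value | down_port_ids | cleanup_subports "$TRUNK_ID"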

Comment 8 Itay Matza 2021-11-15 14:32:34 UTC
Verified in OCP 4.10.0-0.nightly-2021-11-09-181140 on top of OSP 16.1.6 (RHOS-16.1-RHEL-8-20210506.n.1) with Kuryr:

- Confirm kuryr pods are up and running:

	(shiftstack) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr
	NAME                               READY   STATUS    RESTARTS     AGE
	kuryr-cni-gpznq                    1/1     Running   0            4d
	kuryr-cni-k4nht                    1/1     Running   0            4d
	kuryr-cni-kdbqr                    1/1     Running   1 (4d ago)   4d
	kuryr-cni-t744c                    1/1     Running   0            4d
	kuryr-cni-ts7dv                    1/1     Running   0            4d
	kuryr-controller-dd978fd98-f69mf   1/1     Running   0            3d22h


- Pick one subport:

	(shiftstack) [stack@undercloud-0 ~]$ openstack port list --device-owner "trunk:subport" -f value | tail -1
	fd6b8679-0dd9-470b-a2d1-71574a16c7f6  fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] ACTIVE


- Connect to the database and move the port to DOWN status:

	> update ports set status='DOWN' where id='fd6b8679-0dd9-470b-a2d1-71574a16c7f6';


- Confirm that the port appears as DOWN:

	(shiftstack) [stack@undercloud-0 ~]$ openstack port list --device-owner "trunk:subport" -f value | tail -1
	fd6b8679-0dd9-470b-a2d1-71574a16c7f6  fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] DOWN


- Restart kuryr-controller:

	(shiftstack) [stack@undercloud-0 ~]$ oc delete pod -n openshift-kuryr -l app=kuryr-controller
	pod "kuryr-controller-dd978fd98-f69mf" deleted


- Wait until the pod is up again:

	(shiftstack) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr
	NAME                               READY   STATUS    RESTARTS     AGE
	kuryr-cni-gpznq                    1/1     Running   0            4d1h
	kuryr-cni-k4nht                    1/1     Running   0            4d
	kuryr-cni-kdbqr                    1/1     Running   1 (4d ago)   4d1h
	kuryr-cni-t744c                    1/1     Running   0            4d1h
	kuryr-cni-ts7dv                    1/1     Running   0            4d
	kuryr-controller-dd978fd98-f8xdz   0/1     Running   0            55s
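
Instead of re-running `oc get pods` until the controller shows 1/1, the wait can be scripted. A small sketch, assuming the same namespace and label as the delete command above; the function name wait_kuryr_controller and the default 300s timeout are hypothetical choices, not values from this report.

```shell
# Sketch: block until the restarted kuryr-controller pod reports Ready.
# The optional first argument is the timeout passed to `oc wait`.
wait_kuryr_controller() {
    oc wait pod -n openshift-kuryr -l app=kuryr-controller \
        --for=condition=Ready --timeout="${1:-300s}"
}
```

The function returns the exit status of `oc wait`, so it is non-zero if the pod is not Ready before the timeout.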


- After ~15 minutes, the port is gone and no subport is left in DOWN status:

	(shiftstack) [stack@undercloud-0 ~]$ date; openstack port list --device-owner "trunk:subport" -f value | grep DOWN
	Mon Nov 15 04:06:14 EST 2021
	fd6b8679-0dd9-470b-a2d1-71574a16c7f6  fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] DOWN

	(shiftstack) [stack@undercloud-0 ~]$ date; openstack port list --device-owner "trunk:subport" -f value | grep DOWN
	Mon Nov 15 04:06:48 EST 2021

	(shiftstack) [stack@undercloud-0 ~]$ openstack port show fd6b8679-0dd9-470b-a2d1-71574a16c7f6
	No Port found for fd6b8679-0dd9-470b-a2d1-71574a16c7f6
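
The manual `date; ... | grep DOWN` checks above can be turned into a polling loop. The following is only a sketch: wait_no_down_subports and its two arguments are hypothetical, and the 1200-second default is an assumption based on the roughly 15 minutes observed here.

```shell
# Sketch only: poll until no subport is listed as DOWN, or give up.
# $1 = overall limit in seconds (default 1200), $2 = poll interval (default 30).
# Caveat: if the `openstack` call itself fails, grep sees no output and the
# loop reports success, so check CLI credentials before relying on it.
wait_no_down_subports() {
    limit="${1:-1200}"
    interval="${2:-30}"
    elapsed=0
    while [ "$elapsed" -le "$limit" ]; do
        if ! openstack port list --device-owner "trunk:subport" -f value |
                grep -q 'DOWN[[:space:]]*$'; then
            echo "no DOWN subports after ${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "DOWN subports still present after ${limit}s" >&2
    return 1
}
```

Calling `wait_no_down_subports` with no arguments polls every 30 seconds for up to 20 minutes and returns non-zero if a DOWN subport is still listed at the end.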

Comment 11 errata-xmlrpc 2022-03-10 16:25:33 UTC
Since the problem described in this bug report should be resolved in a
recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform
4.10.3 security update), and where to find the updated files, follow the
link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056