Description of problem:
Due to an OVN bug on OSP 16.1.6, it is possible to have subports in DOWN status that Kuryr will not reuse, even after a re-population has happened upon controller restart. We need to make sure those ports are cleaned up by detaching and deleting them.
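The cleanup requested above (detach, then delete, every subport stuck in DOWN) could be sketched as follows. This is only an illustration of the selection and ordering logic, not Kuryr's actual implementation: the function name `plan_subport_cleanup` and the plain-dict port representation are assumptions made for this example, and the steps it returns would still have to be executed against Neutron by the caller.

```python
from typing import Iterable, List, Tuple

# Device owner that Neutron assigns to trunk subports (as used in the
# "openstack port list --device-owner" calls in this report).
SUBPORT_OWNER = "trunk:subport"


def plan_subport_cleanup(ports: Iterable[dict]) -> List[Tuple[str, str]]:
    """Return ordered (action, port_id) steps for stale subports.

    Hypothetical helper: a port is treated as stale when it is a trunk
    subport and its status is DOWN. Each stale port is detached from its
    trunk first and deleted afterwards.
    """
    steps: List[Tuple[str, str]] = []
    for port in ports:
        if port.get("device_owner") != SUBPORT_OWNER:
            continue  # not a subport, leave it alone
        if port.get("status") != "DOWN":
            continue  # healthy subport, can still be reused
        steps.append(("detach", port["id"]))
        steps.append(("delete", port["id"]))
    return steps
```

Detaching before deleting is the order named in the description, so the sketch keeps the two steps for each port adjacent and in that order.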
Verified in OCP 4.10.0-0.nightly-2021-11-09-181140 on top of OSP 16.1.6 (RHOS-16.1-RHEL-8-20210506.n.1) with Kuryr:

- Confirm the kuryr pods are up and running:

    (shiftstack) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr
    NAME                               READY   STATUS    RESTARTS      AGE
    kuryr-cni-gpznq                    1/1     Running   0             4d
    kuryr-cni-k4nht                    1/1     Running   0             4d
    kuryr-cni-kdbqr                    1/1     Running   1 (4d ago)    4d
    kuryr-cni-t744c                    1/1     Running   0             4d
    kuryr-cni-ts7dv                    1/1     Running   0             4d
    kuryr-controller-dd978fd98-f69mf   1/1     Running   0             3d22h

- Pick one subport:

    (shiftstack) [stack@undercloud-0 ~]$ openstack port list --device-owner "trunk:subport" -f value | tail -1
    fd6b8679-0dd9-470b-a2d1-71574a16c7f6 fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] ACTIVE

- Connect to the database and move the port to DOWN status:

    > update ports set status='DOWN' where id='fd6b8679-0dd9-470b-a2d1-71574a16c7f6';

- Confirm that the port appears as DOWN:

    (shiftstack) [stack@undercloud-0 ~]$ openstack port list --device-owner "trunk:subport" -f value | tail -1
    fd6b8679-0dd9-470b-a2d1-71574a16c7f6 fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] DOWN

- Restart kuryr-controller:

    (shiftstack) [stack@undercloud-0 ~]$ oc delete pod -n openshift-kuryr -l app=kuryr-controller
    pod "kuryr-controller-dd978fd98-f69mf" deleted

- Wait until the pod is up again:

    (shiftstack) [stack@undercloud-0 ~]$ oc get pods -n openshift-kuryr
    NAME                               READY   STATUS    RESTARTS      AGE
    kuryr-cni-gpznq                    1/1     Running   0             4d1h
    kuryr-cni-k4nht                    1/1     Running   0             4d
    kuryr-cni-kdbqr                    1/1     Running   1 (4d ago)    4d1h
    kuryr-cni-t744c                    1/1     Running   0             4d1h
    kuryr-cni-ts7dv                    1/1     Running   0             4d
    kuryr-controller-dd978fd98-f8xdz   0/1     Running   0             55s

- After ~15 minutes, the port is gone and no subport remains in DOWN status:

    (shiftstack) [stack@undercloud-0 ~]$ date; openstack port list --device-owner "trunk:subport" -f value | grep DOWN
    Mon Nov 15 04:06:14 EST 2021
    fd6b8679-0dd9-470b-a2d1-71574a16c7f6 fa:16:3e:ff:9a:bd [{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', 'ip_address': '10.128.92.8'}] DOWN

    (shiftstack) [stack@undercloud-0 ~]$ date; openstack port list --device-owner "trunk:subport" -f value | grep DOWN
    Mon Nov 15 04:06:48 EST 2021

    (shiftstack) [stack@undercloud-0 ~]$ openstack port show fd6b8679-0dd9-470b-a2d1-71574a16c7f6
    No Port found for fd6b8679-0dd9-470b-a2d1-71574a16c7f6
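The manual `grep DOWN` check in the last step can also be scripted. The sketch below is one possible way to do it, assuming the `-f value` line layout shown in this comment, where the port ID is the first whitespace-separated field and the status is the last; the function name `down_port_ids` is made up for this example.

```python
from typing import List


def down_port_ids(port_list_output: str) -> List[str]:
    """Return the IDs of ports whose last field is DOWN.

    Assumes one port per line, with the ID as the first field and the
    status as the last field, as in the
    `openstack port list --device-owner "trunk:subport" -f value`
    output captured in this report.
    """
    ids: List[str] = []
    for line in port_list_output.splitlines():
        fields = line.split()
        # Skip blank or malformed lines; compare the status exactly.
        if len(fields) >= 2 and fields[-1] == "DOWN":
            ids.append(fields[0])
    return ids


# Sample line copied from the verification output above.
sample = (
    "fd6b8679-0dd9-470b-a2d1-71574a16c7f6 fa:16:3e:ff:9a:bd "
    "[{'subnet_id': 'a8309c8e-6a30-4540-be6c-fa7b6afd1021', "
    "'ip_address': '10.128.92.8'}] DOWN"
)
print(down_port_ids(sample))
```

Feeding the function the captured output of the `openstack port list` command at intervals and waiting for an empty list would reproduce the check done by hand here. Matching on the last field rather than on a substring avoids false hits if "DOWN" ever appears elsewhere in a line.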
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056