Bug 1993364

Summary: cluster destruction fails to remove router in BYON with Kuryr as primary network (even after BZ 1940159 got fixed)
Product: OpenShift Container Platform
Reporter: Emilien Macchi <emacchi>
Component: Installer
Assignee: Maysa Macedo <mdemaced>
Installer sub component: OpenShift on OpenStack
QA Contact: Jon Uriarte <juriarte>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: dahernan, mdemaced
Version: 4.9
Target Milestone: ---
Target Release: 4.10.0
Hardware: All
OS: All
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-10 16:05:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 2004569

Description Emilien Macchi 2021-08-12 20:28:20 UTC

$ openshift-install version

OpenStack / IPI / Kuryr / BYON (using a pre-provisioned router and network for machines)

What happened?

Even after BZ 1940159, we can still observe that subnets are not removed from a router when that router was created by an operator (not the installer). This causes the openshift-install destroy command to fail, since not all resources are removed in time:


And later, when the BYON router is to be removed:

Comment 1 Emilien Macchi 2021-08-16 00:40:53 UTC
How to reproduce it:

* Deploy OCP (e.g. 4.9), with networkType set to Kuryr
* In install-config, use a pre-provisioned network & router (using machinesSubnet)

After a successful deployment, run the destroy command.
The command will finish but won't actually remove the subnets from the pre-provisioned router, and therefore won't remove the networks.

This leaves a lot of unused resources on the OpenStack cloud.
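
The leftover state described above can be inspected by listing the interface ports still attached to the pre-provisioned router after the destroy command returns. This is only a minimal sketch, assuming the standard `openstack` CLI with credentials loaded in the environment. The function name `router_interfaces` is hypothetical, and routers in DVR or HA deployments may use a device owner other than `network:router_interface`.

```shell
# Hypothetical helper (not part of the installer): list the interface ports
# that are still attached to a router. Each line of output is one port,
# i.e. one subnet that is still plugged into the router.
router_interfaces() {
    router="$1"    # name or ID of the pre-provisioned (BYON) router
    openstack port list --router "$router" \
        --device-owner network:router_interface \
        -f value -c ID -c "Fixed IP Addresses"
}

# Usage (requires OpenStack credentials):
#   router_interfaces <router name or ID>
```

If the bug is present, ports belonging to cluster-created subnets would still show up here after the destroy; only the interface for the pre-created machines subnet is expected to remain.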

Comment 2 Emilien Macchi 2021-08-16 00:55:53 UTC
FYI I'm using https://github.com/openshift/release/pull/21092 to reproduce it in CI.

Comment 35 Jon Uriarte 2021-09-15 14:45:22 UTC
Verified in 4.10.0-0.nightly-2021-09-15-031230 on top of OSP 16.1.6 (RHOS-16.1-RHEL-8-20210818.n.0).

# Create a network, a subnet and a router
openstack network create net1
openstack subnet create subnet1 --network net1 --subnet-range --allocation-pool start=,end= --dns-nameserver
openstack router create router1
openstack router add subnet router1 subnet1
openstack router set router1 --external-gateway nova

# Create a port in net1 to be attached to the installer_host
openstack port create --network net1 --security-group installer_host-sg --host <installer_host ID>

# Update the install-config:

  - cidr:   
    hostSubnetLength: 9
  type: "Kuryr"|"OpenShiftSDN"
    cloud:            "shiftstack"
    externalNetwork:  ""
    region:           "regionOne"
    computeFlavor:    "m4.xlarge"
    machinesSubnet: d2257ad6-b68a-4a6c-ae55-7eefe5d7e03d
    apiVIP: ""
    ingressVIP: ""

# Update the /etc/hosts with entries for:
api.ostest.shiftstack.com
oauth-openshift.apps.ostest.shiftstack.com
console-openshift-console.apps.ostest.shiftstack.com
downloads-openshift-console.apps.ostest.shiftstack.com
canary-openshift-ingress-canary.apps.ostest.shiftstack.com
alertmanager-main-openshift-monitoring.apps.ostest.shiftstack.com
grafana-openshift-monitoring.apps.ostest.shiftstack.com
prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com
thanos-querier-openshift-monitoring.apps.ostest.shiftstack.com

Run the installer with Kuryr:
$ openshift-install create cluster --dir=ostest

INFO Install complete!

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-09-15-031230   True        False         5m47s   Cluster version is 4.10.0-0.nightly-2021-09-15-031230

Create a LB type svc:
oc new-project test1-ns
oc create deployment test1-dep --image=quay.io/kuryr/demo
oc scale deployments/test1-dep --replicas=2
oc expose deployment test1-dep --name test1-svc --type=LoadBalancer --port 80 --target-port=8080

Check the svc creation in OCP (it needs to take a fip from OSP):
$ oc get svc -o wide
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)        AGE     SELECTOR
test1-svc   LoadBalancer   80:32512/TCP   4m18s   app=test1-dep

Check the LB and fip creation in OSP:
$ openstack loadbalancer list | grep test
| 6888f5bb-eb07-407f-8f77-cd729a039511 | test1-ns/test1-svc                                                           | 5b27b448f1964691905a79b758eb4f6a | | ACTIVE              | ovn      |

$ openstack floating ip list
| ID                                   | Floating IP Address | Fixed IP Address | Port                                 | Floating Network                     | Project                          |
| e92a1528-950d-4144-a863-688f57956d45 |        |   | bf3aa82b-6d9a-41df-ad93-a356f42e9b26 | b1b1d60d-ebda-4999-a8fd-673fe8a83083 | 5b27b448f1964691905a79b758eb4f6a |

Destroy the cluster:
$ openshift-install destroy cluster --dir=ostest

INFO Time elapsed: 8m23s

Check all the installer-created subnets were removed:
$ openstack subnet list
| ID                                   | Name                  | Network                              | Subnet         |
| 1234e81f-bbb0-4888-bb24-8c066f5a69f1 | installer_host-subnet | 3316ac0d-34d7-48b2-b461-d4efa88a0c6b | |
| b28e0dad-2094-43bd-b5be-de900299c306 | restricted_subnet     | c299ca00-b7da-46a4-b23a-cf98fc8f5f47 |  |
| d2257ad6-b68a-4a6c-ae55-7eefe5d7e03d | subnet1               | 162aa054-a2e2-4600-a47e-5c737d3493b5 |  |

The above subnets are the pre-created ones so it's ok.
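
The manual comparison above could be scripted. The following is a rough sketch, not the procedure actually used for this verification: the function name `unexpected_subnets` is hypothetical, it assumes subnet names contain no spaces, and it takes the list of pre-created subnet names as arguments.

```shell
# Print every subnet whose name is NOT in the allowlist given as arguments.
# Input on stdin: one "ID Name" pair per line, for example the output of
#   openstack subnet list -f value -c ID -c Name
unexpected_subnets() {
    allowed=" $* "
    while read -r id name; do
        [ -n "$id" ] || continue               # skip blank lines
        if [ -n "$name" ]; then
            case "$allowed" in
                *" $name "*) continue ;;       # pre-created, expected to survive
            esac
        fi
        printf '%s %s\n' "$id" "$name"         # leftover from the cluster
    done
}

# Usage with the pre-created subnet names from this report:
#   openstack subnet list -f value -c ID -c Name |
#       unexpected_subnets installer_host-subnet restricted_subnet subnet1
```

Empty output means only the pre-created subnets remain.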

Check the fip for the LB type svc has been removed as well.
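
One possible way to script the floating IP check, sketched under the assumption that the FIP ID was noted before the destroy (e92a1528-950d-4144-a863-688f57956d45 in the listing earlier in this comment). The helper name `fip_removed` is hypothetical.

```shell
# Succeeds (exit status 0) when the given floating IP ID does not appear
# in the list of IDs read from stdin, one ID per line.
fip_removed() {
    ! grep -Fxq -- "$1"
}

# Usage:
#   openstack floating ip list -f value -c ID |
#       fip_removed e92a1528-950d-4144-a863-688f57956d45 && echo "fip removed"
```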

Comment 36 ShiftStack Bugwatcher 2021-11-25 16:12:06 UTC
Removing the Triaged keyword because:

* the QE automation assessment (flag qe_test_coverage) is missing

Comment 39 errata-xmlrpc 2022-03-10 16:05:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.