Bug 1993364 - cluster destruction fails to remove router in BYON with Kuryr as primary network (even after BZ 1940159 got fixed)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.9
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Maysa Macedo
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks: 2004569
Reported: 2021-08-12 20:28 UTC by Emilien Macchi
Modified: 2022-03-10 16:05 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:05:09 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 5191 | 0 | None | Merged | Bug 1993364: openstack/destroy: fix Kuryr/BYON | 2021-09-14 08:57:54 UTC
Github | openshift installer pull 5193 | 0 | None | Merged | Bug 1993364: Consider all Networks on Router clean up | 2021-09-14 08:57:55 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | Closed | JBCS Httpd 2.4.48 availability | 2022-05-10 16:57:14 UTC

Description Emilien Macchi 2021-08-12 20:28:20 UTC
Version:

$ openshift-install version
4.9

Platform:
OpenStack / IPI / Kuryr / BYON (using a pre-provisioned router and network for machines)

What happened?

Even after the fix for BZ 1940159, we can still observe that subnets are not removed from a router when that router was created by an operator (not by the installer). This causes the openshift-install destroy command to fail, because not all resources are removed in time:

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/21092/rehearse-21092-pull-ci-openshift-kuryr-kubernetes-master-e2e-openstack-kuryr/1425833083726204928/artifacts/e2e-openstack-kuryr/ipi-deprovision-deprovision/build-log.txt

It later also prevents the BYON router from being removed:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/21092/rehearse-21092-pull-ci-openshift-kuryr-kubernetes-master-e2e-openstack-kuryr/1425833083726204928/artifacts/e2e-openstack-kuryr/openstack-deprovision-machinesubnet/build-log.txt

Comment 1 Emilien Macchi 2021-08-16 00:40:53 UTC
How to reproduce it:

* Deploy OCP (e.g. 4.9), with networkType set to Kuryr
* In install-config, use a pre-provisioned network & router (using machinesSubnet)

After a successful deployment, run the destroy command.
The command will finish but won't actually remove the subnets from the pre-provisioned router, and therefore won't remove the networks.

This leaves a lot of unused resources on the OpenStack cloud.
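
One way to confirm this symptom after running destroy is to list the subnets still attached to the pre-provisioned router and flag any that are not the machines subnet. The sketch below is only an illustration: both function names are hypothetical, and it assumes that `openstack router show <router> -f json` reports the attached interfaces in an `interfaces_info` field whose entries carry a `subnet_id` key.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of openshift-install.

# Read router JSON on stdin and print one attached subnet ID per line.
# Assumption: interfaces appear as "subnet_id": "<uuid>" entries.
router_subnet_ids() {
    grep -o '"subnet_id": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Read subnet IDs on stdin and print those that differ from the
# pre-provisioned machines subnet ID passed as $1.
unexpected_subnets() {
    # grep exits non-zero when nothing is printed; that is not an error here.
    grep -Fxv -- "$1" || true
}
```

With the router and machines subnet used later in this report, the check could be run as:

$ openstack router show router1 -f json | router_subnet_ids | unexpected_subnets d2257ad6-b68a-4a6c-ae55-7eefe5d7e03d

Empty output would mean that only the pre-provisioned subnet is still attached; any printed ID would be a subnet that destroy left on the router.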

Comment 2 Emilien Macchi 2021-08-16 00:55:53 UTC
FYI I'm using https://github.com/openshift/release/pull/21092 to reproduce it in CI.

Comment 35 Jon Uriarte 2021-09-15 14:45:22 UTC
Verified in 4.10.0-0.nightly-2021-09-15-031230 on top of OSP 16.1.6 (RHOS-16.1-RHEL-8-20210818.n.0).

# Create a network, a subnet and a router
openstack network create net1
openstack subnet create subnet1 --network net1 --subnet-range 172.18.0.0/24 --allocation-pool start=172.18.0.10,end=172.18.0.254 --dns-nameserver 10.46.0.31
openstack router create router1
openstack router add subnet router1 subnet1
openstack router set router1 --external-gateway nova

# Create a port in net1 to be attached to the installer_host
openstack port create --network net1 --security-group installer_host-sg --host <installer_host ID>

# Update the install-config:

networking:
  clusterNetworks:
  - cidr:             10.128.0.0/14
    hostSubnetLength: 9
  serviceCIDR: 172.30.0.0/16
  machineCIDR: 172.18.0.0/24
  type: "Kuryr"|"OpenShiftSDN"
platform:
  openstack:
    cloud:            "shiftstack"
    externalNetwork:  ""
    region:           "regionOne"
    computeFlavor:    "m4.xlarge"
    machinesSubnet: d2257ad6-b68a-4a6c-ae55-7eefe5d7e03d
    apiVIP: "172.18.0.5"
    ingressVIP: "172.18.0.7"


# Update the /etc/hosts:

172.18.0.5 api.ostest.shiftstack.com
172.18.0.7 oauth-openshift.apps.ostest.shiftstack.com
172.18.0.7 console-openshift-console.apps.ostest.shiftstack.com
172.18.0.7 downloads-openshift-console.apps.ostest.shiftstack.com
172.18.0.7 canary-openshift-ingress-canary.apps.ostest.shiftstack.com
172.18.0.7 alertmanager-main-openshift-monitoring.apps.ostest.shiftstack.com
172.18.0.7 grafana-openshift-monitoring.apps.ostest.shiftstack.com
172.18.0.7 prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com
172.18.0.7 thanos-querier-openshift-monitoring.apps.ostest.shiftstack.com


Run the installer with Kuryr:
$ openshift-install create cluster --dir=ostest

INFO Install complete!

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-09-15-031230   True        False         5m47s   Cluster version is 4.10.0-0.nightly-2021-09-15-031230


Create a LB type svc:
oc new-project test1-ns
oc create deployment test1-dep --image=quay.io/kuryr/demo
oc scale deployments/test1-dep --replicas=2
oc expose deployment test1-dep --name test1-svc --type=LoadBalancer --port 80 --target-port=8080

Check the svc creation in OCP (it needs to take a fip from OSP):
$ oc get svc -o wide
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)        AGE     SELECTOR
test1-svc   LoadBalancer   172.30.116.172   10.46.22.249   80:32512/TCP   4m18s   app=test1-dep

Check the LB and fip creation in OSP:
$ openstack loadbalancer list | grep test
| 6888f5bb-eb07-407f-8f77-cd729a039511 | test1-ns/test1-svc                                                           | 5b27b448f1964691905a79b758eb4f6a | 172.30.116.172 | ACTIVE              | ovn      |

$ openstack floating ip list
+--------------------------------------+---------------------+------------------+--------------------------------------+--------------------------------------+----------------------------------+
| ID                                   | Floating IP Address | Fixed IP Address | Port                                 | Floating Network                     | Project                          |
+--------------------------------------+---------------------+------------------+--------------------------------------+--------------------------------------+----------------------------------+
...
| e92a1528-950d-4144-a863-688f57956d45 | 10.46.22.249        | 172.30.116.172   | bf3aa82b-6d9a-41df-ad93-a356f42e9b26 | b1b1d60d-ebda-4999-a8fd-673fe8a83083 | 5b27b448f1964691905a79b758eb4f6a |
+--------------------------------------+---------------------+------------------+--------------------------------------+--------------------------------------+----------------------------------+


Destroy the cluster:
$ openshift-install destroy cluster --dir=ostest

INFO Time elapsed: 8m23s

Check all the installer-created subnets were removed:
$ openstack subnet list
+--------------------------------------+-----------------------+--------------------------------------+----------------+
| ID                                   | Name                  | Network                              | Subnet         |
+--------------------------------------+-----------------------+--------------------------------------+----------------+
| 1234e81f-bbb0-4888-bb24-8c066f5a69f1 | installer_host-subnet | 3316ac0d-34d7-48b2-b461-d4efa88a0c6b | 172.16.40.0/24 |
| b28e0dad-2094-43bd-b5be-de900299c306 | restricted_subnet     | c299ca00-b7da-46a4-b23a-cf98fc8f5f47 | 172.16.0.0/24  |
| d2257ad6-b68a-4a6c-ae55-7eefe5d7e03d | subnet1               | 162aa054-a2e2-4600-a47e-5c737d3493b5 | 172.18.0.0/24  |
+--------------------------------------+-----------------------+--------------------------------------+----------------+

The subnets above are the pre-created ones, so this is the expected result.
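
The comparison above is done by eye. As a rough sketch, it could also be done by saving the subnet IDs before the install and after the destroy, then printing any ID that only appears in the second snapshot. The function name and the snapshot file names below are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of openshift-install.

# Print the IDs that are present in the "after" snapshot ($2) but absent
# from the "before" snapshot ($1). Both files hold one ID per line.
leftover_ids() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits non-zero when nothing is printed; that is not an error here.
    grep -Fxv -f "$1" "$2" || true
}
```

A possible way to use it:

$ openstack subnet list -f value -c ID > subnets.before    (before create cluster)
$ openstack subnet list -f value -c ID > subnets.after     (after destroy cluster)
$ leftover_ids subnets.before subnets.after

No output would mean that every subnet remaining after the destroy already existed before the install.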

Check that the fip for the LB type svc has been removed as well.
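
A minimal sketch of that fip check, assuming the floating IP addresses are fed in one per line; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of openshift-install.

# Succeed (exit 0) only if the address passed as $1 does not appear
# as a whole line on stdin.
fip_removed() {
    ! grep -Fxq -- "$1"
}
```

For the address assigned to test1-svc earlier, it could be used as:

$ openstack floating ip list -f value -c "Floating IP Address" | fip_removed 10.46.22.249 && echo "fip removed"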

Comment 36 ShiftStack Bugwatcher 2021-11-25 16:12:06 UTC
Removing the Triaged keyword because:

* the QE automation assessment (flag qe_test_coverage) is missing

Comment 39 errata-xmlrpc 2022-03-10 16:05:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

