Bug 1971518
| Summary: | Cluster deletion misses trunk ports and loop over until timeout | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Martin André <m.andre> |
| Component: | Installer | Assignee: | Martin André <m.andre> |
| Installer sub component: | OpenShift on OpenStack | QA Contact: | Udi Shkalim <ushkalim> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | urgent | CC: | mburke, rlobillo, ushkalim |
| Version: | 4.8 | Keywords: | Triaged |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause: Trunks were missing the cluster tag.
Consequence: The destroy command got stuck in a loop until it hit the timeout, because it missed the untagged trunks and they prevented other resources from being deleted.
Fix: Delete trunks for which a tagged port is the parent.
Result: The destroy command no longer relies only on trunk tags to decide whether a trunk should be deleted, and can destroy clusters that have untagged trunks.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-10-18 17:33:59 UTC | Type: | Bug |
The trunk is missing a tag to identify it belongs to the cluster:

moc-ci ❯ openstack network trunk show 6fpj4mqm-868b3-kg4xc-master-trunk-0
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| admin_state_up  | UP                                   |
| created_at      | 2021-05-25T06:05:38Z                 |
| description     |                                      |
| id              | 8c6fb4fa-e011-4326-8679-4716de9f9dd8 |
| name            | 6fpj4mqm-868b3-kg4xc-master-trunk-0  |
| port_id         | f7fbec6a-14cb-4e3d-8e17-74e9a225dd9f |
| project_id      | 593227d1d5d04cba8847d5b6b742e0a7     |
| revision_number | 0                                    |
| status          | DOWN                                 |
| sub_ports       |                                      |
| tags            | []                                   |
| tenant_id       | 593227d1d5d04cba8847d5b6b742e0a7     |
| updated_at      | 2021-05-25T06:05:38Z                 |
+-----------------+--------------------------------------+

Not sure why this happened, but perhaps we should consider destroying trunks where ports belong to a cluster we destroy even when they are missing a tag?

It appears this cluster was created from a UPI job, which is missing the port tag:
https://github.com/openshift/installer/blob/e7fea15/upi/openstack/control-plane.yaml#L39

Verified on UPI:

[cloud-user@installer-host ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-06-28-221420   True        False         102m    Cluster version is 4.9.0-0.nightly-2021-06-28-221420

(shiftstack) [cloud-user@installer-host ~]$ openstack network trunk show ostest-qcfxf-worker-trunk-0
+-----------------+--------------------------------------------------------------------------------------------------+
| Field           | Value                                                                                            |
+-----------------+--------------------------------------------------------------------------------------------------+
| admin_state_up  | UP                                                                                               |
| created_at      | 2021-06-29T09:55:26Z                                                                             |
| description     |                                                                                                  |
| id              | 01739a93-c3d0-4b54-8154-1e2f2c8d6f09                                                             |
| name            | ostest-qcfxf-worker-trunk-0                                                                      |
| port_id         | e644887a-e75e-439c-8622-b8be4ce7d8f8                                                             |
| project_id      | 3210dadc4c0e41f1bf8dacd64753ee33                                                                 |
| revision_number | 25                                                                                               |
| status          | ACTIVE                                                                                           |
| sub_ports       | port_id='353d59b3-3504-449f-8771-cf4197be80a7', segmentation_id='368', segmentation_type='vlan'  |
|                 | port_id='c5ecd08f-0d41-457a-941d-373123803753', segmentation_id='434', segmentation_type='vlan'  |
|                 | port_id='b9c0ad3a-e276-4638-b87e-4da3fcecfeaa', segmentation_id='576', segmentation_type='vlan'  |
|                 | port_id='c3a04072-8f92-4230-ae60-9f5ddeefe7c7', segmentation_id='807', segmentation_type='vlan'  |
|                 | port_id='751342b1-5888-40ff-beba-231a1dc92b48', segmentation_id='836', segmentation_type='vlan'  |
|                 | port_id='43640b43-7874-4238-9463-fe476277136c', segmentation_id='897', segmentation_type='vlan'  |
|                 | port_id='86db44a6-0cb9-430c-8417-db64ce2047b5', segmentation_id='898', segmentation_type='vlan'  |
|                 | port_id='3f87c680-3dc9-40f1-bb7c-15820e091262', segmentation_id='1007', segmentation_type='vlan' |
|                 | port_id='d74bcf0f-df44-4335-b9e6-f7db62264608', segmentation_id='1011', segmentation_type='vlan' |
|                 | port_id='1510c2a4-41e8-40af-bc33-933b8b8cf281', segmentation_id='1035', segmentation_type='vlan' |
|                 | port_id='e5913a38-3d7e-408c-8c15-4f348896fc85', segmentation_id='1078', segmentation_type='vlan' |
|                 | port_id='194a6100-8fed-4cfe-bf0f-38327b82ee44', segmentation_id='1152', segmentation_type='vlan' |
|                 | port_id='f7105151-0782-4fda-ac03-8457104dbe68', segmentation_id='1177', segmentation_type='vlan' |
|                 | port_id='4b96e97c-9007-4000-a395-04f800234ed3', segmentation_id='1399', segmentation_type='vlan' |
|                 | port_id='662e5746-5068-4af9-88d2-f87bc726ea85', segmentation_id='1541', segmentation_type='vlan' |
|                 | port_id='6551ee9c-66ea-4723-884c-16247e116c0f', segmentation_id='1716', segmentation_type='vlan' |
|                 | port_id='3dbfb7fd-cb0e-478b-8123-a50153abd647', segmentation_id='1845', segmentation_type='vlan' |
|                 | port_id='00ee1a41-b425-4549-8722-a32bd68ae53b', segmentation_id='1904', segmentation_type='vlan' |
|                 | port_id='2409002b-488c-4f2a-9e51-4170d9272ae7', segmentation_id='1986', segmentation_type='vlan' |
|                 | port_id='ea745a9f-bbb2-4574-9e70-19bed7f8c141', segmentation_id='2020', segmentation_type='vlan' |
|                 | port_id='54eb0574-31bb-48cc-a74f-11b9ad455905', segmentation_id='2181', segmentation_type='vlan' |
|                 | port_id='45f530cb-b058-4b02-bd4b-a7806dc07394', segmentation_id='2209', segmentation_type='vlan' |
|                 | port_id='e2050b5e-ce2f-40aa-be87-2f4889e69bda', segmentation_id='2352', segmentation_type='vlan' |
|                 | port_id='7e03d2d5-cee3-4d0f-b6f0-9d748fcae1bb', segmentation_id='2801', segmentation_type='vlan' |
|                 | port_id='979cd48b-bc70-4bd3-b791-af0eb1c4b9b0', segmentation_id='2827', segmentation_type='vlan' |
|                 | port_id='624c2e21-7b73-47c1-ace9-c63ace693233', segmentation_id='2858', segmentation_type='vlan' |
|                 | port_id='5f7f6210-4c17-4f56-92ae-48999bd76186', segmentation_id='2898', segmentation_type='vlan' |
|                 | port_id='9049af21-953d-46b1-823f-3c21d0c93867', segmentation_id='2922', segmentation_type='vlan' |
|                 | port_id='72404cbb-664a-4312-a613-e7646d677aab', segmentation_id='2939', segmentation_type='vlan' |
|                 | port_id='ae883d36-9e2f-489e-adeb-9faa2c25f84c', segmentation_id='2961', segmentation_type='vlan' |
|                 | port_id='7ac68b95-7656-41a8-9221-ec32234c1c04', segmentation_id='3109', segmentation_type='vlan' |
|                 | port_id='fc8a1779-5141-44aa-a35c-94f2f2c6f8ec', segmentation_id='3277', segmentation_type='vlan' |
|                 | port_id='1755441f-1343-405e-9542-a0a9c70adb88', segmentation_id='3407', segmentation_type='vlan' |
|                 | port_id='3c20dbdd-83f6-4d11-9e2e-3787b0ad792f', segmentation_id='3539', segmentation_type='vlan' |
|                 | port_id='5ec685fa-a048-4944-9ec1-b6a39188fe2d', segmentation_id='3583', segmentation_type='vlan' |
|                 | port_id='43b675ed-10a7-41a0-832c-27505612de46', segmentation_id='3611', segmentation_type='vlan' |
| tags            | []                                                                                               |
| tenant_id       | 3210dadc4c0e41f1bf8dacd64753ee33                                                                 |
| updated_at      | 2021-06-29T11:48:22Z                                                                             |
+-----------------+--------------------------------------------------------------------------------------------------+

(shiftstack) [cloud-user@installer-host ~]$ openshift-install --log-level debug destroy cluster --dir ostest/
DEBUG OpenShift Installer 4.9.0-0.nightly-2021-06-28-221420
.
.
.
INFO Time elapsed: 14m40s

Martin --

Can you take a look at my proposed release note for this BZ? I saw your doc text and made a few changes to match our style. I want to make sure I didn't change the meaning. Thank you in advance.

Michael

* Previously, the OpenStack network trunks did not contain a tag to identify it belongs to the cluster. As a consequence, cluster deletion misses the trunk ports and gets stuck in a loop until the timeout. The cluster deletion now deletes trunks for which the tagged port is a parent.

(In reply to Michael Burke from comment #9)
> Martin --
>
> Can you take a look at my proposed release note for this BZ? I saw your doc
> text and made a few changes to match our style. I want to make sure I didn't
> change the meaning. Thank you in advance.
>
> Michael
>
>
> * Previously, the OpenStack network trunks did not contain a tag to
> identify it belongs to the cluster. As a consequence, cluster deletion
> misses the trunk ports and gets stuck in a loop until the timeout. The
> cluster deletion now deletes trunks for which the tagged port is a parent.

"In certain conditions" is more correct than "Previously" because we didn't change how trunks are tagged, but now allow the installer to delete untagged trunks that clearly belong to the cluster. How about the following?

In certain conditions, the OpenStack network trunks do not contain a tag to identify that they belong to the cluster. As a consequence, cluster deletion previously missed the trunk ports and got stuck in a loop until the timeout. The cluster deletion now deletes trunks for which the tagged port is a parent.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
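
The rule agreed on in this thread, deleting trunks whose parent port carries the cluster tag even when the trunk itself is untagged, can be sketched roughly as follows. This is not the installer's code (the installer is written in Go) and it makes no Neutron calls; `Port`, `Trunk`, `trunks_to_delete`, the record IDs and the tag value are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    id: str
    tags: list = field(default_factory=list)


@dataclass
class Trunk:
    id: str
    port_id: str  # ID of the trunk's parent port
    tags: list = field(default_factory=list)


def trunks_to_delete(trunks, ports, cluster_tag):
    """Select trunks tagged for the cluster, or whose parent port is tagged."""
    tagged_port_ids = {p.id for p in ports if cluster_tag in p.tags}
    return [
        t for t in trunks
        if cluster_tag in t.tags or t.port_id in tagged_port_ids
    ]


# Hypothetical data: trunk-a is untagged but its parent port is tagged.
CLUSTER_TAG = "cluster=demo"  # assumed tag format, not taken from the bug
ports = [Port("port-a", [CLUSTER_TAG]), Port("port-b")]
trunks = [Trunk("trunk-a", "port-a"), Trunk("trunk-b", "port-b")]
selected = [t.id for t in trunks_to_delete(trunks, ports, CLUSTER_TAG)]
```

With this selection, an untagged trunk such as the ones shown in the `openstack network trunk show` output above would still be picked up as long as its parent port is tagged, so the parent port can be deleted afterwards.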
The automated cleanup job running against MOC highlighted an issue with cluster deletion.

Installer fails to identify trunk ports:

level=debug msg=Exiting deleting openstack trunks
level=debug msg=goroutine deleteTrunks complete

Then neutron refused to delete port because it's the parent of a trunk port:

level=debug msg=Deleting Port "f7fbec6a-14cb-4e3d-8e17-74e9a225dd9f" failed with error: Expected HTTP response code [] when accessing [DELETE https://kaizen.massopen.cloud:13696/v2.0/ports/f7fbec6a-14cb-4e3d-8e17-74e9a225dd9f], but got 409 instead
level=debug msg={"NeutronError": {"message": "Port f7fbec6a-14cb-4e3d-8e17-74e9a225dd9f is currently a parent port for trunk 8c6fb4fa-e011-4326-8679-4716de9f9dd8.", "type": "PortInUseAsTrunkParent", "detail": ""}}
level=debug msg=Exiting deleting openstack ports

Full logs at: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-cleanup-moc/1404287722297757696/artifacts/cleanup-moc/shiftstack-cleanup/build-log.txt
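
When triaging a log like the one above, the 409 response body already names the trunk that blocks the port deletion. The following is a small diagnostic sketch, not something the installer is known to do: it pulls the trunk ID out of a `PortInUseAsTrunkParent` error body. The helper name `blocking_trunk` is hypothetical, and the message pattern is assumed from the single log line in this report, so it may not match other Neutron versions.

```python
import json
import re

# Pattern assumed from the error message quoted in this bug report.
TRUNK_ID = re.compile(r"parent port for trunk ([0-9a-f-]{36})")


def blocking_trunk(body):
    """Return the trunk ID named in a PortInUseAsTrunkParent error, else None."""
    try:
        err = json.loads(body).get("NeutronError", {})
    except (ValueError, AttributeError):
        return None  # not JSON, or not a JSON object
    if not isinstance(err, dict) or err.get("type") != "PortInUseAsTrunkParent":
        return None
    match = TRUNK_ID.search(err.get("message", ""))
    return match.group(1) if match else None


# Error body copied from the debug log in this report.
body = (
    '{"NeutronError": {"message": "Port f7fbec6a-14cb-4e3d-8e17-74e9a225dd9f '
    'is currently a parent port for trunk 8c6fb4fa-e011-4326-8679-4716de9f9dd8.", '
    '"type": "PortInUseAsTrunkParent", "detail": ""}}'
)
trunk_id = blocking_trunk(body)
```

Matching on the message text is brittle; it is used here only because the error body in the log carries the trunk ID nowhere else.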