Bug 1873449
Field | Value
---|---
Summary | [Kuryr] Cannot terminate namespaces due to error removing ports
Product | OpenShift Container Platform
Component | Networking
Networking sub component | kuryr
Reporter | Jon Uriarte <juriarte>
Assignee | Luis Tomas Bolivar <ltomasbo>
QA Contact | GenadiC <gcheresh>
CC | rlobillo, yinzhou
Status | CLOSED ERRATA
Severity | high
Priority | high
Version | 4.5
Target Milestone | ---
Target Release | 4.6.0
Hardware | Unspecified
OS | Unspecified
Doc Type | If docs needed, set a value
Story Points | ---
Last Closed | 2020-10-27 16:35:38 UTC
Type | Bug
Regression | ---
Mount Type | ---
Documentation | ---
Category | ---
oVirt Team | ---
Cloudforms Team | ---
Bug Blocks | 1874840

Description
Jon Uriarte
2020-08-28 11:42:18 UTC
Error on the kuryr controller looks like:

2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool [-] Error removing the port fe7832c7-954e-454b-91c1-bc7a5fc57458: openstack.exceptions.ConflictException: ConflictException: 409: Client Error for url: https://10.46.22.24:13696/v2.0/ports/fe7832c7-954e-454b-91c1-bc7a5fc57458, Port fe7832c7-954e-454b-91c1-bc7a5fc57458 is currently a subport for trunk 1fb6802c-69d5-469a-95ee-9b157b0d608d.
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool Traceback (most recent call last):
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 898, in _precreated_ports
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     os_net.delete_port(port_id)
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/openstack/network/v2/_proxy.py", line 1749, in delete_port
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     if_revision=if_revision)
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/openstack/proxy.py", line 46, in check
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     return method(self, expected, actual, *args, **kwargs)
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/openstack/network/v2/_proxy.py", line 75, in _delete
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     rv = res.delete(self, if_revision=if_revision)
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/openstack/resource.py", line 1622, in delete
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     self._translate_response(response, has_body=False, **kwargs)
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/openstack/resource.py", line 1113, in _translate_response
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     exceptions.raise_from_response(response, error_message=error_message)
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool   File "/usr/local/lib/python3.6/site-packages/openstack/exceptions.py", line 235, in raise_from_response
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool     http_status=http_status, request_id=request_id
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool openstack.exceptions.ConflictException: ConflictException: 409: Client Error for url: https://10.46.22.24:13696/v2.0/ports/fe7832c7-954e-454b-91c1-bc7a5fc57458, Port fe7832c7-954e-454b-91c1-bc7a5fc57458 is currently a subport for trunk 1fb6802c-69d5-469a-95ee-9b157b0d608d.
2020-08-28 14:19:55.415 1 ERROR kuryr_kubernetes.controller.drivers.vif_pool

The problem is caused by wrong tagging of the worker node parent ports.

This patch, https://review.opendev.org/#/c/748670/, will help because it ensures that namespaces are deleted anyway. It does not fix the wrong tagging, which is the actual culprit: it breaks the ports pool functionality, because the controller cannot re-discover the ports that were already created.

Failed on 4.6.0-0.nightly-2020-09-03-063148 over RHOS-16.1-RHEL-8-20200821.n.0 (with ovn-octavia).
After installing with IPI and running NP+Conformance, namespaces again remained stuck in the Terminating state:

$ oc get pods -n openshift-kuryr
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-cni-4f6bh                     1/1     Running   6          4h33m
kuryr-cni-6sqt7                     1/1     Running   1          4h51m
kuryr-cni-fpql4                     1/1     Running   1          4h51m
kuryr-cni-jxdf7                     1/1     Running   0          4h51m
kuryr-cni-ssw4q                     1/1     Running   7          4h31m
kuryr-cni-v7s7r                     1/1     Running   7          4h32m
kuryr-controller-846bff6c86-7qnhd   1/1     Running   21         4h51m

$ oc get namespaces | grep Terminating
e2e-configmap-5444     Terminating   118m
e2e-dns-8169           Terminating   90m
e2e-emptydir-3568      Terminating   116m
e2e-gc-4183            Terminating   105m
e2e-kubectl-19         Terminating   98m
e2e-services-7416      Terminating   87m
e2e-statefulset-6426   Terminating   90m
e2e-webhook-82         Terminating   127m
network-policy-487     Terminating   3h6m
network-policy-7073    Terminating   3h18m

$ openstack subnet list | grep e2e-dns-8169
| 614b29fb-8d0d-40a0-9f64-72d592c1d70d | ns/e2e-dns-8169-subnet | 94538656-a62f-456c-b53d-8fccf7aa6d8a | 10.128.156.0/23 |

The port linked to that namespace is DOWN and its device_owner is empty:

$ openstack port list | grep 614b29fb-8d0d-40a0-9f64-72d592c1d70d
| 48b1bbc6-cd06-47e9-8128-03dd107dd568 | | fa:16:3e:6a:0b:38 | ip_address='10.128.156.55', subnet_id='614b29fb-8d0d-40a0-9f64-72d592c1d70d' | DOWN |

$ openstack port show 48b1bbc6-cd06-47e9-8128-03dd107dd568 -f yaml
admin_state_up: true
allowed_address_pairs: []
binding_host_id: null
binding_profile: null
binding_vif_details: null
binding_vif_type: null
binding_vnic_type: normal
created_at: '2020-09-03T13:40:10Z'
data_plane_status: null
description: ''
device_id: ''
device_owner: ''
dns_assignment:
- fqdn: host-10-128-156-55.shiftstack.com.
  hostname: host-10-128-156-55
  ip_address: 10.128.156.55
dns_domain: ''
dns_name: ''
extra_dhcp_opts: []
fixed_ips:
- ip_address: 10.128.156.55
  subnet_id: 614b29fb-8d0d-40a0-9f64-72d592c1d70d
id: 48b1bbc6-cd06-47e9-8128-03dd107dd568
location:
  cloud: ''
  project:
    domain_id: null
    domain_name: Default
    id: a429f89224cf4940a0be7ae306cbe53f
    name: shiftstack
  region_name: regionOne
  zone: null
mac_address: fa:16:3e:6a:0b:38
name: ''
network_id: 94538656-a62f-456c-b53d-8fccf7aa6d8a
port_security_enabled: true
project_id: a429f89224cf4940a0be7ae306cbe53f
propagate_uplink_status: null
qos_policy_id: null
resource_request: null
revision_number: 8
security_group_ids:
- f9096ae0-1850-4f7f-96c1-78c6a48ffd77
status: DOWN
tags:
- openshiftClusterID=ostest-cbn5w
trunk_details: null
updated_at: '2020-09-03T13:45:01Z'

As a result, the kuryr-controller is not able to delete the port and is crash-looping.

*** Bug 1876434 has been marked as a duplicate of this bug. ***

Verified on 4.6.0-0.nightly-2020-09-05-015624 over RHOS-16.1-RHEL-8-20200831.n.1 with OVN-Octavia.

After installing with IPI and running NP+Conformance, namespaces were successfully terminated:

$ oc get namespaces | grep Terminating
$

NP and conformance test results were as expected:

$ grep msg np_results/np_kubetest.log | grep PASSED | wc -l
23
$ grep ^passed conformance_results/conformance_ocp-tests.log | wc -l
289

Test logs attached.

Created attachment 1713979 [details]
conformance test result
Created attachment 1713980 [details]
NP test results
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196