Bug 1899922 - NP changes sometimes influence new pods.
Summary: NP changes sometimes influence new pods.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: rdobosz
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks: 1900562
Reported: 2020-11-20 11:42 UTC by rdobosz
Modified: 2021-02-24 15:35 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:35:02 UTC
Target Upstream Version:
Embargoed:


Attachments
tempest results with the fix on osp16.1 (16.67 MB, application/gzip)
2020-11-26 13:08 UTC, rlobillo
tempest results with the fix on osp13 (16.53 MB, application/gzip)
2020-11-26 13:10 UTC, rlobillo


Links
System ID Private Priority Status Summary Last Updated
Github openshift kuryr-kubernetes pull 410 0 None closed Bug 1899922: NP changes sometimes influence new pods/services. 2021-01-06 16:08:40 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:35:31 UTC

Description rdobosz 2020-11-20 11:42:25 UTC
Description of problem:

During a tempest test run, two issues manifest themselves, both connected with the creation/deletion of a network policy (NP).

The first one has a traceback:

2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy KuryrPortHandler: openstack.exceptions.ResourceNotFound: ResourceNotFound: 404: Client Error for url: https://10.0.111.27:9696/v2.0/ports, Security group 7fd14a93-8bff-4c41-9588-dd2d7279fbbf does not exist
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/handlers/retry.py", line 81, in __call__
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     self._handler(event, *args, **kwargs)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/handlers/k8s_base.py", line 90, in __call__
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     self.on_present(obj)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/controller/handlers/kuryrport.py", line 63, in on_present
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     if not self.get_vifs(kuryrport_crd):
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/controller/handlers/kuryrport.py", line 225, in get_vifs
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     pod, project_id, subnets, security_groups)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1213, in request_vif
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     pod, project_id, subnets, security_groups)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/controller/drivers/vif_pool.py", line 116, in request_vif
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     security_groups)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/controller/drivers/neutron_vif.py", line 39, in request_vif
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     port = os_net.create_port(**rq)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/usr/local/lib/python3.6/dist-packages/openstack/network/v2/_proxy.py", line 1719, in create_port
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     return self._create(_port.Port, **attrs)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/usr/local/lib/python3.6/dist-packages/openstack/proxy.py", line 459, in _create
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     return res.create(self, base_path=base_path)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/usr/local/lib/python3.6/dist-packages/openstack/resource.py", line 1298, in create
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     self._translate_response(response, has_body=has_body)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/usr/local/lib/python3.6/dist-packages/openstack/resource.py", line 1113, in _translate_response
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     exceptions.raise_from_response(response, error_message=error_message)
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry   File "/usr/local/lib/python3.6/dist-packages/openstack/exceptions.py", line 235, in raise_from_response
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry     http_status=http_status, request_id=request_id
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry openstack.exceptions.ResourceNotFound: ResourceNotFound: 404: Client Error for url: https://10.0.111.27:9696/v2.0/ports, Security group 7fd14a93-8bff-4c41-9588-dd2d7279fbbf does not exist
2020-11-04 13:26:16.873 19857 ERROR kuryr_kubernetes.handlers.retry
2020-11-04 13:26:16.904 19857 ERROR kuryr_kubernetes.handlers.logging [-] Failed to handle event {'type': 'ADDED', 'object': {'apiVersion': 'openstack.org/v1', 'kind': 'KuryrPort', 'metadata': {'creationTimestamp': '2020-11-04T13:26:15Z', 'finalizers': ['kuryr.openstack.org/kuryrport-finalizer'], 'generation': 1, 'labels': {'kuryr.openstack.org/nodeName': 'rdobosz-devstack'}, 'managedFields': [{'apiVersion': 'openstack.org/v1', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:finalizers': {'.': {}, 'v:"kuryr.openstack.org/kuryrport-finalizer"': {}}, 'f:labels': {'.': {}, 'f:kuryr.openstack.org/nodeName': {}}}, 'f:spec':
{'.': {}, 'f:podNodeName': {}, 'f:podUid': {}}, 'f:status': {'.': {}, 'f:vifs': {}}}, 'manager': 'python-requests', 'operation': 'Update', 'time': '2020-11-04T13:26:15Z'}], 'name': 'kuryr-pod-840644914', 'namespace': 'default', 'resourceVersion': '42725', 'selfLink': '/apis/openstack.org/v1/namespaces/default/kuryrports/kuryr-pod-840644914', 'uid': 'aaf599aa-6e80-49fd-af44-708d0b699ac5'}, 'spec': {'podNodeName': 'rdobosz-devstack', 'podUid': 'ccbb1fbc-0512-4b4d-8784-c1b708321d63'}, 'status': {'vifs': {}}}}: openstack.exceptions.ResourceNotFound: ResourceNotFound: 404: Client Error for url: https://10.0.111.27:9696/v2.0/ports, Security group 7fd14a93-8bff-4c41-9588-dd2d7279fbbf does not exist

Here, KuryrPort creation failed because the NP was removed between gathering the information about the security groups (SGs), subnets, etc. and requesting the VIF, so the security group no longer existed.
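The race described above can be sketched as follows. This is a minimal, hypothetical illustration and not the change made in the linked pull request: `ResourceNotFound` is a local stand-in for `openstack.exceptions.ResourceNotFound`, and `request_vif_with_refresh`, `get_security_groups`, and `create_port` are assumed names invented for the example. The idea is to re-read the security groups and retry when one of them disappears between the lookup and the port creation.

```python
class ResourceNotFound(Exception):
    """Local stand-in for openstack.exceptions.ResourceNotFound."""


def request_vif_with_refresh(get_security_groups, create_port, attempts=3):
    """Create a port, re-reading the security groups if one has vanished.

    Both callables are hypothetical: get_security_groups() returns the
    current list of SG IDs, create_port(security_groups=...) creates the port.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        # Gather the SGs as late as possible to narrow the race window.
        sgs = get_security_groups()
        try:
            return create_port(security_groups=sgs)
        except ResourceNotFound as exc:
            # An SG was deleted after we read it; refresh and try again.
            last_error = exc
    raise last_error


# Simulated race: the NP (and its SG) is deleted after the first lookup.
live_sgs = ["sg-default", "sg-np"]
calls = []


def get_security_groups():
    return list(live_sgs)


def create_port(security_groups):
    calls.append(security_groups)
    if len(calls) == 1:
        live_sgs.remove("sg-np")
        raise ResourceNotFound("Security group sg-np does not exist")
    return {"id": "port-1", "security_groups": security_groups}


port = request_vif_with_refresh(get_security_groups, create_port)
```

In this simulation the first attempt fails because `sg-np` is gone, and the second attempt succeeds with the refreshed list. A real handler would also need to decide what to do when every attempt fails.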

Second issue:

2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event, *args, **kwargs)
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 81, in __call__
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event, *args, **kwargs)
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 90, in __call__
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging     self.on_present(obj)
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/kuryrnetworkpolicy.py", line 227, in on_present
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging     self._drv_lbaas.update_lbaas_sg(service, sgs)
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 865, in update_lbaas_sg
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging     sg_rule_name, listener_id, sgs)
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 230, in _apply_members_security_groups
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging     lb_sg = vip_port.security_group_ids[0]
2020-10-30 12:51:47.966 1 ERROR kuryr_kubernetes.handlers.logging IndexError: list index out of range

This has a similar root cause: the NP was removed while it was being applied to the members.
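The `IndexError` in the second traceback comes from indexing an empty `security_group_ids` list. A defensive lookup might look like the sketch below. It is only an illustration under stated assumptions, not the code from the actual fix: `get_lb_security_group` is a hypothetical helper, and `SimpleNamespace` objects stand in for the VIP port returned by the OpenStack SDK.

```python
from types import SimpleNamespace


def get_lb_security_group(vip_port):
    """Return the first SG ID of the VIP port, or None if it has none.

    Hypothetical helper: a port whose security groups were removed
    concurrently yields None instead of raising IndexError.
    """
    sg_ids = getattr(vip_port, "security_group_ids", None) or []
    if not sg_ids:
        return None
    return sg_ids[0]


# Stand-ins for VIP ports with and without security groups.
port_with_sg = SimpleNamespace(security_group_ids=["sg-lb"])
port_without_sg = SimpleNamespace(security_group_ids=[])

lb_sg = get_lb_security_group(port_with_sg)
missing_sg = get_lb_security_group(port_without_sg)
```

The caller can then skip or retry the member update when the helper returns `None`, rather than letting the handler crash.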


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

It is hard to reproduce because of the timing, but once the fix is released, the errors should no longer appear in the logs after running the installation a couple of times.

Comment 2 rlobillo 2020-11-26 13:06:39 UTC
Verified on OCP4.7.0-0.nightly-2020-11-25-015010 over OSP13 with Amphoras (2020-11-13.1) and OSP16.1 with OVN-Octavia (RHOS-16.1-RHEL-8-20201110.n.1)

Ran tempest 4 times on both OSP setups. The results are as expected and no restarts were observed on kuryr-controller.

Logs attached.

Comment 3 rlobillo 2020-11-26 13:08:54 UTC
Created attachment 1733721
tempest results with the fix on osp16.1

Comment 4 rlobillo 2020-11-26 13:10:48 UTC
Created attachment 1733739
tempest results with the fix on osp13

Comment 7 errata-xmlrpc 2021-02-24 15:35:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

