Bug 1759986 - Races between retries and deletion actions
Summary: Races between retries and deletion actions
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.2.z
Assignee: Luis Tomas Bolivar
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On: 1759984
Blocks:
Reported: 2019-10-09 14:43 UTC by Luis Tomas Bolivar
Modified: 2019-11-13 18:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1759984
Environment:
Last Closed: 2019-11-13 18:55:47 UTC
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 71 | 0 | None | closed | [release-4.2] Bug 1759986: Fix race conditions between handling ResourceNotReady and deletions | 2019-11-19 15:43:35 UTC
Red Hat Product Errata | RHBA-2019:3303 | 0 | None | None | None | 2019-11-13 18:55:59 UTC

Description Luis Tomas Bolivar 2019-10-09 14:43:16 UTC
+++ This bug was initially created as a clone of Bug #1759984 +++

There are several races between retry actions (activating a VIF, getting the VIF for a pod, ...) and deletion actions. A retry action may be postponed until after the resource has already been deleted, leading to kuryr-controller errors.
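
To illustrate the kind of race described above, here is a minimal sketch of a retry loop that gives up quietly once the resource has been deleted. It is not the code from the linked pull request; `ResourceGone`, `retry_unless_deleted`, and the `exists` callback are hypothetical names introduced only for this example, and `ResourceNotReady` is redefined locally rather than imported from kuryr-kubernetes.

```python
import time


class ResourceNotReady(Exception):
    """Stand-in (defined here): the resource is not ready yet, retry later."""


class ResourceGone(Exception):
    """Hypothetical: the resource was deleted while the action was pending."""


def retry_unless_deleted(action, exists, attempts=5, delay=0.0):
    """Retry `action` on ResourceNotReady, but stop once `exists()` is False.

    `action` and `exists` are caller-supplied callables (assumptions of this
    sketch). Returns the action result, or None if the resource disappeared.
    """
    for _ in range(attempts):
        if not exists():
            # Deleted in the meantime: nothing left to act on, so skip quietly
            # instead of reporting the handler as unhealthy.
            return None
        try:
            return action()
        except ResourceNotReady:
            time.sleep(delay)
        except ResourceGone:
            return None
    raise ResourceNotReady("gave up after %d attempts" % attempts)
```

The point of the sketch is only the ordering: the existence check happens before every retry, so a postponed retry that wakes up after the deletion does nothing rather than failing on a missing port or subnet.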

Comment 2 Jon Uriarte 2019-10-25 14:46:15 UTC
Verified on OCP 4.2.0-0.nightly-2019-10-25-021846 build on top of OSP 13 2019-10-01.1 puddle.

release image: registry.svc.ci.openshift.org/ocp/release@sha256:8f97aa21e1c0b2815ec7c86e4138362940a5dcbc292840ab4d6d5b67fedb173f

Before this BZ was fixed, the following errors appeared in the kuryr-controller logs when running openshift-tests:

· ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy VIFHandler: PortNotFoundClient: Port d3b2d608-19cd-4ef4-b726-b98119ef0cae could not be found.
· ERROR kuryr_kubernetes.handlers.logging NotFound: Subnet 039d7edf-3942-40cc-af46-0ed867e2a18c could not be found.
· ERROR kuryr_kubernetes.handlers.logging self._drv_vif_pool.delete_network_pools(net_crd['spec']['netId'])
  ERROR kuryr_kubernetes.handlers.logging TypeError: 'NoneType' object has no attribute '__getitem__'

After running the openshift/origin e2e kubernetes/conformance tests, none of those messages were found, and the kuryr-controller pod was not restarted because of them.
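
The TypeError in the logs above suggests that `net_crd` was None when `net_crd['spec']['netId']` was evaluated, presumably because the CRD had already been deleted. A minimal defensive sketch, assuming the CRD is a plain dict or None, could look like the following. The helper names `net_id_from_crd` and `delete_pools_if_present` are hypothetical, and `delete_network_pools` is passed in as a callable rather than taken from the real driver; this is not necessarily how the actual fix handles it.

```python
def net_id_from_crd(net_crd):
    """Return the netId stored in a network CRD dict, or None if unavailable."""
    if net_crd is None:
        # The CRD was already deleted (or never found): nothing to look up.
        return None
    return net_crd.get('spec', {}).get('netId')


def delete_pools_if_present(net_crd, delete_network_pools):
    """Call `delete_network_pools(net_id)` only when a netId can be resolved.

    Returns True if the deletion callable was invoked, False otherwise.
    """
    net_id = net_id_from_crd(net_crd)
    if net_id is None:
        return False
    delete_network_pools(net_id)
    return True
```

For example, `delete_pools_if_present(None, pool_driver.delete_network_pools)` would return False without raising, where `pool_driver` stands for whatever object provides that method.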

Comment 4 errata-xmlrpc 2019-11-13 18:55:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3303

