Bug 2105712

Summary:	EgressIP with CNCC: Operator retries deletion of a not found object
Product:	OpenShift Container Platform	Reporter:	Andreas Karis <akaris>
Component:	Networking	Assignee:	Andreas Karis <akaris>
Networking sub component:	ovn-kubernetes	QA Contact:	Anurag saxena <anusaxen>
Status:	CLOSED DUPLICATE	Docs Contact:
Severity:	low
Priority:	low
Version:	4.12
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-07-10 16:35:30 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Andreas Karis 2022-07-09 20:58:09 UTC

Description of problem:

The EgressIP code returns an error and retries the delete operation when the op cache holds a delete operation
but the object is not found in the API:
~~~
I0709 20:34:01.801608       1 obj_retry.go:1103] *v1.EgressIP retry update failed for egressip, will try again later: cloud deletion request failed for CloudPrivateIPConfig: 10.0.129.20, could not get item, err: cloudprivateipconfig.cloud.network.openshift.io "10.0.129.20" not found
~~~

That IMO makes no sense because the desired state (deleted) already equals the actual state (object not found). The operation is definitely not idempotent in its current form.

So why would we retry a deletion here?

I ran into this a few times while working on the EgressIP and deploying different ovnkube images. 

But I don't know if that hits us in production or not.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Andreas Karis 2022-07-09 20:59:24 UTC

When that happened, I had to manually delete the egressips and the cloudprivateipconfig object. This happened when an EgressIP object existed, I changed the ovnkube-master image, and then edited the egressip object and changed its IP address.

Comment 2 Andreas Karis 2022-07-09 21:55:58 UTC

This can also prevent any other operations in the ops map from being executed, because processing risks stalling on this delete operation.
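
A rough sketch of how the stall could be avoided, under the assumption that pending operations are kept in a map keyed by IP. This is not the actual ops map implementation: op, applyOne and applyOps are hypothetical names, and a plain map stands in for the API. Each operation is attempted independently, and a delete that hits "not found" is dropped instead of being retried, so it cannot block the remaining entries.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errNotFound is a hypothetical stand-in for the API's "not found" error.
var errNotFound = errors.New("not found")

// op is a hypothetical pending operation type.
type op string

const (
	opAdd    op = "add"
	opDelete op = "delete"
)

// applyOne applies a single operation against store, a hypothetical set of
// objects that currently exist in the API.
func applyOne(store map[string]bool, name string, o op) error {
	switch o {
	case opAdd:
		store[name] = true
		return nil
	case opDelete:
		if !store[name] {
			return fmt.Errorf("%q: %w", name, errNotFound)
		}
		delete(store, name)
		return nil
	}
	return fmt.Errorf("unknown op %q", o)
}

// applyOps attempts every pending operation and returns only those that
// still need a retry. A delete that fails with "not found" is considered
// done, and a failure of one entry never stops the others from running.
func applyOps(store map[string]bool, pending map[string]op) map[string]op {
	names := make([]string, 0, len(pending))
	for name := range pending {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order for the example

	retry := map[string]op{}
	for _, name := range names {
		o := pending[name]
		err := applyOne(store, name, o)
		if err == nil || (o == opDelete && errors.Is(err, errNotFound)) {
			continue
		}
		retry[name] = o
	}
	return retry
}

func main() {
	store := map[string]bool{} // 10.0.129.20 is already gone
	pending := map[string]op{"10.0.129.20": opDelete, "10.0.129.21": opAdd}
	retry := applyOps(store, pending)
	fmt.Println("still pending:", len(retry), "added:", store["10.0.129.21"])
}
```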

Comment 3 Andreas Karis 2022-07-10 16:35:30 UTC

This happens in the context of the changes required for 2105706. Before those changes it did not surface, so I'm closing this as a duplicate of that bug.

*** This bug has been marked as a duplicate of bug 2105706 ***