Description of problem:
The EgressIP code throws an error and retries the delete operation when the op cache holds a delete operation but the object is not found in the API:
~~~
I0709 20:34:01.801608 1 obj_retry.go:1103] *v1.EgressIP retry update failed for egressip, will try again later: cloud deletion request failed for CloudPrivateIPConfig: 10.0.129.20, could not get item, err: cloudprivateipconfig.cloud.network.openshift.io "10.0.129.20" not found
~~~
IMO that makes no sense, because the desired state (deleted) already equals the actual state (object not found). The operation is definitely not idempotent in its current form. So why would we retry a deletion here?

I ran into this a few times while working on EgressIP and deploying different ovnkube images, but I don't know whether it hits us in production or not.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
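A minimal sketch of the behaviour I would expect from an idempotent delete: a "not found" result is treated as success, because the object is already gone. Everything here is hypothetical and only for illustration: `fakeAPI`, `errNotFound`, and `deleteIdempotent` are not names from the ovn-kubernetes code, and the in-memory map stands in for the real CloudPrivateIPConfig API.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for a Kubernetes "not found" API error.
var errNotFound = errors.New("not found")

// fakeAPI is a hypothetical in-memory stand-in for the CloudPrivateIPConfig API.
type fakeAPI struct {
	items map[string]struct{}
}

// Delete removes the named item, or reports errNotFound if it does not exist.
func (a *fakeAPI) Delete(name string) error {
	if _, ok := a.items[name]; !ok {
		return fmt.Errorf("cloudprivateipconfig %q: %w", name, errNotFound)
	}
	delete(a.items, name)
	return nil
}

// deleteIdempotent returns nil when the object is deleted or was already
// absent. Only other errors are returned, so only those would be retried.
func deleteIdempotent(api *fakeAPI, name string) error {
	err := api.Delete(name)
	if err == nil || errors.Is(err, errNotFound) {
		// Desired state (deleted) equals actual state: nothing to retry.
		return nil
	}
	return fmt.Errorf("cloud deletion request failed for CloudPrivateIPConfig: %s, err: %w", name, err)
}

func main() {
	api := &fakeAPI{items: map[string]struct{}{"10.0.129.20": {}}}
	fmt.Println(deleteIdempotent(api, "10.0.129.20")) // removes the object
	fmt.Println(deleteIdempotent(api, "10.0.129.20")) // object already gone, still no error
}
```

Against a real API server the same check would presumably use `apierrors.IsNotFound` from `k8s.io/apimachinery/pkg/api/errors` instead of the sentinel error used in this sketch.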
When that happened, I had to manually delete the egressips and the cloudprivateipconfig object. This happened when an EgressIP object existed, I changed the ovnkube-master image, and then edited the egressip object and changed its IP address.
This can also prevent any other operations in the ops map from being executed, because processing risks stalling on this delete operation.
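To illustrate the stall, here is a rough sketch of an ops loop in which a delete for an already-missing object is counted as done, so the remaining operations still run. The types and names (`op`, `applyOps`, `execute`) and the second IP `10.0.129.21` are assumptions made up for this example; the real ops map and retry logic in the EgressIP code may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for a Kubernetes "not found" API error.
var errNotFound = errors.New("not found")

// op is a hypothetical operation type for entries in the ops map.
type op string

const (
	opAdd    op = "add"
	opDelete op = "delete"
)

// execute applies one operation to the set of existing objects.
func execute(o op, ip string, existing map[string]bool) error {
	switch o {
	case opAdd:
		existing[ip] = true
		return nil
	case opDelete:
		if !existing[ip] {
			return fmt.Errorf("%q: %w", ip, errNotFound)
		}
		delete(existing, ip)
		return nil
	}
	return fmt.Errorf("unknown op %q", o)
}

// applyOps runs every operation and returns only those that still need a
// retry. A delete that hits "not found" is treated as complete, so it
// neither gets retried nor blocks the other entries.
func applyOps(ops map[string]op, existing map[string]bool) map[string]op {
	pending := map[string]op{}
	for ip, o := range ops {
		err := execute(o, ip, existing)
		if o == opDelete && errors.Is(err, errNotFound) {
			err = nil // already deleted: desired state reached
		}
		if err != nil {
			pending[ip] = o
		}
	}
	return pending
}

func main() {
	existing := map[string]bool{} // 10.0.129.20 is already gone
	ops := map[string]op{"10.0.129.20": opDelete, "10.0.129.21": opAdd}
	pending := applyOps(ops, existing)
	fmt.Println(len(pending), existing["10.0.129.21"])
}
```

In this sketch the add for the second IP is applied even though the delete target no longer exists, and nothing is left pending for retry.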
This happens in the context of the changes required for bug 2105706; it did not surface before those changes, so I'm closing this as a duplicate of that bug.

*** This bug has been marked as a duplicate of bug 2105706 ***