Bug 1871856 - ovn-kubernetes egress-ip controller doesn't seem to handle ovn losses
Summary: ovn-kubernetes egress-ip controller doesn't seem to handle ovn losses
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: 4.6.0
Assignee: Alexander Constantinescu
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-08-24 13:10 UTC by Casey Callendrello
Modified: 2020-09-09 10:02 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-09 10:02:52 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Casey Callendrello 2020-08-24 13:10:13 UTC
All ovn-kubernetes control loops need to be able to handle the loss of the ovn northd. This means their reconciliation loop cannot short-circuit if, for example, annotations exist on the kubernetes object.

The EgressIP controller contains this comment:

> If the status is set at this point, then we know it's valid from syncEgressIP and we have no assignment to do.
> Just initialize all watchers (which should not re-create any already existing items in the OVN DB)

which implies that it does not correctly handle the loss of nbdb.

Please verify the code and, if necessary, ensure that nbdb entries are recreated as necessary.

Comment 1 Ricardo Carrillo Cruz 2020-08-24 14:11:07 UTC
Assigning to you Casey, if you want someone else to work it just give me a ping on Slack.

Comment 2 Alexander Constantinescu 2020-08-26 13:25:30 UTC
I don't follow on how any of that is a bug? Has it been observed? 

The comment cited:

> If the status is set at this point, then we know it's valid from syncEgressIP and we have no assignment to do.
> Just initialize all watchers (which should not re-create any already existing items in the OVN DB)

refers to the fact that we should short-circuit `assignEgressIPs` if the status is already valid when we reboot (as to avoid useless re-assignment). The nbdb re-creation has nothing to do with any of that. The database entities will be created/re-created when the egress IP matches a pod / namespace.

To be clear, that comment only affects the assignment procedure, it makes no reference to the DB creation/re-creation (which should work as it does for all other API resources).

Comment 3 Alexander Constantinescu 2020-08-26 13:36:49 UTC
Ahh, now I follow on where the confusion comes from. 

EgressIP has not been enabled for OCP yet. Any EgressIP creation on OCP has no impact on OVN yet, I will enable this the coming week. I've been holding off waiting for a new OVN version which was critical for egress IP to work. This has been done with PR: https://github.com/ovn-org/ovn-kubernetes/pull/1616 so now PR: https://github.com/ovn-org/ovn-kubernetes/pull/1567 should bring in the proper stuff. Once that is rebased downstream, egress IP will be enabled.

Comment 6 Alexander Constantinescu 2020-09-09 10:02:52 UTC
Closing this as this was clearly incorrectly created and should not be tested by QE.


Note You need to log in before you can comment on or make changes to this bug.