Bug 1877273
Summary: | [OVN] EgressIP cannot fail over to available nodes after one egressIP node shutdown | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | huirwang | |
Component: | Networking | Assignee: | Alexander Constantinescu <aconstan> | |
Networking sub component: | ovn-kubernetes | QA Contact: | huirwang | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | aconstan, acossett, amulmule, bbennett, ChetRHosey, danw, jboxman, jnordell, skanakal, vpickard | |
Version: | 4.6 | |||
Target Milestone: | --- | |||
Target Release: | 4.7.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause:
When a node experienced networking issues (or its kubelet failed to function properly and the node went into a non-ready state), the egress IPs assigned to that node were never re-assigned elsewhere.
Consequence:
The egress IP functionality was broken as packets were still routed to this faulty egress node, which could not serve traffic.
Fix:
We now verify the state of all egress nodes periodically by pinging each egress node and verifying the node object's state.
Result:
If a node goes down, its egress IPs are now re-assigned and the functionality keeps working by redirecting egress traffic to another node.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1898160 | Environment: | |
Last Closed: | 2021-02-24 15:17:43 UTC | Type: | Bug | |
Bug Blocks: | 1898160 |
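The Fix recorded in the Doc Text above (periodically check every egress node, then move egress IPs off nodes that are down) can be illustrated with a small sketch. This is not the ovn-kubernetes implementation: the function names `healthy_nodes` and `reassign_from_unreachable`, the `node_ready` and `can_ping` callbacks, and the "least-loaded node first" placement rule are all assumptions made for illustration.

```python
def healthy_nodes(nodes, node_ready, can_ping):
    """Return the set of nodes that pass both checks named in the Doc Text.

    node_ready and can_ping are hypothetical callbacks (node name -> bool)
    standing in for the node object's state and the ping probe.
    """
    return {n for n in nodes if node_ready(n) and can_ping(n)}


def reassign_from_unreachable(assignments, reachable):
    """Return a new egress IP -> node mapping with IPs moved off failed nodes.

    assignments: dict of egress IP -> node name (current state)
    reachable:   set of node names that passed the health check
    IPs already hosted on reachable nodes are left untouched.
    """
    # Keep every assignment whose node is still healthy.
    result = {ip: node for ip, node in assignments.items() if node in reachable}
    load = {node: 0 for node in reachable}
    for node in result.values():
        load[node] += 1
    # Move the orphaned IPs, one at a time, to the least-loaded healthy node
    # (assumed placement rule; ties are broken by node name).
    orphaned = sorted(ip for ip, node in assignments.items() if node not in reachable)
    for ip in orphaned:
        if not load:
            continue  # no healthy egress node left, so the IP stays unassigned
        target = min(load, key=lambda n: (load[n], n))
        result[ip] = target
        load[target] += 1
    return result


# Hypothetical cluster: node-c stops answering pings.
nodes = ["node-a", "node-b", "node-c"]
current = {"10.0.0.1": "node-a", "10.0.0.2": "node-c"}
healthy = healthy_nodes(nodes, lambda n: True, lambda n: n != "node-c")
print(reassign_from_unreachable(current, healthy))
```

In this sketch only the IP hosted on the failed node moves; a real controller would run the check on a timer and would also have to handle nodes that recover.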
Description
huirwang
2020-09-09 09:31:43 UTC
Hi Huiran

I am going to push this out to 4.7, the reason is:

1) In OVN/OVS we cannot have multiple reroutes matching the same traffic to multiple egress nodes. For this we would need the OVN RFE: https://bugzilla.redhat.com/show_bug.cgi?id=1881826 to be implemented.
2) Even if multiple reroutes to multiple egress nodes exist, we cannot ensure that if a node silently dies (i.e. the OpenShift/Kubernetes API server is not aware) traffic then flows through the node which is still functioning. For that we will need OVN RFE: https://bugzilla.redhat.com/show_bug.cgi?id=1847570

This is not a big use case I believe and should be fine waiting for until the 4.7 release.

This is really important to fix as soon as possible, since the customer cannot go to production without a working failover scenario: the application will be down until the faulty node is destroyed. Also, it does not make sense to configure 2 IPs if the failover is not working. This is a big use case for Telco and Financial customers.

It looks like we sort of covered this with: "If a node is deleted by a cluster administrator, any egress IP addresses assigned to it are automatically reassigned, subject to the previously described conditions." But it isn't working? Is this a known issue for OCP 4.6 GA? Thanks!

(In reply to Alexander Constantinescu from comment #5)
> This is not a big use case I believe and should be fine waiting for until
> the 4.7 release.

Doh. So I guess there was confusion in all the scurrying to finish 4.6 features, but this is absolutely a mandatory part of the feature. OVN-Kubernetes needs to actively detect when nodes become unreachable, and move their egress IPs away when they do. It can't just assume Nodes will get deleted if they are unavailable. See poll()/check() in https://github.com/openshift/sdn/blob/master/pkg/network/master/egressip.go for the openshift-sdn version.

OVN-Kubernetes also needs code to rebalance egress IPs when they get too unbalanced (e.g., once the above problem is fixed, then after an upgrade, the last egress node to reboot would end up with 0 egress IPs assigned afterward). What OpenShift SDN does (ReallocateEgressIPs() in https://github.com/openshift/sdn/blob/master/pkg/network/common/egressip.go) is that every time a node or an egress IP is added or removed, it computes both an "incremental" allocation (like what ovn-kubernetes does now) and a "from scratch" allocation (i.e., how it would have chosen to allocate the IPs if none of them were already assigned). And then if any node has more than twice as many egress IPs in the "from scratch" allocation as it would have had in the "incremental" allocation, it knows things have gotten unbalanced and it needs to proactively move some IPs over to the underallocated node(s).

This is ready for testing. It's been integrated on master (i.e. 4.7, with PR: https://github.com/openshift/ovn-kubernetes/pull/317) so I am setting it to MODIFIED. I am working on the back-port to 4.6.

So does this need a docs update once it is merged? Thanks!

Docs update for this BZ: https://github.com/openshift/openshift-docs/pull/28956
Is this okay? Thanks!

With the latest code change, why are BOTH IPs reassigned to new nodes when only one fails?

Initial working flow
Active traffic flow from the egress pod attached to this egressIP (pod to external) is NATed to 10.0.32.112 (all good, working) (view starting config)
-----
Step for failover is to shut down the node ovn-qgwkn-worker-canadacentral3-k4qk5 = 10.0.32.112 and expect the traffic to then exit with 10.0.32.111 (as per the initial starting config).

Results: IP 111 is re-assigned to another node automatically and traffic flow is interrupted, and you can see now that both IPs are not matching the right node anymore...

Result Config after the node shutdown

    ~/Documents/ocp4/ovn_egressip » oc get egressIP
    NAME            EGRESSIPS     ASSIGNED NODE                           ASSIGNED EGRESSIPS
    egressip-test   10.0.32.111   ovn-qgwkn-worker-canadacentral2-rhswn   10.0.32.111

    status:
      items:
      - egressIP: 10.0.32.111
        node: ovn-qgwkn-worker-canadacentral2-rhswn
      - egressIP: 10.0.32.112
        node: ovn-qgwkn-worker-canadacentral1-bskbf

--------------------------------------------------------------------
Starting Config

    ~/Documents/ocp4/ovn_egressip » oc get egressIP
    NAME            EGRESSIPS     ASSIGNED NODE                           ASSIGNED EGRESSIPS
    egressip-test   10.0.32.111   ovn-qgwkn-worker-canadacentral1-bskbf   10.0.32.111

Node #1: ovn-qgwkn-worker-canadacentral1-bskbf = 10.0.32.111
Node #2: ovn-qgwkn-worker-canadacentral3-k4qk5 = 10.0.32.112

egressIP yaml:

    apiVersion: k8s.ovn.org/v1
    kind: EgressIP
    metadata:
      name: egressip-test
    spec:
      egressIPs:
      - 10.0.32.111
      - 10.0.32.112
      namespaceSelector:
        matchLabels:
          name: example-egressip1
    status:
      items:
      - egressIP: 10.0.32.111
        node: ovn-qgwkn-worker-canadacentral1-bskbf
      - egressIP: 10.0.32.112
        node: ovn-qgwkn-worker-canadacentral3-k4qk5

Expectation (manual mode)
#1 10.0.32.111 should not be reassigned (untouched) and pod traffic should start exiting with this IP.
#2 10.0.32.112 should be inactive until the node comes back, or reassigned after +- 5 minutes of inactivity?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
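The rebalancing rule described earlier in this thread (compare an "incremental" allocation with a "from scratch" allocation, and flag any node whose from-scratch share is more than twice its incremental share) can be sketched as below. This follows the wording of that comment only; it is not the openshift-sdn `ReallocateEgressIPs()` code. The helper names `allocate` and `underallocated_nodes`, the least-loaded placement rule, and the node names and IPs are assumptions for illustration.

```python
from collections import Counter


def allocate(egress_ips, nodes, existing=None):
    """Assign each egress IP to the currently least-loaded node.

    existing maps IP -> node for assignments to keep (the "incremental"
    case); pass None to allocate "from scratch".
    """
    allocation = {ip: node for ip, node in (existing or {}).items()
                  if ip in egress_ips and node in nodes}
    load = Counter({node: 0 for node in nodes})
    load.update(allocation.values())
    for ip in egress_ips:
        if ip not in allocation:
            # Assumed placement rule: fewest IPs first, ties by node name.
            target = min(nodes, key=lambda n: (load[n], n))
            allocation[ip] = target
            load[target] += 1
    return allocation


def underallocated_nodes(incremental, from_scratch, nodes):
    """Nodes with more than twice as many IPs from scratch as incrementally."""
    inc = Counter(incremental.values())
    full = Counter(from_scratch.values())
    return sorted(n for n in nodes if full[n] > 2 * inc[n])


# Hypothetical state after an upgrade: node-c rebooted last and holds no IPs.
nodes = ["node-a", "node-b", "node-c"]
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
existing = {"10.0.0.1": "node-a", "10.0.0.2": "node-a",
            "10.0.0.3": "node-b", "10.0.0.4": "node-b"}

incremental = allocate(ips, nodes, existing)   # keeps what is already assigned
from_scratch = allocate(ips, nodes)            # ignores current assignments
print(underallocated_nodes(incremental, from_scratch, nodes))  # -> ['node-c']
```

A non-empty result signals that some IPs should be proactively moved to the listed nodes; which IPs to move is left out of the sketch.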