Bug 1753216
Summary: Duplicated IPs on InfraNodes using automatic egress IP
Product: OpenShift Container Platform
Component: Networking
Networking sub component: openshift-sdn
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Version: 3.11.0
Target Milestone: ---
Target Release: 4.3.0
Hardware: x86_64
OS: Linux
Reporter: Bruno Lima <blima>
Assignee: Dan Winship <danw>
QA Contact: zhaozhanqi <zzhao>
CC: aghadge, aivaras.laimikis, bbennett, danw, huirwang, jnordell, natanfranghieru, palonsor, piqin, swasthan
Doc Type: Bug Fix
Doc Text:
Cause: When the SDN pod was restarted on a node, it did not clean up any old Egress IPs.
Consequence: If the set of Egress IPs assigned to a node changed while the SDN pod was not running (e.g., because multiple services on the node were restarted at the same time), the node might continue to claim that it owned an Egress IP even after the IP had been assigned to another node, causing traffic to that IP to be delivered to the wrong node and lost.
Fix: The SDN pod now cleans up stale Egress IPs at startup.
Result: Nodes should no longer fight over ownership of Egress IPs.
Story Points: ---
Clones: 1762235, 1772904, 1772905 (view as bug list)
Last Closed: 2020-01-23 11:06:16 UTC
Type: Bug
Regression: ---
Bug Blocks: 1762235, 1772904, 1772905
Description
Bruno Lima
2019-09-18 12:30:38 UTC
I have the same problem, here with more details.

We have 3 infra nodes, with the following egress CIDRs:

NAME     HOST     HOST IP        SUBNET          EGRESS CIDRS      EGRESS IPS
infra01  infra01  192.168.1.247  10.144.16.0/23  [192.168.1.0/24]  [192.168.1.19, 192.168.1.25, 192.168.1.22, 192.168.1.12, 192.168.1.14, 192.168.1.24, 192.168.1.21]
infra02  infra02  192.168.1.246  10.144.14.0/23  [192.168.1.0/24]  [192.168.1.15, 192.168.1.29, 192.168.1.11, 192.168.1.30, 192.168.1.27, 192.168.1.17]
infra03  infra03  192.168.1.245  10.144.10.0/23  [192.168.1.0/24]  [192.168.1.28, 192.168.1.13, 192.168.1.31, 192.168.1.16, 192.168.1.20, 192.168.1.18]

We are assigning one egress IP per project:

NAME       NETID    EGRESS IPS
project01  omitted  [192.168.1.17]
project02  omitted  [192.168.1.28]
project03  omitted  [192.168.1.20]
project04  omitted  [192.168.1.29]
project05  omitted  [192.168.1.27]
project06  omitted  [192.168.1.16]
project07  omitted  [192.168.1.25]
project08  omitted  [192.168.1.21]
project09  omitted  [192.168.1.14]
project10  omitted  [192.168.1.13]
project11  omitted  [192.168.1.31]
project12  omitted  [192.168.1.19]
project13  omitted  [192.168.1.15]
project14  omitted  [192.168.1.18]
project15  omitted  [192.168.1.30]
project16  omitted  [192.168.1.12]
project17  omitted  [192.168.1.11]
project18  omitted  [192.168.1.24]
project19  omitted  [192.168.1.22]

On the infra nodes, the egress IPs show up on the interface as secondary addresses:

infra01:
    inet 192.168.1.14/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.12/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.25/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.19/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.24/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.22/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.21/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever

infra02:
    inet 192.168.1.15/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.27/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.11/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.30/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.29/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.17/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever

infra03:
    inet 192.168.1.13/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.28/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.20/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.31/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.16/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.18/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever

Some events (we don't know exactly which ones, but a docker restart is one of them) trigger the egress IP engine to move IPs between nodes. For example, when the docker service had a problem on infra03, the egress IPs that were on infra03 were migrated to infra01 and infra02:

NAME     HOST     HOST IP        SUBNET          EGRESS CIDRS      EGRESS IPS
infra01  infra01  192.168.1.247  10.144.16.0/23  [192.168.1.0/24]  [192.168.1.19, 192.168.1.25, 192.168.1.22, 192.168.1.12, 192.168.1.14, 192.168.1.24, 192.168.1.21, 192.168.1.28, 192.168.1.31, 192.168.1.20]
infra02  infra02  192.168.1.246  10.144.14.0/23  [192.168.1.0/24]  [192.168.1.15, 192.168.1.29, 192.168.1.11, 192.168.1.30, 192.168.1.27, 192.168.1.17, 192.168.1.13, 192.168.1.16, 192.168.1.18]
infra03  infra03  192.168.1.245  10.144.10.0/23  [192.168.1.0/24]  []

The IPs are allocated on the interfaces too:

infra01:
    inet 192.168.1.14/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.12/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.25/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.19/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.24/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.22/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.21/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.28/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.20/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.31/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever

infra02:
    inet 192.168.1.15/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.27/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.11/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.30/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.29/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.17/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.13/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.16/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.18/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever

And here is the bug: the infra03 IPs aren't removed from its interface, causing duplicate IPs on the network. This causes outages, because traffic sometimes goes to the right node and sometimes to the wrong one.

infra03:
    inet 192.168.1.13/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.28/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.20/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.31/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.16/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever
    inet 192.168.1.18/24 brd 192.168.1.255 scope global secondary mgmt valid_lft forever preferred_lft forever

The expected behavior is that all the secondary IPs that were migrated away should be deleted from the interface.

OpenShift Master: v3.11.117
Kubernetes Master: v1.11.0+d4cacc0
OpenShift Web Console: v3.11.117

One more missing detail: this bug happens in 3.11.141 too, as we tested in our laboratory.

Hm... so I guess the problem is that the docker restart on infra03 also causes openshift-sdn to restart, and when it comes back up, the egress IPs have already been removed from the HostSubnet object, so it doesn't explicitly get told to remove them, and it doesn't actually even know that they are former egress IPs.

It shouldn't be hard to fix, but there's no easy workaround until we do. They'll need to just ensure that the old egress IPs get cleaned up manually when this happens.
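The manual cleanup suggested above amounts to diffing the secondary addresses actually present on a node's interface against the egress IPs its HostSubnet currently lists, then deleting the leftovers. A minimal sketch of that diff, assuming the `mgmt` interface and /24 prefix from this report; these helpers are illustrative, not OpenShift tooling:

```python
#!/usr/bin/env python3
"""Compute `ip addr del` commands for stale egress IPs on a node.

Feed it the text of `ip -4 addr show dev mgmt` from the node plus the
egress IPs the node's HostSubnet currently lists; it prints the delete
commands for the leftovers so an admin can review and run them.
"""

def secondary_ips(ip_addr_output):
    """Extract addresses marked 'secondary' from `ip addr` output."""
    ips = set()
    for line in ip_addr_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "inet" and "secondary" in parts:
            ips.add(parts[1].split("/")[0])
    return ips

def cleanup_commands(ip_addr_output, egress_ips, dev="mgmt", prefix=24):
    """Return `ip addr del` commands for secondary addresses that are
    no longer in the node's egress IP list."""
    stale = secondary_ips(ip_addr_output) - set(egress_ips)
    return ["ip addr del %s/%d dev %s" % (ip, prefix, dev)
            for ip in sorted(stale)]

# Example: infra03's HostSubnet lists no egress IPs after the migration,
# so every secondary address still on its interface is stale and should
# be deleted.
```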
QE: to reproduce/test:

- Create a namespace+node with an egress IP, as in other tests. (Use a manually-assigned egress IP; it will be simpler.) Confirm that it works.
- On the node with the egress IP, kill the sdn pod and prevent it from being restarted. Not sure what the best way to do that is...
- Remove the egress IP from the HostSubnet it's on and add it to a different HostSubnet. Confirm that "ip addr" on both nodes shows the egress IP.
- Allow a new sdn pod to be started on the old egress node.
- Expected behavior: "ip addr" on the old egress node shows that the egress IP has now been removed. (Current/buggy behavior: the egress IP still exists after restart.)

(In reply to Dan Winship from comment #3)
> Hm... so I guess the problem is that the docker restart on infra03 also
> causes openshift-sdn to restart, and when it comes back up, the egress IPs
> have already been removed from the HostSubnet object, so it doesn't
> explicitly get told to remove them, and it doesn't actually even know that
> they are former egress IPs.
>
> It shouldn't be hard to fix, but there's no easy workaround until we do.
> They'll need to just ensure that the old egress IPs get cleaned up manually
> when this happens.

Thanks! We've already made a Python script that runs every 5 minutes and compares each infra node's HostSubnet against its "ip addr" output; if there are more IP addresses on the interface than egress IPs, the script deletes the leftover IPs from the corresponding infra node. It doesn't prevent the error, but at least we suffer for a maximum of 5 minutes. It doesn't happen often, but when it does the impact is big, because the pods can't access anything outside the project.

(In reply to Dan Winship from comment #4)
> QE: to reproduce/test:

Additional test:

- If you restart openshift-sdn *without* reassigning the egress IPs, it doesn't remove them on startup. (Check the logs to make sure of this; if it removes the IP but then adds it back, that counts as failure. It shouldn't remove it in the first place.)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
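The QE expectation above can be checked mechanically: after a new sdn pod starts on the old egress node, "ip addr" there must no longer show the egress IP, while the new node must. A sketch of that check; the node names, egress IP, `mgmt` interface, and the ssh invocation are placeholders based on this report, not part of any OpenShift test suite:

```python
#!/usr/bin/env python3
"""Verify that an egress IP moved cleanly from one node to another."""
import subprocess

def has_address(ip_addr_output, ip):
    """True if `ip` appears as an inet address in `ip addr` output."""
    for line in ip_addr_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "inet" and parts[1].split("/")[0] == ip:
            return True
    return False

def check_migration(old_node, new_node, egress_ip, dev="mgmt"):
    """Run after allowing a new sdn pod to start on the old egress node.

    Fetches the interface state from both nodes over ssh (an assumed
    access method) and asserts the egress IP lives only on the new node.
    """
    def show(node):
        return subprocess.check_output(
            ["ssh", node, "ip", "-4", "addr", "show", "dev", dev], text=True)
    assert not has_address(show(old_node), egress_ip), \
        "stale egress IP still present on old node (buggy behavior)"
    assert has_address(show(new_node), egress_ip), \
        "egress IP missing from new node"

# Usage against a live cluster (hypothetical values from this report):
# check_migration("infra03", "infra01", "192.168.1.20")
```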