Description of problem:

Egress IPs managed by OVN-Kubernetes on OCP 4.8.29. When an egress IP (EIP) shifts to a new node host, duplicate/stale entries remain in the NBDB NAT table. The stale entries cause ARP-handling failures and prevent return packets from outbound connections, breaking traffic flow. Purging the stale entries alleviates the issue temporarily. Fully purging the database and resetting the OVN master/node pods alleviates the issue for slightly longer, but also only temporarily.

A patch update was applied to 4.8.29 to address this, per bugs 2059354 and 2056050. However, the issue returned even after a full restore of the databases following these two KCS articles:
- https://access.redhat.com/solutions/6664731 [see Red Hat internal commentary for steps to selectively purge EIP entries from the nat table]
- https://access.redhat.com/solutions/5118061 [see steps to wipe/reset the OVN database - note that this also required re-rolling the OVN node/worker pods as well as the master pods]

Version-Release number of selected component (if applicable):

OCP 4.8.29, vSphere

How reproducible:

Every time - multiple clusters impacted

Steps to Reproduce:
1. Update cluster to 4.8.29 (vSphere)
2. Create multiple EgressIP objects, configure namespace allocation, and label nodes
3. Allow the EIP to shift hosts via cordon/drain/reboot of the host node
4. Observe stale nat entries:

~~~
$ oc get eip egress-15255-bwa-ext-qa -oyaml | grep zqzzt -A4 -B1
  - egressIP: 10.197.177.50
    node: venus-rl4vp-worker-zqzzt
  - egressIP: 10.197.177.51
    node: venus-rl4vp-worker-ext-wb626

# ovn-nbctl --format=csv find nat external_ids:name=egress-15255-bwa-ext-qa | egrep -v "zqzzt|wb626"
_uuid,allowed_ext_ips,exempted_ext_ips,external_ids,external_ip,external_mac,external_port_range,logical_ip,logical_port,options,type
9b115000-d661-4d89-999c-0aceb13b68c6,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.50""",[],"""""","""10.150.70.8""",k8s-venus-rl4vp-worker-ext-r6g9g,{},snat
82c0c001-d990-4823-bddd-71b7c5246283,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.51""",[],"""""","""10.150.24.41""",k8s-venus-rl4vp-worker-ghdds,{},snat
efc55002-27b2-41be-91f2-efd567052292,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.50""",[],"""""","""10.150.48.8""",k8s-venus-rl4vp-worker-ext-r6g9g,{},snat
20875805-3a9f-42a4-84c9-4801c90d12e3,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.51""",[],"""""","""10.150.40.50""",k8s-venus-rl4vp-worker-ghdds,{},snat
.........

# ovn-nbctl --format=csv find nat external_ids:name=egress-15255-bwa-ext-qa | egrep -v "zqzzt|wb626" | wc -l
1115
~~~

Actual results:
- Egress IP fails to communicate with external services: ARP tables redirect return packets to the incorrect host node (the previous EIP host) rather than the node currently hosting the EIP, so packets are dropped on the return trip.
- OVN is not performing cleanup of stale entries as expected.

Expected results:
- OVN should clean up stale entries and track EIP handling more accurately; egress should perform as expected.

Additional info:
- This BZ was spun up to address an emergent issue that arose out of https://bugzilla.redhat.com/show_bug.cgi?id=2056050 after it was declared resolved - opening a new bug to provide a dedicated support space for the issue.
- That bug was also considered related to bug 2059354, which may address similar components but may not be directly related. Included here for context for back-end teams.

=============

Seeking assistance with confirming whether this issue is the same as the previously linked bugs (i.e., a resurgence/failed patch), or a NEW issue that happens to have the same ultimate impact. The customer has multiple clusters, and the linked case includes recent snapshots/uploads plus OVN dataflows of this issue. A pending go-live hinges on this being resolved, as EIP is integral to the service-delivery chain.
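The selective-purge approach referenced in the first KCS article can be sketched roughly as below. This is a hypothetical illustration, not the internal KCS procedure: it assumes the current EIP hosts are zqzzt and wb626 (per the `oc get eip` output above), treats any NAT row whose logical_port does not reference a current host as stale, and uses a two-row CSV sample (one stale row copied from the output above, plus one invented current-host row) in place of live `ovn-nbctl` output.

```shell
#!/bin/sh
# Hosts currently listed in the EgressIP status (assumption taken from the
# `oc get eip` output above).
CURRENT_HOSTS="zqzzt|wb626"

# On a live cluster this CSV would come from (run inside the ovnkube-master pod):
#   ovn-nbctl --format=csv find nat external_ids:name=egress-15255-bwa-ext-qa
# Here a two-row sample stands in: the first row is stale (logical_port points
# at a node that no longer hosts the EIP), the second row is an invented
# current-host row for contrast.
nat_csv='_uuid,allowed_ext_ips,exempted_ext_ips,external_ids,external_ip,external_mac,external_port_range,logical_ip,logical_port,options,type
9b115000-d661-4d89-999c-0aceb13b68c6,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.50""",[],"""""","""10.150.70.8""",k8s-venus-rl4vp-worker-ext-r6g9g,{},snat
11112222-3333-4444-5555-666677778888,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.50""",[],"""""","""10.150.70.9""",k8s-venus-rl4vp-worker-zqzzt,{},snat'

# Emit (but do not execute) one `ovn-nbctl destroy nat <uuid>` per stale row,
# so the list can be reviewed before it is fed back to ovn-nbctl.
echo "$nat_csv" \
  | tail -n +2 \
  | grep -Ev "$CURRENT_HOSTS" \
  | cut -d, -f1 \
  | while read -r uuid; do
      echo "ovn-nbctl destroy nat $uuid"
    done
```

Printing the commands first, rather than piping them straight into the nbdb, keeps the purge reviewable and selective instead of a full database wipe.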
Verified with pre-merged image built with ovn-kubernetes#1009

~~~
$ oc get clusterversion
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2022-03-24-123436-ci-ln-966pd2b-latest   True        False         9m2s    Cluster version is 4.8.0-0.ci.test-2022-03-24-123436-ci-ln-966pd2b-latest

$ oc get node
NAME              STATUS   ROLES    AGE   VERSION
compute-0         Ready    worker   24m   v1.21.8+ee73ea2
compute-1         Ready    worker   28m   v1.21.8+ee73ea2
control-plane-0   Ready    master   37m   v1.21.8+ee73ea2
control-plane-1   Ready    master   37m   v1.21.8+ee73ea2
control-plane-2   Ready    master   37m   v1.21.8+ee73ea2

$ oc get node -owide
NAME              STATUS   ROLES    AGE   VERSION           INTERNAL-IP     EXTERNAL-IP     OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
compute-0         Ready    worker   51m   v1.21.8+ee73ea2   172.31.248.48   172.31.248.48   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
compute-1         Ready    worker   55m   v1.21.8+ee73ea2   172.31.248.51   172.31.248.51   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
control-plane-0   Ready    master   65m   v1.21.8+ee73ea2   172.31.248.40   172.31.248.40   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
control-plane-1   Ready    master   65m   v1.21.8+ee73ea2   172.31.248.50   172.31.248.50   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
control-plane-2   Ready    master   64m   v1.21.8+ee73ea2   172.31.248.49   172.31.248.49   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8

$ oc label node compute-0 "k8s.ovn.org/egress-assignable"=""
node/compute-0 labeled
$ oc label node compute-1 "k8s.ovn.org/egress-assignable"=""
node/compute-1 labeled

$ cat config_egressip1_ovn_ns_team_red.yaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip1
spec:
  egressIPs:
  - 172.31.248.101
  - 172.31.248.102
  - 172.31.248.103
  namespaceSelector:
    matchLabels:
      team: red

$ oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red.yaml
egressip.k8s.ovn.org/egressip1 created

$ oc get egressip -oyaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2022-03-24T14:36:08Z"
    generation: 2
    name: egressip1
    resourceVersion: "43763"
    uid: 41aa9fb4-0381-4fbe-99fb-1275540148ed
  spec:
    egressIPs:
    - 172.31.248.101
    - 172.31.248.102
    - 172.31.248.103
    namespaceSelector:
      matchLabels:
        team: red
    podSelector: {}
  status:
    items:
    - egressIP: 172.31.248.101
      node: compute-1
    - egressIP: 172.31.248.102
      node: compute-0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc new-project test
$ oc label ns test team=red
$ oc create -f ./SDN-1332-test/list_for_pods.json

$ oc get pod -owide
NAME            READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
test-rc-4jpbh   1/1     Running   0          97s   10.131.0.27   compute-1   <none>           <none>
test-rc-7rhcm   1/1     Running   0          97s   10.128.2.35   compute-0   <none>           <none>
test-rc-8cc2n   1/1     Running   0          97s   10.131.0.26   compute-1   <none>           <none>
test-rc-9cqds   1/1     Running   0          97s   10.128.2.34   compute-0   <none>           <none>
test-rc-m2fwv   1/1     Running   0          97s   10.128.2.36   compute-0   <none>           <none>
test-rc-nllv8   1/1     Running   0          97s   10.131.0.28   compute-1   <none>           <none>
test-rc-pcrpg   1/1     Running   0          97s   10.131.0.24   compute-1   <none>           <none>
test-rc-psfpw   1/1     Running   0          97s   10.128.2.32   compute-0   <none>           <none>
test-rc-qk4zl   1/1     Running   0          97s   10.128.2.33   compute-0   <none>           <none>
test-rc-sltzs   1/1     Running   0          97s   10.131.0.25   compute-1   <none>           <none>

$ oc rsh test-rc-4jpbh
~ $ while true; do curl 172.31.249.80:9095; sleep 2; echo ""; done;
172.31.248.101
172.31.248.102
172.31.248.101
172.31.248.101
172.31.248.101
172.31.248.102^C
~ $ exit
command terminated with exit code 130

$ oc rsh test-rc-7rhcm
~ $ while true; do curl 172.31.249.80:9095; sleep 2; echo ""; done;
172.31.248.101
172.31.248.101
172.31.248.102
172.31.248.101
172.31.248.101
172.31.248.102
172.31.248.101^C
~ $ exit
command terminated with exit code 130

$ oc get pod -n openshift-ovn-kubernetes
NAME                   READY   STATUS    RESTARTS   AGE
ovnkube-master-2bgj9   6/6     Running   6          75m
ovnkube-master-qqpt8   6/6     Running   6          75m
ovnkube-master-s5s5s   6/6     Running   0          75m
ovnkube-node-8jcnc     4/4     Running   0          75m
ovnkube-node-hgjrd     4/4     Running   0          75m
ovnkube-node-nfjqn     4/4     Running   0          75m
ovnkube-node-qzj8g     4/4     Running   0          66m
ovnkube-node-tks8s     4/4     Running   0          62m

$ oc get -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}' -n openshift-ovn-kubernetes cm ovn-kubernetes-master
{"holderIdentity":"control-plane-0","leaseDurationSeconds":60,"acquireTime":"2022-03-24T13:31:41Z","renewTime":"2022-03-24T14:41:38Z","leaderTransitions":0}

$ oc get pod -n openshift-ovn-kubernetes -l app=ovnkube-master --field-selector=spec.nodeName=control-plane-0 -o jsonpath={.items[*].metadata.name}
ovnkube-master-s5s5s

$ oc -n openshift-ovn-kubernetes rsh ovnkube-master-s5s5s
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4# ovn-nbctl lr-policy-list ovn_cluster_router | grep "100 "
100 ip4.src == 10.128.2.32 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.33 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.34 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.35 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.36 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.24 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.25 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.26 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.27 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.28 reroute 100.64.0.5, 100.64.0.6
sh-4.4#

$ oc debug node/jechen-0323d-4b5h8-compute-0
Starting pod/jechen-0323d-4b5h8-compute-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 172.31.248.18
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# reboot
Removing debug pod ...

$ oc -n openshift-ovn-kubernetes rsh ovnkube-master-s5s5s
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4# ovn-nbctl lr-policy-list ovn_cluster_router | grep "100 "
100 ip4.src == 10.128.2.32 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.33 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.34 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.35 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.128.2.36 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.24 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.25 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.26 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.27 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.28 reroute 100.64.0.5, 100.64.0.6
~~~

Verified that no internal IP is missing and no duplicate record is found.

~~~
sh-4.4# ovn-nbctl --format=csv find nat external_ids:name=egressip1 | egrep -v "compute-1|compute-0"
_uuid,allowed_ext_ips,exempted_ext_ips,external_ids,external_ip,external_mac,external_port_range,logical_ip,logical_port,options,type
~~~

No stale nat entry found.
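The "no missing internal IP or duplicate record" check above amounts to a count: every pod IP from `oc get pod -owide` should match exactly one priority-100 reroute policy. A minimal offline sketch of that check, using two rows copied from the lr-policy-list output above in place of live `ovn-nbctl lr-policy-list ovn_cluster_router` output (an assumption for illustration only):

```shell
#!/bin/sh
# Two priority-100 reroute rows copied from the lr-policy-list output above.
policies='100 ip4.src == 10.128.2.32 reroute 100.64.0.5, 100.64.0.6
100 ip4.src == 10.131.0.24 reroute 100.64.0.5, 100.64.0.6'

# Pod IPs taken from the `oc get pod -owide` output above.
for ip in 10.128.2.32 10.131.0.24; do
  # Exactly 1 match = present and not duplicated; 0 = missing; >1 = stale duplicates.
  count=$(echo "$policies" | grep -c "ip4\.src == $ip ")
  echo "$ip: $count"
done
```

On a live cluster the same loop over all pod IPs would flag both the missing-policy and duplicate-policy failure modes described in this bug.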
Verified: Tested
*** Bug 2059706 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.37 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1369