Bug 2062842 - OVN-Kubernetes - Stale Egress IP entries remain in NBDB after eip moves to new host, breaking arp. Persists after DB wipe. - 4.8.29
Summary: OVN-Kubernetes - Stale Egress IP entries remain in NBDB after eip moves to new host, breaking arp. Persists after DB wipe.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 4.8.z
Assignee: ffernand
QA Contact: jechen
URL:
Whiteboard:
Duplicates: 2059706
Depends On: 2059700
Blocks: 2056050
 
Reported: 2022-03-10 16:48 UTC by Will Russell
Modified: 2023-09-15 01:22 UTC
CC: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-20 12:22:15 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift ovn-kubernetes pull 1009 0 None Merged Bug 2062842: [4.8z] After reboot egress node, lr-policy-list was not correct, some duplicate records or missed internal ... 2022-04-08 20:27:56 UTC
Red Hat Product Errata RHBA-2022:1369 0 None None None 2022-04-20 12:22:32 UTC

Description Will Russell 2022-03-10 16:48:48 UTC
Description of problem:
Egress IPs are managed by OVN-Kubernetes on OCP 4.8.29.
When an EIP shifts to a new host node, duplicate/stale entries remain in the NBDB NAT table.

Stale entries cause ARP handling failures and prevent return packets from outbound connections, breaking traffic flow.

Purging the stale entries alleviates the issue temporarily. Fully purging the database and resetting the OVN master/node pods alleviates the issue for slightly longer, but again only temporarily.

A patch update was applied to 4.8.29 to address this, per bugs:
2059354
2056050

However, the issue returned even after a full restore of the databases following these two KCS articles:
https://access.redhat.com/solutions/6664731 [see Red Hat internal commentary for steps to selectively purge EIP entries from the NAT table].

https://access.redhat.com/solutions/5118061 [see steps to wipe/reset the OVN database - note that this also required re-rolling out the OVN node/worker pods as well as the master pods]
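
For context, a hedged sketch of the kind of selective purge those steps describe - a minimal version, assuming the stale rows are SNAT entries attached to ovn_cluster_router and reusing the node-suffix filter from the reproduction below; the authoritative procedure is in the KCS internal commentary:

~~~
# Hypothetical purge sketch (run inside the ovnkube-master nbdb/northd container).
# "zqzzt|wb626" stands in for the nodes currently listed in the EgressIP status.
for uuid in $(ovn-nbctl --no-headings --format=csv --columns=_uuid,logical_port \
        find nat external_ids:name=egress-15255-bwa-ext-qa \
        | egrep -v "zqzzt|wb626" | cut -d, -f1); do
    # Detach the stale NAT row from the router; OVSDB garbage-collects it.
    ovn-nbctl remove logical_router ovn_cluster_router nat "$uuid"
done
~~~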

Version-Release number of selected component (if applicable):

OCP 4.8.29 on vSphere

How reproducible:
Every time
- multiple clusters impacted

Steps to Reproduce:
1. Update cluster to 4.8.29 (vSphere)
2. Create multiple EgressIP objects, configure namespace allocation, and label nodes
3. Allow the EIP to shift hosts via cordon/drain/reboot of the host node
4. Observe stale NAT entries:

~~~
$ oc get eip egress-15255-bwa-ext-qa -oyaml |grep zqzzt -A4 -B1
  - egressIP: 10.197.177.50
    node: venus-rl4vp-worker-zqzzt
  - egressIP: 10.197.177.51
    node: venus-rl4vp-worker-ext-wb626

# ovn-nbctl --format=csv find nat external_ids:name=egress-15255-bwa-ext-qa |egrep -v "zqzzt|wb626"

_uuid,allowed_ext_ips,exempted_ext_ips,external_ids,external_ip,external_mac,external_port_range,logical_ip,logical_port,options,type
9b115000-d661-4d89-999c-0aceb13b68c6,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.50""",[],"""""","""10.150.70.8""",k8s-venus-rl4vp-worker-ext-r6g9g,{},snat
82c0c001-d990-4823-bddd-71b7c5246283,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.51""",[],"""""","""10.150.24.41""",k8s-venus-rl4vp-worker-ghdds,{},snat
efc55002-27b2-41be-91f2-efd567052292,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.50""",[],"""""","""10.150.48.8""",k8s-venus-rl4vp-worker-ext-r6g9g,{},snat
20875805-3a9f-42a4-84c9-4801c90d12e3,[],[],{name=egress-15255-bwa-ext-qa},"""10.197.177.51""",[],"""""","""10.150.40.50""",k8s-venus-rl4vp-worker-ghdds,{},snat
.........

# ovn-nbctl --format=csv find nat external_ids:name=egress-15255-bwa-ext-qa |egrep -v "zqzzt|wb626" |wc -l
1115
~~~
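
A hedged way to generalize step 4, deriving the healthy-node pattern from the EgressIP status instead of hardcoding the node suffixes (object name taken from this report; the ovn-nbctl call is assumed to run inside the ovnkube-master pod):

~~~
# Hypothetical stale-entry counter: any NAT row for this EgressIP whose
# logical_port does not match a currently assigned node is stale.
EIP=egress-15255-bwa-ext-qa
PATTERN=$(oc get eip "$EIP" -o jsonpath='{.status.items[*].node}' | tr ' ' '|')
ovn-nbctl --no-headings --format=csv find nat external_ids:name="$EIP" \
    | egrep -vc "$PATTERN"
~~~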

Actual results:
- Egress IP fails to communicate with external services: ARP tables direct the return packet to the incorrect host node (the previous EIP host), which no longer actually hosts the EIP, so packets are dropped on the return trip.
- OVN is not performing cleanup of stale entries as expected

Expected results:

- OVN should clean up stale entries and track EIP handling more accurately; egress should perform as expected

Additional info:

- This BZ was spun up to address an emergent issue that arose out of:
https://bugzilla.redhat.com/show_bug.cgi?id=2056050 after it was declared resolved - a new bug was opened to provide a dedicated support space for the issue.

- That bug was also considered related to this bug:
2059354 - which may involve similar components but may not be directly related. Included here as context for the back-end teams.


=============

Seeking assistance with confirming whether this issue is the same as the previously linked bugs (a resurgence/failed patch), or a NEW issue that appears to have the same ultimate impact.

The customer has multiple clusters, and the linked case includes recent snapshots/uploads plus OVN dataflows of this issue. There is a pending go-live that hinges on this being resolved, as EIP is integral to the service delivery chain.

Comment 10 jechen 2022-03-24 15:23:52 UTC
Verified with pre-merged image built with ovn-kubernetes#1009

$ oc get clusterversion
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2022-03-24-123436-ci-ln-966pd2b-latest   True        False         9m2s    Cluster version is 4.8.0-0.ci.test-2022-03-24-123436-ci-ln-966pd2b-latest

$ oc get node
NAME              STATUS   ROLES    AGE   VERSION
compute-0         Ready    worker   24m   v1.21.8+ee73ea2
compute-1         Ready    worker   28m   v1.21.8+ee73ea2
control-plane-0   Ready    master   37m   v1.21.8+ee73ea2
control-plane-1   Ready    master   37m   v1.21.8+ee73ea2
control-plane-2   Ready    master   37m   v1.21.8+ee73ea2

$ oc get node -owide
NAME              STATUS   ROLES    AGE   VERSION           INTERNAL-IP     EXTERNAL-IP     OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
compute-0         Ready    worker   51m   v1.21.8+ee73ea2   172.31.248.48   172.31.248.48   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
compute-1         Ready    worker   55m   v1.21.8+ee73ea2   172.31.248.51   172.31.248.51   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
control-plane-0   Ready    master   65m   v1.21.8+ee73ea2   172.31.248.40   172.31.248.40   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
control-plane-1   Ready    master   65m   v1.21.8+ee73ea2   172.31.248.50   172.31.248.50   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8
control-plane-2   Ready    master   64m   v1.21.8+ee73ea2   172.31.248.49   172.31.248.49   Red Hat Enterprise Linux CoreOS 48.84.202203221810-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.21.6-2.rhaos4.8.gitb948fcd.el8


$ oc label node compute-0 "k8s.ovn.org/egress-assignable"=""
node/compute-0 labeled

$ oc label node compute-1 "k8s.ovn.org/egress-assignable"=""
node/compute-1 labeled

$ cat config_egressip1_ovn_ns_team_red.yaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip1
spec:
  egressIPs:
  - 172.31.248.101
  - 172.31.248.102
  - 172.31.248.103
  namespaceSelector:
    matchLabels:
      team: red 


$  oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red.yaml
egressip.k8s.ovn.org/egressip1 created


$ oc get egressip -oyaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2022-03-24T14:36:08Z"
    generation: 2
    name: egressip1
    resourceVersion: "43763"
    uid: 41aa9fb4-0381-4fbe-99fb-1275540148ed
  spec:
    egressIPs:
    - 172.31.248.101
    - 172.31.248.102
    - 172.31.248.103
    namespaceSelector:
      matchLabels:
        team: red
    podSelector: {}
  status:
    items:
    - egressIP: 172.31.248.101
      node: compute-1
    - egressIP: 172.31.248.102
      node: compute-0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


$ oc new-project test
$ oc label ns test team=red
$ oc create -f ./SDN-1332-test/list_for_pods.json
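
(The contents of list_for_pods.json are not shown in the comment; judging by the test-rc-* pods it creates, a minimal hypothetical equivalent would be a 10-replica ReplicationController - the image name below is a placeholder:)

~~~
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
    {
      "apiVersion": "v1",
      "kind": "ReplicationController",
      "metadata": { "name": "test-rc" },
      "spec": {
        "replicas": 10,
        "template": {
          "metadata": { "labels": { "name": "test-pods" } },
          "spec": {
            "containers": [
              { "name": "test-pod", "image": "quay.io/openshifttest/hello-sdn" }
            ]
          }
        }
      }
    }
  ]
}
~~~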


$ oc get pod -owide
NAME            READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
test-rc-4jpbh   1/1     Running   0          97s   10.131.0.27   compute-1   <none>           <none>
test-rc-7rhcm   1/1     Running   0          97s   10.128.2.35   compute-0   <none>           <none>
test-rc-8cc2n   1/1     Running   0          97s   10.131.0.26   compute-1   <none>           <none>
test-rc-9cqds   1/1     Running   0          97s   10.128.2.34   compute-0   <none>           <none>
test-rc-m2fwv   1/1     Running   0          97s   10.128.2.36   compute-0   <none>           <none>
test-rc-nllv8   1/1     Running   0          97s   10.131.0.28   compute-1   <none>           <none>
test-rc-pcrpg   1/1     Running   0          97s   10.131.0.24   compute-1   <none>           <none>
test-rc-psfpw   1/1     Running   0          97s   10.128.2.32   compute-0   <none>           <none>
test-rc-qk4zl   1/1     Running   0          97s   10.128.2.33   compute-0   <none>           <none>
test-rc-sltzs   1/1     Running   0          97s   10.131.0.25   compute-1   <none>           <none>

$ oc rsh test-rc-4jpbh
~ $ while true; do curl 172.31.249.80:9095;sleep 2; echo ""; done;
172.31.248.101
172.31.248.102
172.31.248.101
172.31.248.101
172.31.248.101
172.31.248.102^C
~ $ exit
command terminated with exit code 130

$ oc rsh test-rc-7rhcm
~ $ while true; do curl 172.31.249.80:9095;sleep 2; echo ""; done;
172.31.248.101
172.31.248.101
172.31.248.102
172.31.248.101
172.31.248.101
172.31.248.102
172.31.248.101^C
~ $ exit
command terminated with exit code 130
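
(The curl target above is presumably a simple service on the external host that replies with the caller's source IP - that is how the egress IPs show up in the loop output. A hypothetical stand-in, assuming ncat is available on the external host:)

~~~
# Hypothetical source-IP echo service on 172.31.249.80:9095; ncat exports
# NCAT_REMOTE_ADDR to the per-connection command.
ncat -lk 9095 --sh-exec \
    'read -r _; printf "HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n%s" "$NCAT_REMOTE_ADDR"'
~~~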


$ oc get pod -n openshift-ovn-kubernetes 
NAME                   READY   STATUS    RESTARTS   AGE
ovnkube-master-2bgj9   6/6     Running   6          75m
ovnkube-master-qqpt8   6/6     Running   6          75m
ovnkube-master-s5s5s   6/6     Running   0          75m
ovnkube-node-8jcnc     4/4     Running   0          75m
ovnkube-node-hgjrd     4/4     Running   0          75m
ovnkube-node-nfjqn     4/4     Running   0          75m
ovnkube-node-qzj8g     4/4     Running   0          66m
ovnkube-node-tks8s     4/4     Running   0          62m


$ oc get -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'  -n openshift-ovn-kubernetes  cm ovn-kubernetes-master
{"holderIdentity":"control-plane-0","leaseDurationSeconds":60,"acquireTime":"2022-03-24T13:31:41Z","renewTime":"2022-03-24T14:41:38Z","leaderTransitions":0}

$ oc get pod -n openshift-ovn-kubernetes -l app=ovnkube-master --field-selector=spec.nodeName=control-plane-0 -o jsonpath={.items[*].metadata.name}
ovnkube-master-s5s5s
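
(A hedged one-liner combining the two lookups above - annotation path and label as in this transcript:)

~~~
LEADER=$(oc get cm ovn-kubernetes-master -n openshift-ovn-kubernetes \
    -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}' \
    | sed 's/.*"holderIdentity":"\([^"]*\)".*/\1/')
oc get pod -n openshift-ovn-kubernetes -l app=ovnkube-master \
    --field-selector=spec.nodeName="$LEADER" -o jsonpath='{.items[*].metadata.name}'
~~~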


$ oc -n openshift-ovn-kubernetes rsh ovnkube-master-s5s5s
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4# ovn-nbctl lr-policy-list ovn_cluster_router  | grep "100 "
       100                             ip4.src == 10.128.2.32         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.33         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.34         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.35         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.36         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.24         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.25         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.26         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.27         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.28         reroute                100.64.0.5, 100.64.0.6
sh-4.4# 


$ oc debug node/jechen-0323d-4b5h8-compute-0
Starting pod/jechen-0323d-4b5h8-compute-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 172.31.248.18
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# reboot

Removing debug pod ...


$ oc -n openshift-ovn-kubernetes rsh ovnkube-master-s5s5s
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4# ovn-nbctl lr-policy-list ovn_cluster_router  | grep "100 "
       100                             ip4.src == 10.128.2.32         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.33         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.34         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.35         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.36         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.24         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.25         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.26         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.27         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.28         reroute                100.64.0.5, 100.64.0.6

Verified that no internal IPs are missing and no duplicate records are found.
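
(A hedged sketch to automate that check - the namespace and leader pod name are taken from this transcript:)

~~~
# Compare pod IPs in the test namespace against the priority-100 reroute
# policies; any diff output means a missing or duplicate entry.
oc get pod -n test -o jsonpath='{.items[*].status.podIP}' \
    | tr ' ' '\n' | sort > /tmp/pod-ips
oc -n openshift-ovn-kubernetes exec ovnkube-master-s5s5s -c northd -- \
    ovn-nbctl lr-policy-list ovn_cluster_router \
    | awk '$1 == "100" {print $4}' | sort > /tmp/policy-ips
diff /tmp/pod-ips /tmp/policy-ips && echo "no missing or duplicate entries"
~~~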


sh-4.4#  ovn-nbctl --format=csv find nat external_ids:name=egressip1 | egrep -v "compute-1|compute-0"
_uuid,allowed_ext_ips,exempted_ext_ips,external_ids,external_ip,external_mac,external_port_range,logical_ip,logical_port,options,type

No stale NAT entries found.

Comment 11 jechen 2022-03-24 15:25:26 UTC
Verified: Tested

Comment 14 ffernand 2022-04-08 18:57:59 UTC
*** Bug 2059706 has been marked as a duplicate of this bug. ***

Comment 19 errata-xmlrpc 2022-04-20 12:22:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.37 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1369

Comment 20 Red Hat Bugzilla 2023-09-15 01:22:43 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

