Bug 2059354 - [OVN]After reboot egress node, lr-policy-list was not correct, some duplicate records or missed internal IPs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.10.z
Assignee: ffernand
QA Contact: jechen
URL:
Whiteboard:
Depends On: 2063321
Blocks: 2059700
Reported: 2022-02-28 21:10 UTC by ffernand
Modified: 2022-04-08 05:04 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2056050
Clones: 2059700
Environment:
Last Closed: 2022-04-08 05:04:20 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 994 | 0 | None | open | Bug 2059354 [4.10]: TRIVIAL Enable info logging for successful assignment of egress IP | 2022-03-11 19:00:22 UTC
Red Hat Product Errata | RHSA-2022:1162 | 0 | None | None | None | 2022-04-08 05:04:39 UTC

Comment 4 jechen 2022-03-24 02:38:09 UTC
Verified with a pre-merge image built from ovn-kubernetes#994.

$ oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2022-03-23-211848-ci-ln-14dlrf2-latest   True        False         3h22m   Cluster version is 4.10.0-0.ci.test-2022-03-23-211848-ci-ln-14dlrf2-latest

$ oc get node
NAME                                 STATUS   ROLES    AGE     VERSION
jechen-0323d-4b5h8-compute-0         Ready    worker   3h31m   v1.23.5+b0357ed
jechen-0323d-4b5h8-compute-1         Ready    worker   3h32m   v1.23.5+b0357ed
jechen-0323d-4b5h8-control-plane-0   Ready    master   3h43m   v1.23.5+b0357ed
jechen-0323d-4b5h8-control-plane-1   Ready    master   3h43m   v1.23.5+b0357ed
jechen-0323d-4b5h8-control-plane-2   Ready    master   3h43m   v1.23.5+b0357ed


$ oc label node jechen-0323d-4b5h8-compute-0 "k8s.ovn.org/egress-assignable"=""

$ oc label node jechen-0323d-4b5h8-compute-1 "k8s.ovn.org/egress-assignable"=""

$ cat config_egressip1_ovn_ns_team_red.yaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip1
spec:
  egressIPs:
  - 172.31.248.101
  - 172.31.248.102
  - 172.31.248.103
  namespaceSelector:
    matchLabels:
      team: red 


$  oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red.yaml
egressip.k8s.ovn.org/egressip1 created


$  oc get egressip -oyaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2022-03-24T02:07:51Z"
    generation: 2
    name: egressip1
    resourceVersion: "101084"
    uid: 523aad30-bdc4-49d4-b02e-9132c41b65b3
  spec:
    egressIPs:
    - 172.31.248.101
    - 172.31.248.102
    - 172.31.248.103
    namespaceSelector:
      matchLabels:
        team: red
  status:
    items:
    - egressIP: 172.31.248.103
      node: jechen-0323d-4b5h8-compute-1
    - egressIP: 172.31.248.101
      node: jechen-0323d-4b5h8-compute-0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


$ oc new-project test
$ oc label ns test team=red
$ oc create -f ./SDN-1332-test/list_for_pods.json


$ oc get pod -owide
NAME            READY   STATUS    RESTARTS   AGE    IP            NODE                           NOMINATED NODE   READINESS GATES
test-rc-24swt   1/1     Running   1          148m   10.128.2.20   jechen-0323d-4b5h8-compute-0   <none>           <none>
test-rc-4gj9l   1/1     Running   1          148m   10.128.2.17   jechen-0323d-4b5h8-compute-0   <none>           <none>
test-rc-66h4z   1/1     Running   1          148m   10.128.2.18   jechen-0323d-4b5h8-compute-0   <none>           <none>
test-rc-8dddf   1/1     Running   0          148m   10.131.0.22   jechen-0323d-4b5h8-compute-1   <none>           <none>
test-rc-kvnvq   1/1     Running   0          148m   10.131.0.19   jechen-0323d-4b5h8-compute-1   <none>           <none>
test-rc-mjt29   1/1     Running   0          148m   10.131.0.20   jechen-0323d-4b5h8-compute-1   <none>           <none>
test-rc-n25zg   1/1     Running   1          148m   10.128.2.19   jechen-0323d-4b5h8-compute-0   <none>           <none>
test-rc-pbp2w   1/1     Running   0          148m   10.131.0.18   jechen-0323d-4b5h8-compute-1   <none>           <none>
test-rc-q87fs   1/1     Running   1          148m   10.128.2.21   jechen-0323d-4b5h8-compute-0   <none>           <none>
test-rc-rl5td   1/1     Running   0          148m   10.131.0.21   jechen-0323d-4b5h8-compute-1   <none>           <none>

$ oc rsh test-rc-24swt
~ $  while true; do curl 172.31.249.80:9095;sleep 2; echo ""; done;
172.31.248.101
172.31.248.101
172.31.248.101
172.31.248.101
172.31.248.101
172.31.248.101
172.31.248.101


$ oc  -n openshift-ovn-kubernetes get pod
NAME                   READY   STATUS    RESTARTS        AGE
ovnkube-master-8qgqf   6/6     Running   6 (3h49m ago)   3h50m
ovnkube-master-mdwvq   6/6     Running   6 (3h49m ago)   3h50m
ovnkube-master-nmnbg   6/6     Running   0               3h50m
ovnkube-node-6dkjr     5/5     Running   11              3h38m
ovnkube-node-fktkn     5/5     Running   0               3h39m
ovnkube-node-gnd8v     5/5     Running   0               3h50m
ovnkube-node-r298d     5/5     Running   0               3h50m
ovnkube-node-vzh2l     5/5     Running   0               3h50m


$  oc -n openshift-ovn-kubernetes rsh ovnkube-master-nmnbg
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4#  ovn-nbctl lr-policy-list ovn_cluster_router  | grep "100 "
       100                             ip4.src == 10.128.2.17         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.18         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.19         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.20         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.21         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.18         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.19         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.20         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.21         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.22         reroute                100.64.0.5, 100.64.0.6
sh-4.4# 


$ oc debug node/jechen-0323d-4b5h8-compute-0
Starting pod/jechen-0323d-4b5h8-compute-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 172.31.248.18
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# reboot

Removing debug pod ...


$  oc -n openshift-ovn-kubernetes rsh ovnkube-master-nmnbg
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4#  ovn-nbctl lr-policy-list ovn_cluster_router  | grep "100 "
       100                             ip4.src == 10.128.2.17         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.18         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.19         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.20         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.21         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.18         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.19         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.20         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.21         reroute                100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.131.0.22         reroute                100.64.0.5, 100.64.0.6

Verified: after the egress node reboot, no internal IP is missing from the lr-policy-list and no duplicate record was found.
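
The comparison between the pod IPs from `oc get pod -owide` and the priority-100 rows of `ovn-nbctl lr-policy-list` can also be scripted rather than done by eye. What follows is only a minimal sketch, assuming the row layout shown in the transcript above; the function name `check_reroute_policies` and the sample input are hypothetical and not part of ovn-kubernetes or of this verification.

```python
import re
from collections import Counter

# Matches a priority-100 reroute row in the layout shown in the transcript:
#   100    ip4.src == <pod IP>    reroute    <next hops>
# The layout is an assumption taken from this bug report's output.
POLICY_ROW = re.compile(r"^\s*100\s+ip4\.src == (\S+)\s+reroute\b")


def check_reroute_policies(policy_output, pod_ips):
    """Return (missing, duplicated) lists of pod IPs.

    missing:    pod IPs that have no priority-100 reroute row.
    duplicated: source IPs that appear in more than one row.
    """
    seen = Counter()
    for line in policy_output.splitlines():
        match = POLICY_ROW.match(line)
        if match:
            seen[match.group(1)] += 1
    missing = sorted(ip for ip in pod_ips if seen[ip] == 0)
    duplicated = sorted(ip for ip, count in seen.items() if count > 1)
    return missing, duplicated


if __name__ == "__main__":
    # Hypothetical sample with one deliberate duplicate and one missing IP.
    # This is NOT output captured from the cluster in this report.
    sample = """\
       100      ip4.src == 10.128.2.17      reroute      100.64.0.5, 100.64.0.6
       100      ip4.src == 10.128.2.18      reroute      100.64.0.5, 100.64.0.6
       100      ip4.src == 10.128.2.18      reroute      100.64.0.5, 100.64.0.6
"""
    pods = ["10.128.2.17", "10.128.2.18", "10.128.2.19"]
    missing, duplicated = check_reroute_policies(sample, pods)
    print("missing:", missing)
    print("duplicated:", duplicated)
```

Run against the real post-reboot output and the ten pod IPs listed earlier, both lists should come back empty if the policy list is in the state reported here.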

Comment 5 jechen 2022-03-24 02:43:14 UTC
Verified: Tested

Comment 9 errata-xmlrpc 2022-04-08 05:04:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.10.8 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1162

