Bug 2034513 - [OVN] After update one EgressIP in EgressIP object, one internal IP lost from lr-policy-list
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Ben Bennett
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-21 07:53 UTC by huirwang
Modified: 2022-03-10 16:36 UTC (History)
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:35:46 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github openshift ovn-kubernetes pull 917 0 None Merged Bug 2039099: EgressIP fixes for 4.10 2022-01-27 13:30:33 UTC
Github openshift ovn-kubernetes pull 923 0 None Merged Bug 2044303: Fix update of CloudPrivateIPConfig 2022-01-27 13:30:30 UTC
Github ovn-org ovn-kubernetes pull 2734 0 None Merged EgressIP: miscellaneous fixes 2022-01-24 14:46:22 UTC
Github ovn-org ovn-kubernetes pull 2766 0 None Merged EgressIP: Fix update of CloudPrivateIPConfig 2022-01-25 21:56:49 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:36:04 UTC

Description huirwang 2021-12-21 07:53:21 UTC
Description of problem:
Found this issue in an OVN vSphere environment. It should be distinct from https://bugzilla.redhat.com/show_bug.cgi?id=2034097, even though both involve updating the EgressIP object; I could not reproduce bug 2034097 in the vSphere environment, so I am opening this bug to track the different incorrect behavior.

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2021-12-18-034942 

How reproducible:
Always

Steps to Reproduce:
1. Tag 3 nodes as egress nodes
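In OVN-Kubernetes, a node is tagged as an egress node by applying the k8s.ovn.org/egress-assignable label. A minimal sketch of step 1, using the node names from the EgressIP status below; the commands are printed rather than executed, since they require a live cluster:

```shell
# Node names taken from the EgressIP status in this report.
nodes="control-plane-0 compute-1 compute-0"

# k8s.ovn.org/egress-assignable is the label OVN-Kubernetes watches
# to decide which nodes may host egress IPs.
cmds=$(for node in $nodes; do
  echo "oc label node $node k8s.ovn.org/egress-assignable=\"\""
done)
echo "$cmds"
```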

2. Create an EgressIP object:
oc get egressip -o yaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2021-12-21T07:21:18Z"
    generation: 2
    name: egressip-example6
    resourceVersion: "108167"
    uid: 9de73236-c1b2-4949-86cb-1358cebd2100
  spec:
    egressIPs:
    - 172.31.249.79
    - 172.31.249.246
    - 172.31.249.133
    namespaceSelector:
      matchLabels:
        team: red
    podSelector: {}
  status:
    items:
    - egressIP: 172.31.249.133
      node: control-plane-0
    - egressIP: 172.31.249.246
      node: compute-1
    - egressIP: 172.31.249.79
      node: compute-0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

3. Create namespace test and a pod in it, then add the label team=red to the namespace.
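Step 3 can be sketched as follows; the commands are printed rather than executed, and the pod name and image are placeholders, not taken from this report:

```shell
ns=test
cmds=$(cat <<EOF
oc create namespace $ns
oc label namespace $ns team=red
oc run test-pod -n $ns --image=registry.example.com/tools/net-tools
EOF
)
echo "$cmds"
```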

4. Check lr-policy-list
oc rsh -n openshift-ovn-kubernetes ovnkube-master-ct9qv 
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4# ovn-nbctl lr-policy-list ovn_cluster_router  | grep "100 " 
       100                             ip4.src == 10.128.2.22         reroute                100.64.0.2, 100.64.0.5, 100.64.0.6
       100                             ip4.src == 10.128.2.23         reroute                100.64.0.2, 100.64.0.5, 100.64.0.6
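The reroute nexthops in step 4 are the nodes' internal join-switch addresses (100.64.0.x), one per assigned egress node, so with three assigned egress IPs each matched pod source should list three of them. A small sketch that parses output shaped like the above and maps each source to its nexthops:

```python
import re

# Condensed copy of the healthy lr-policy-list output from step 4.
sample = """\
100    ip4.src == 10.128.2.22    reroute    100.64.0.2, 100.64.0.5, 100.64.0.6
100    ip4.src == 10.128.2.23    reroute    100.64.0.2, 100.64.0.5, 100.64.0.6
"""

def nexthops_per_source(policy_list: str) -> dict:
    """Map each ip4.src match to its list of reroute nexthop IPs."""
    result = {}
    for line in policy_list.splitlines():
        m = re.search(r"ip4\.src == (\S+)\s+reroute\s+(.*)", line)
        if m:
            result[m.group(1)] = [ip.strip() for ip in m.group(2).split(",")]
    return result

hops = nexthops_per_source(sample)
# Healthy state: every matched pod reroutes to all three egress nodes.
assert all(len(v) == 3 for v in hops.values())
```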

5. In the EgressIP object, update 172.31.249.79 to a new IP, here 172.31.249.157:
oc get egressip -o yaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2021-12-21T07:21:18Z"
    generation: 4
    name: egressip-example6
    resourceVersion: "108823"
    uid: 9de73236-c1b2-4949-86cb-1358cebd2100
  spec:
    egressIPs:
    - 172.31.249.157
    - 172.31.249.246
    - 172.31.249.133
    namespaceSelector:
      matchLabels:
        team: red
    podSelector: {}
  status:
    items:
    - egressIP: 172.31.249.133
      node: control-plane-0
    - egressIP: 172.31.249.246
      node: compute-1
    - egressIP: 172.31.249.157
      node: compute-0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
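One way to perform the update in step 5 is a JSON patch replacing the first entry of spec.egressIPs; the exact edit method is not stated in the report, so this is only one possibility, and the command is printed rather than executed:

```shell
# Replace the first egress IP (172.31.249.79) with 172.31.249.157.
patch='[{"op": "replace", "path": "/spec/egressIPs/0", "value": "172.31.249.157"}]'
cmd="oc patch egressip egressip-example6 --type=json -p='$patch'"
echo "$cmd"
```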

6. Check lr-policy-list again.

Actual results:
The lr-policy-list entries now contain only two internal (nexthop) IPs instead of three:
$ oc rsh -n openshift-ovn-kubernetes ovnkube-master-ct9qv 
Defaulted container "northd" out of: northd, nbdb, kube-rbac-proxy, sbdb, ovnkube-master, ovn-dbchecker
sh-4.4#  ovn-nbctl lr-policy-list ovn_cluster_router  | grep "100 " 
       100                             ip4.src == 10.128.2.22         reroute                100.64.0.2, 100.64.0.5
       100                             ip4.src == 10.128.2.23         reroute                100.64.0.2, 100.64.0.5

And when curling an external host from a matched pod, the load balancing uses only the two remaining egress IPs; only the new EgressIP (172.31.249.157) is missing.

 oc rsh -n test test-rc-8p58n
~ $  while true; do curl 172.31.249.80:9095 --connect-timeout 2 ; echo "";sleep 2; done
172.31.249.133
172.31.249.133
172.31.249.133
172.31.249.246
172.31.249.246
172.31.249.133
172.31.249.246
172.31.249.246
172.31.249.246
172.31.249.133
172.31.249.246
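Tallying the curl responses above makes the failure concrete: only two of the three egress IPs ever appear, and the replacement IP is the one missing. A sketch:

```python
from collections import Counter

# Source IPs observed by the external host in the curl loop above.
responses = [
    "172.31.249.133", "172.31.249.133", "172.31.249.133",
    "172.31.249.246", "172.31.249.246", "172.31.249.133",
    "172.31.249.246", "172.31.249.246", "172.31.249.246",
    "172.31.249.133", "172.31.249.246",
]

observed = Counter(responses)
expected = {"172.31.249.133", "172.31.249.246", "172.31.249.157"}

missing = expected - set(observed)
# The newly assigned egress IP never serves traffic.
assert missing == {"172.31.249.157"}
```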


Expected results:
The updated EgressIP should take effect: the lr-policy-list should again contain three nexthop IPs, and egress traffic should be balanced across all three egress IPs, including 172.31.249.157.

Additional info:

Comment 11 errata-xmlrpc 2022-03-10 16:35:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

