Bug 2038596 - Auto egressIP for OVN cluster on GCP: After egressIP object is deleted, egressIP still takes effect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Ben Bennett
QA Contact: jechen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-08 19:16 UTC by jechen
Modified: 2022-03-10 16:38 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:37:59 UTC
Target Upstream Version:
Embargoed:




Links
System                   ID                                  Private   Priority   Status   Summary                                Last Updated
Github                   openshift ovn-kubernetes pull 917   0         None       open     Bug 2039099: EgressIP fixes for 4.10   2022-01-19 21:14:16 UTC
Red Hat Product Errata   RHSA-2022:0056                      0         None       None     None                                   2022-03-10 16:38:20 UTC

Description jechen 2022-01-08 19:16:19 UTC
Description of problem:
On a GCP cluster, after the egressIP object is deleted, the egressIP still takes effect.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-08-114825   True        False         46m     Cluster version is 4.10.0-0.nightly-2022-01-08-114825


How reproducible:


Steps to Reproduce:
1. Label a node as egress-assignable
$ oc get node
NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0108b-48q62-master-0.c.openshift-qe.internal         Ready    master   66m   v1.22.1+6859754
jechen-0108b-48q62-master-1.c.openshift-qe.internal         Ready    master   65m   v1.22.1+6859754
jechen-0108b-48q62-master-2.c.openshift-qe.internal         Ready    master   65m   v1.22.1+6859754
jechen-0108b-48q62-worker-a-vztns.c.openshift-qe.internal   Ready    worker   55m   v1.22.1+6859754
jechen-0108b-48q62-worker-b-jb6fc.c.openshift-qe.internal   Ready    worker   55m   v1.22.1+6859754

$ oc label node jechen-0108b-48q62-worker-a-vztns.c.openshift-qe.internal   "k8s.ovn.org/egress-assignable"=""
node/jechen-0108b-48q62-worker-a-vztns.c.openshift-qe.internal labeled

2. Check the node's egress IP configuration and create the egressIP object
$ oc describe node jechen-0108b-48q62-worker-a-vztns.c.openshift-qe.internal
Annotations:        cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.128.0/17"},"capacity":{"ip":10}}]
                    csi.volume.kubernetes.io/nodeid:
                      {"pd.csi.storage.gke.io":"projects/openshift-qe/zones/us-central1-a/instances/jechen-0108b-48q62-worker-a-vztns"}
                    k8s.ovn.org/host-addresses: ["10.0.128.2"]
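
The egress-ipconfig annotation above lists the subnet (10.0.128.0/17) from which egress IPs on this node are expected to be taken. As a minimal sketch, assuming the requested egress IP has to fall inside that subnet, the check can be done in plain POSIX shell. The helper names ip_to_int and ip_in_cidr are hypothetical and not part of oc or OVN-Kubernetes:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Return 0 if IPv4 address $1 lies inside CIDR $2, non-zero otherwise.
ip_in_cidr() (
  ip=$1
  net=${2%/*}
  prefix=${2#*/}
  if [ "$prefix" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
  fi
  [ $(( $(ip_to_int "$ip") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)
```

For example, `ip_in_cidr 10.0.128.101 10.0.128.0/17` succeeds for the egress IP used in this report, while the ipecho server address 10.0.0.2 is outside that subnet.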


$ more config_egressip1_ovn_ns_team_red.yaml 
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip1
spec:
  egressIPs:
  - 10.0.128.101
  namespaceSelector:
    matchLabels:
      team: red 


$ oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red.yaml 
egressip.k8s.ovn.org/egressip1 created

$ oc get egressip
NAME        EGRESSIPS      ASSIGNED NODE                                               ASSIGNED EGRESSIPS
egressip1   10.0.128.101   jechen-0108b-48q62-worker-a-vztns.c.openshift-qe.internal   10.0.128.101


3. Create a new test project, create test pods in it, and label the project
$ oc new-project test
$ oc create -f /home/jechen/automation-work/verification-tests/testdata/networking/list_for_pods.json 
replicationcontroller/test-rc created
service/test-service created


$ oc label ns test team=red
namespace/test labeled

$ oc get pod
NAME            READY   STATUS    RESTARTS   AGE
test-rc-22kj2   1/1     Running   0          9m51s
test-rc-dkcl5   1/1     Running   0          9m51s

$ oc rsh test-rc-22kj2
~ $ curl 10.0.0.2:8888
10.0.128.101~

4. Delete the egressIP object
$ oc delete egressip egressip1 
egressip.k8s.ovn.org "egressip1" deleted

$ oc get egressip
No resources found

Curl the external ipecho service again; it still returns the egressIP address as the source IP:
$ oc rsh test-rc-22kj2
~ $ curl 10.0.0.2:8888
10.0.128.101~

Actual results:
After the egressIP object is deleted, the egressIP still takes effect.

Expected results:
After the egressIP object is deleted, the egressIP should be removed: a curl to the external service from a test pod should return the node's IP address as the source IP, not the egressIP address.
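
The expectation above can be written as a small pass/fail check. This is only a sketch: verify_egress_removed is a hypothetical helper, and it assumes the caller already knows the IP of the node hosting the test pod, which this description does not show.

```shell
# verify_egress_removed OBSERVED EGRESS_IP NODE_IP
# OBSERVED is the source IP reported by the external ipecho service.
# Returns 0 only if the traffic now leaves with the node IP.
verify_egress_removed() {
  if [ "$1" = "$2" ]; then
    echo "FAIL: source IP is still the egress IP $2"
    return 1
  elif [ "$1" = "$3" ]; then
    echo "PASS: source IP is the node IP $3"
    return 0
  fi
  echo "FAIL: unexpected source IP $1"
  return 1
}
```

A possible invocation after step 4, with NODE_IP standing in for the hosting node's address, would be `verify_egress_removed "$(oc exec test-rc-22kj2 -- curl -s 10.0.0.2:8888)" 10.0.128.101 "$NODE_IP"`.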

Additional info:

Comment 6 jechen 2022-01-28 01:28:48 UTC
Verified with 4.10.0-0.nightly-2022-01-27-144113

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-27-144113   True        False         8m15s   Cluster version is 4.10.0-0.nightly-2022-01-27-144113

$ oc get node
NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0127e-4shd2-master-0.c.openshift-qe.internal         Ready    master   27m   v1.23.0+d30ebbc
jechen-0127e-4shd2-master-1.c.openshift-qe.internal         Ready    master   28m   v1.23.0+d30ebbc
jechen-0127e-4shd2-master-2.c.openshift-qe.internal         Ready    master   27m   v1.23.0+d30ebbc
jechen-0127e-4shd2-worker-a-t6zjk.c.openshift-qe.internal   Ready    worker   17m   v1.23.0+d30ebbc
jechen-0127e-4shd2-worker-b-87xll.c.openshift-qe.internal   Ready    worker   17m   v1.23.0+d30ebbc
jechen-0127e-4shd2-worker-c-6mr84.c.openshift-qe.internal   Ready    worker   17m   v1.23.0+d30ebbc


$ oc label node jechen-0127e-4shd2-worker-a-t6zjk.c.openshift-qe.internal "k8s.ovn.org/egress-assignable"=""
node/jechen-0127e-4shd2-worker-a-t6zjk.c.openshift-qe.internal labeled

$ oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red.yaml 
egressip.k8s.ovn.org/egressip1 created
 
$ oc get egressip
NAME        EGRESSIPS      ASSIGNED NODE                                               ASSIGNED EGRESSIPS
egressip1   10.0.128.101   jechen-0127e-4shd2-worker-a-t6zjk.c.openshift-qe.internal   10.0.128.101


$ oc new-project test

$ oc label ns test team=red
namespace/test labeled

$ oc create -f ./SDN-1332-test/list_for_pods.json 
replicationcontroller/test-rc created
service/test-service created
 
$ oc get pod
NAME            READY   STATUS              RESTARTS   AGE
test-rc-hv5rk   0/1     ContainerCreating   0          3s
test-rc-q58tk   0/1     ContainerCreating   0          3s
test-rc-x674t   0/1     ContainerCreating   0          3s

$ oc rsh test-rc-hv5rk
~ $ curl 10.0.0.2:8888
10.0.128.101~ $ 
~ $ exit

$ oc rsh test-rc-q58tk
~ $ curl 10.0.0.2:8888
10.0.128.101~ $ 
~ $ exit


$ oc rsh test-rc-x674t
~ $ curl 10.0.0.2:8888
10.0.128.101~ $ 
~ $ exit

$ oc delete egressip  egressip1
egressip.k8s.ovn.org "egressip1" deleted

$ oc get egressip
No resources found

$ oc rsh test-rc-hv5rk
~ $ curl 10.0.0.2:8888
10.0.128.4
~ $  exit

$ oc rsh test-rc-q58tk 
~ $ curl 10.0.0.2:8888
10.0.128.3~ $ 
~ $ exit


$ oc rsh test-rc-x674t
~ $ curl 10.0.0.2:8888
10.0.128.2~ $ 
~ $ exit


$ oc get pod -owide
NAME            READY   STATUS    RESTARTS   AGE     IP            NODE                                                        NOMINATED NODE   READINESS GATES
test-rc-hv5rk   1/1     Running   0          3m31s   10.129.2.11   jechen-0127e-4shd2-worker-b-87xll.c.openshift-qe.internal   <none>           <none>
test-rc-q58tk   1/1     Running   0          3m31s   10.128.2.11   jechen-0127e-4shd2-worker-a-t6zjk.c.openshift-qe.internal   <none>           <none>
test-rc-x674t   1/1     Running   0          3m31s   10.131.0.25   jechen-0127e-4shd2-worker-c-6mr84.c.openshift-qe.internal   <none>           <none>

$ oc get node -owide
NAME                                                        STATUS   ROLES    AGE   VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
jechen-0127e-4shd2-master-0.c.openshift-qe.internal         Ready    master   55m   v1.23.0+d30ebbc   10.0.0.5      <none>        Red Hat Enterprise Linux CoreOS 410.84.202201271015-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-106.rhaos4.10.gitdb89312.el8
jechen-0127e-4shd2-master-1.c.openshift-qe.internal         Ready    master   55m   v1.23.0+d30ebbc   10.0.0.6      <none>        Red Hat Enterprise Linux CoreOS 410.84.202201271015-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-106.rhaos4.10.gitdb89312.el8
jechen-0127e-4shd2-master-2.c.openshift-qe.internal         Ready    master   54m   v1.23.0+d30ebbc   10.0.0.7      <none>        Red Hat Enterprise Linux CoreOS 410.84.202201271015-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-106.rhaos4.10.gitdb89312.el8
jechen-0127e-4shd2-worker-a-t6zjk.c.openshift-qe.internal   Ready    worker   44m   v1.23.0+d30ebbc   10.0.128.3    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201271015-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-106.rhaos4.10.gitdb89312.el8
jechen-0127e-4shd2-worker-b-87xll.c.openshift-qe.internal   Ready    worker   44m   v1.23.0+d30ebbc   10.0.128.4    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201271015-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-106.rhaos4.10.gitdb89312.el8
jechen-0127e-4shd2-worker-c-6mr84.c.openshift-qe.internal   Ready    worker   44m   v1.23.0+d30ebbc   10.0.128.2    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201271015-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-106.rhaos4.10.gitdb89312.el8

After the egressIP object is deleted, a curl to the external service from each test pod correctly returns the IP address of the node hosting that pod as the source IP.
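
The pod-to-node cross-check done by eye above could also be scripted. The following is a rough sketch, assuming the two `-owide` listings were saved to files and that the columns are laid out as in this comment (NODE is column 7 of the pod listing, INTERNAL-IP is column 6 of the node listing). The function name expected_source_ips is hypothetical, and a RESTARTS value containing spaces would shift the pod columns.

```shell
# expected_source_ips PODS_FILE NODES_FILE
# PODS_FILE:  saved output of `oc get pod -owide`
# NODES_FILE: saved output of `oc get node -owide`
# Prints "<pod name> <internal IP of the hosting node>" per pod.
expected_source_ips() {
  # First pass (nodes file): remember each node's internal IP.
  # Second pass (pods file): look up the IP of each pod's node.
  awk 'NR == FNR { if (FNR > 1) ip[$1] = $6; next }
       FNR > 1   { print $1, ip[$7] }' "$2" "$1"
}
```

With the listings from this comment, each printed pair should match the address that the corresponding pod's curl returned once the egressIP object was gone.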

Comment 10 errata-xmlrpc 2022-03-10 16:37:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

