Bug 2035439 - SDN Automatic assignment EgressIP on GCP returned node IP address not egressIP address
Summary: SDN Automatic assignment EgressIP on GCP returned node IP address not egressIP address
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Patryk Diak
QA Contact: jechen
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-24 03:35 UTC by jechen
Modified: 2022-03-12 04:40 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-12 04:40:05 UTC
Target Upstream Version:
Embargoed:
pdiak: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Github openshift sdn pull 387 0 None open Bug 2035439: Use cloud egress network config for assigning egress IP in cloud environment 2022-01-10 13:07:09 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-12 04:40:21 UTC

Description jechen 2021-12-24 03:35:18 UTC
Description of problem:
On an OpenShiftSDN cluster created on GCP with automatic egressIP assignment configured, a curl to an external ipecho service returns the node's IP address instead of the configured egressIP address.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-12-23-153012   True        False         6h20m   Cluster version is 4.10.0-0.nightly-2021-12-23-153012

$ oc get node
NAME                                                        STATUS   ROLES    AGE     VERSION
jechen-1223b-dkcdl-master-0.c.openshift-qe.internal         Ready    master   6h38m   v1.22.1+6859754
jechen-1223b-dkcdl-master-1.c.openshift-qe.internal         Ready    master   6h38m   v1.22.1+6859754
jechen-1223b-dkcdl-master-2.c.openshift-qe.internal         Ready    master   6h38m   v1.22.1+6859754
jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal   Ready    worker   6h31m   v1.22.1+6859754
jechen-1223b-dkcdl-worker-b-mqqn6.c.openshift-qe.internal   Ready    worker   6h30m   v1.22.1+6859754


How reproducible:


Steps to Reproduce:
1. Get the node annotation
Annotations:        cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.128.0/17"},"capacity":{"ip":10}}]
                    csi.volume.kubernetes.io/nodeid:
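
For reference, the same annotation can be pulled directly with a jsonpath query (a minimal sketch; dots in the annotation key must be backslash-escaped). This should print the JSON shown above:

$ oc get node jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal \
    -o jsonpath='{.metadata.annotations.cloud\.network\.openshift\.io/egress-ipconfig}'
[{"interface":"nic0","ifaddr":{"ipv4":"10.0.128.0/17"},"capacity":{"ip":10}}]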

2. Patch the worker hostsubnets with egressCIDRs
$ oc patch hostsubnet jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal patched

$ oc patch hostsubnet jechen-1223b-dkcdl-worker-b-mqqn6.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-1223b-dkcdl-worker-b-mqqn6.c.openshift-qe.internal patched


3. Patch the project netnamespace with an egressIP
$ oc new-project test

$ oc patch netnamespace test --type=merge -p '{"egressIPs":["10.0.128.101"]}'
netnamespace.network.openshift.io/test patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-1223b-dkcdl-master-0.c.openshift-qe.internal         jechen-1223b-dkcdl-master-0.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23                       
jechen-1223b-dkcdl-master-1.c.openshift-qe.internal         jechen-1223b-dkcdl-master-1.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-1223b-dkcdl-master-2.c.openshift-qe.internal         jechen-1223b-dkcdl-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23                       
jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal   jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal   10.0.128.2   10.131.0.0/23   ["10.0.128.0/17"]   ["10.0.128.101"]
jechen-1223b-dkcdl-worker-b-mqqn6.c.openshift-qe.internal   jechen-1223b-dkcdl-worker-b-mqqn6.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]  


Installed the ipecho service on an int_svc instance that was created with the GCP cluster.
ssh to the int_svc instance, then:
sudo yum install docker
sudo systemctl start docker
sudo docker run --name ipecho -d -p 8888:80 docker.io/aosqe/ip-echo
Add port 8888 as an allowed port to the firewall rules, as sketched below.
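
(The firewall change itself was made out of band; an equivalent gcloud sketch would look roughly like this. The rule name, network, and source range here are hypothetical placeholders, not values taken from this cluster.)

# hypothetical example -- rule name, network, and source range are placeholders
$ gcloud compute firewall-rules create allow-ipecho \
    --network=NETWORK_NAME --allow=tcp:8888 --source-ranges=10.0.0.0/16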


4. Create test pods, and curl the ipecho service
$ oc create -f /home/jechen/automation-work/verification-tests/testdata/networking/list_for_pods.json 
replicationcontroller/test-rc created

$ oc get all
NAME                READY   STATUS    RESTARTS   AGE
pod/test-rc-md9b4   1/1     Running   0          5h38m
pod/test-rc-nmvkz   1/1     Running   0          5h38m

NAME                            DESIRED   CURRENT   READY   AGE
replicationcontroller/test-rc   2         2         2       5h38m

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/test-service   ClusterIP   172.30.98.163   <none>        27017/TCP   5h38m


$ oc rsh test-rc-md9b4
~ $ curl 10.0.0.2:8888
10.0.128.2

$ oc debug node/jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal
Starting pod/jechen-1223b-dkcdl-worker-a-qd45fcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.128.2
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# curl 10.0.0.2:8888
10.0.128.2


Actual results:
A curl to the ipecho service returned the host IP address (10.0.128.2).

Expected results:
A curl to the ipecho service should return the egressIP address 10.0.128.101.

Additional info:

Comment 1 jechen 2022-01-12 00:17:52 UTC
Manual egressIP assignment on an OpenShift-SDN cluster on GCP has the same problem.
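
(Manual assignment here means setting egressIPs directly on the hostsubnet, rather than setting egressCIDRs and letting SDN pick an IP automatically. Roughly, reusing the node and IP from the report above:)

$ oc patch hostsubnet jechen-1223b-dkcdl-worker-a-qd45f.c.openshift-qe.internal --type=merge -p '{"egressIPs":["10.0.128.101"]}'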

Comment 3 Patryk Diak 2022-01-12 13:08:09 UTC
@jechen I have made changes to the PR. Please let me know if the issue still occurs, and if so, share the reproduction steps.
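
(Per the linked PR title, the fix makes openshift-sdn consult the cloud egress network config, i.e. the cloud.network.openshift.io/egress-ipconfig node annotation, when assigning egress IPs. A rough sanity check after the fix, assuming jq is available on the workstation and substituting a real node name for <node>: the assigned egress IPs should fall inside the cloud-reported ifaddr range.)

# cloud-reported egress range for the node (sketch; substitute the node name)
$ oc get node <node> -o jsonpath='{.metadata.annotations.cloud\.network\.openshift\.io/egress-ipconfig}' | jq -r '.[0].ifaddr.ipv4'
# egress IPs currently assigned to the node's hostsubnet
$ oc get hostsubnet <node> -o jsonpath='{.egressIPs}'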

Comment 4 jechen 2022-01-13 01:56:34 UTC
@pdiak In order to test your PR pre-merge, I have to use cluster-bot to build a cluster, but cluster-bot does not give me an external VM instance where I can install the ipecho service to verify the egressIP. Normally I use Jenkins to build a GCP cluster and specify that an external VM instance be built along with the cluster; I then install the ipecho service on that external VM instance. I have not been able to figure out a way to have cluster-bot build both a cluster and an external VM instance, so I think I will have to wait until the PR is merged before I can test it.

Comment 7 jechen 2022-01-15 19:29:00 UTC
Verified in 4.10.0-0.nightly-2022-01-15-092722


$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-15-092722   True        False         10m     Cluster version is 4.10.0-0.nightly-2022-01-15-092722

$ oc get node
NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0115d-zmlrk-master-0.c.openshift-qe.internal         Ready    master   30m   v1.23.0+60f5a1c
jechen-0115d-zmlrk-master-1.c.openshift-qe.internal         Ready    master   30m   v1.23.0+60f5a1c
jechen-0115d-zmlrk-master-2.c.openshift-qe.internal         Ready    master   30m   v1.23.0+60f5a1c
jechen-0115d-zmlrk-worker-a-khvzf.c.openshift-qe.internal   Ready    worker   20m   v1.23.0+60f5a1c
jechen-0115d-zmlrk-worker-b-4pf7t.c.openshift-qe.internal   Ready    worker   20m   v1.23.0+60f5a1c

$ oc patch hostsubnet jechen-0115d-zmlrk-worker-a-khvzf.c.openshift-qe.internal  --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0115d-zmlrk-worker-a-khvzf.c.openshift-qe.internal patched

$ oc new-project test
Now using project "test" on server "https://api.jechen-0115d.qe.gcp.devcluster.openshift.com:6443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app rails-postgresql-example

to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:

    kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname


$ oc patch netnamespace test --type=merge -p '{"egressIPs":["10.0.128.101"]}'
netnamespace.network.openshift.io/test patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0115d-zmlrk-master-0.c.openshift-qe.internal         jechen-0115d-zmlrk-master-0.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23                       
jechen-0115d-zmlrk-master-1.c.openshift-qe.internal         jechen-0115d-zmlrk-master-1.c.openshift-qe.internal         10.0.0.6     10.130.0.0/23                       
jechen-0115d-zmlrk-master-2.c.openshift-qe.internal         jechen-0115d-zmlrk-master-2.c.openshift-qe.internal         10.0.0.7     10.128.0.0/23                       
jechen-0115d-zmlrk-worker-a-khvzf.c.openshift-qe.internal   jechen-0115d-zmlrk-worker-a-khvzf.c.openshift-qe.internal   10.0.128.2   10.131.0.0/23   ["10.0.128.0/17"]   ["10.0.128.101"]
jechen-0115d-zmlrk-worker-b-4pf7t.c.openshift-qe.internal   jechen-0115d-zmlrk-worker-b-4pf7t.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23                       

# create test pods in the test project
$ oc create -f ./verification-tests/testdata/networking/list_for_pods.json 
replicationcontroller/test-rc created
service/test-service created

$ oc get pod
NAME            READY   STATUS    RESTARTS   AGE
test-rc-7m5bh   1/1     Running   0          9m18s
test-rc-hv69h   1/1     Running   0          9m18s

# curl the ipecho service from inside a test pod
$ oc rsh test-rc-7m5bh
~ $ curl 10.0.0.2:8888
10.0.128.101~ $              <----- egressIP address is returned correctly
$ exit

# remove the egressIP, then curl the ipecho service from inside the test pods
$ oc patch netnamespace test --type=merge -p '{"egressIPs":[]}'
netnamespace.network.openshift.io/test patched
 
$ oc rsh test-rc-7m5bh
~ $ curl 10.0.0.2:8888
10.0.128.3~ $ 
~ $ exit

$ oc rsh test-rc-hv69h
~ $ curl 10.0.0.2:8888
10.0.128.2~ $  
~ $ exit

# add the egressIP back, then curl the ipecho service from inside the test pods
$ oc patch netnamespace test --type=merge -p '{"egressIPs":["10.0.128.101"]}'
netnamespace.network.openshift.io/test patched

$ oc rsh test-rc-7m5bh
~ $ curl 10.0.0.2:8888
10.0.128.101~ $              <----- egressIP address is returned correctly
~ $ exit

$ oc rsh test-rc-hv69h
~ $ curl 10.0.0.2:8888
10.0.128.101~ $              <----- egressIP address is returned correctly
~ $ exit

Comment 10 errata-xmlrpc 2022-03-12 04:40:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

