Bug 2043802

Summary: EgressIP stops working after the single egressIP of a netnamespace is switched to the other node of an HA pair when the first egress node is shut down
Product: OpenShift Container Platform
Reporter: jechen <jechen>
Component: Networking
Assignee: Patryk Diak <pdiak>
Networking sub component: openshift-sdn
QA Contact: jechen <jechen>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
CC: pdiak
Version: 4.10
Target Release: 4.10.0
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2022-03-10 16:41:47 UTC
Type: Bug

Description jechen 2022-01-22 02:33:37 UTC
Description of problem:
Two egress nodes are configured with the same EgressCIDRs as an HA pair, and the netnamespace has a single egressIP configured. Pods in that netnamespace initially use the first egress node for outgoing traffic. After the first egress node is shut down and the egressIP is switched to the second egress node, the egressIP stops working and the pods cannot reach external hosts.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-21-192636   True        False         18m     Cluster version is 4.10.0-0.nightly-2022-01-21-192636


$ oc get node

NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0121d-564rn-master-0.c.openshift-qe.internal         Ready    master   34m   v1.23.0+112af52
jechen-0121d-564rn-master-1.c.openshift-qe.internal         Ready    master   34m   v1.23.0+112af52
jechen-0121d-564rn-master-2.c.openshift-qe.internal         Ready    master   34m   v1.23.0+112af52
jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   Ready    worker   25m   v1.23.0+112af52
jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   Ready    worker   26m   v1.23.0+112af52
jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   Ready    worker   25m   v1.23.0+112af52


$ oc describe node jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal
Name:               jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=n1-standard-4
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-central1
                    failure-domain.beta.kubernetes.io/zone=us-central1-a
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=n1-standard-4
                    node.openshift.io/os_id=rhcos
                    topology.gke.io/zone=us-central1-a
                    topology.kubernetes.io/region=us-central1
                    topology.kubernetes.io/zone=us-central1-a
Annotations:        cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.128.0/17"},"capacity":{"ip":10}}]
                    csi.volume.kubernetes.io/nodeid:
                      {"pd.csi.storage.gke.io":"projects/openshift-qe/zones/us-central1-a/instances/jechen-0121d-564rn-worker-a-89mhb"}


How reproducible:


Steps to Reproduce:
1. Configure two nodes as egress nodes
$ oc patch hostsubnet jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal patched


$ oc patch hostsubnet jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0121d-564rn-master-0.c.openshift-qe.internal         jechen-0121d-564rn-master-0.c.openshift-qe.internal         10.0.0.5     10.130.0.0/23                       
jechen-0121d-564rn-master-1.c.openshift-qe.internal         jechen-0121d-564rn-master-1.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-0121d-564rn-master-2.c.openshift-qe.internal         jechen-0121d-564rn-master-2.c.openshift-qe.internal         10.0.0.7     10.129.0.0/23                       
jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   10.0.128.4   10.129.2.0/23   ["10.0.128.0/17"]   
jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   10.0.128.2   10.131.0.0/23   ["10.0.128.0/17"]   
jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23                      

2. Create a test project, configure an egressIP on its netnamespace, and create test pods in the project
$ oc new-project test

$ oc patch netnamespace test --type=merge -p '{"egressIPs":["10.0.128.100"]}'
netnamespace.network.openshift.io/test patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0121d-564rn-master-0.c.openshift-qe.internal         jechen-0121d-564rn-master-0.c.openshift-qe.internal         10.0.0.5     10.130.0.0/23                       
jechen-0121d-564rn-master-1.c.openshift-qe.internal         jechen-0121d-564rn-master-1.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-0121d-564rn-master-2.c.openshift-qe.internal         jechen-0121d-564rn-master-2.c.openshift-qe.internal         10.0.0.7     10.129.0.0/23                       
jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   10.0.128.4   10.129.2.0/23   ["10.0.128.0/17"]   ["10.0.128.100"]
jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   10.0.128.2   10.131.0.0/23   ["10.0.128.0/17"]   
jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23                       

$ oc create -f ./SDN-1332-test/list_for_pods.json 
replicationcontroller/test-rc created
service/test-service created

$ oc get pod -owide
NAME            READY   STATUS              RESTARTS   AGE   IP       NODE                                                        NOMINATED NODE   READINESS GATES
test-rc-kzlh7   0/1     ContainerCreating   0          5s    <none>   jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   <none>           <none>
test-rc-qcdrx   0/1     ContainerCreating   0          5s    <none>   jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   <none>           <none>
test-rc-tpktp   0/1     ContainerCreating   0          5s    <none>   jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   <none>           <none>
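
For automatic assignment, the egressIP patched onto the netnamespace has to fall inside the egressCIDRs range set on the candidate nodes. A minimal sketch of that membership check in plain shell arithmetic follows. `ip_to_int` and `ip_in_cidr` are hypothetical helper names, not part of `oc`, and the sketch assumes IPv4 addresses written without leading zeros.

```shell
# Hypothetical helpers (not part of oc): IPv4 CIDR membership test.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr IP CIDR: succeed if IP lies inside CIDR.
ip_in_cidr() {
  net=${2%/*}        # network part, e.g. 10.0.128.0
  bits=${2#*/}       # prefix length, e.g. 17
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Example with the values used in this report.
if ip_in_cidr 10.0.128.100 10.0.128.0/17; then
  echo "egressIP is inside the egressCIDRs range"
fi
```

Running such a check before patching the netnamespace can rule out a simple range mismatch as the cause when an egressIP never shows up on any hostsubnet.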


3. While the egressIP is on node jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal, curl the external ip-echo service from each test pod. The egressIP is correctly returned as the source IP.

$ oc rsh test-rc-qcdrx
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ 
~ $ exit

$ oc rsh test-rc-kzlh7
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ exit

$ oc rsh test-rc-tpktp
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ exit
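
The manual curl checks above can be scripted. The following is a minimal sketch, assuming the ip-echo service prints nothing but the caller's source IP. `expect_source_ip` is a hypothetical helper name. The usage comment swaps the interactive `oc rsh` session for `oc exec` with `curl --max-time`, so that a request which never gets an answer ends on its own instead of needing Ctrl-C.

```shell
# expect_source_ip EXPECTED CMD...  (hypothetical helper)
# Runs CMD, compares its output with EXPECTED and prints a verdict.
# Returns 0 on a match, 1 on a different source IP, 2 when nothing came back.
expect_source_ip() {
  expected=$1
  shift
  # Strip CR/LF so terminal line endings do not break the comparison.
  actual=$("$@" 2>/dev/null | tr -d '\r\n')
  if [ -z "$actual" ]; then
    echo "no reply"
    return 2
  elif [ "$actual" = "$expected" ]; then
    echo "ok"
  else
    echo "unexpected source $actual"
    return 1
  fi
}

# Usage against a cluster (pod name and service address taken from this report):
#   expect_source_ip 10.0.128.100 \
#     oc exec test-rc-qcdrx -- curl -s --max-time 10 10.0.0.2:8888
```

Separating "no reply" from "unexpected source" helps tell a traffic black hole apart from traffic that leaves with the node IP instead of the egressIP.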


4. Shut down the current egress node jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal

$ oc debug node/jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal
Starting pod/jechen-0121d-564rn-worker-a-89mhbcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.128.4
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# shutdown
Shutdown scheduled for Sat 2022-01-22 01:54:27 UTC, use 'shutdown -c' to cancel.
sh-4.4# 
Removing debug pod ...


5. Wait briefly, then check hostsubnet. The egressIP has been switched to the second egress node correctly.
$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0121d-564rn-master-0.c.openshift-qe.internal         jechen-0121d-564rn-master-0.c.openshift-qe.internal         10.0.0.5     10.130.0.0/23                       
jechen-0121d-564rn-master-1.c.openshift-qe.internal         jechen-0121d-564rn-master-1.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-0121d-564rn-master-2.c.openshift-qe.internal         jechen-0121d-564rn-master-2.c.openshift-qe.internal         10.0.0.7     10.129.0.0/23                       
jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal   10.0.128.4   10.129.2.0/23   ["10.0.128.0/17"]   
jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal   10.0.128.2   10.131.0.0/23   ["10.0.128.0/17"]   ["10.0.128.100"]
jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   jechen-0121d-564rn-worker-c-x4b85.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23                       
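
Rather than waiting an arbitrary amount of time, the failover can be polled. The sketch below is one possible way to do that, not the procedure used in this report. `egress_holder` and `wait_until` are hypothetical helper names, and the parser assumes the default `oc get hostsubnet` table layout shown above, with the egress columns starting at the fifth whitespace-separated field.

```shell
# egress_holder IP: read `oc get hostsubnet` table output on stdin and
# print the name of the node whose EGRESS IPS column lists IP.
egress_holder() {
  awk -v ip="$1" '
    NR > 1 {
      needle = "\"" ip "\""            # quoted, so CIDR entries never match
      for (i = 5; i <= NF; i++)
        if (index($i, needle) > 0) { print $1; next }
    }'
}

# wait_until SECONDS CMD...: retry CMD once per second until it succeeds
# or roughly SECONDS have passed. Returns 1 on timeout.
wait_until() {
  deadline=$1
  shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Usage against a cluster (node name taken from this report):
#   holder_is() { [ "$(oc get hostsubnet | egress_holder "$1")" = "$2" ]; }
#   wait_until 120 holder_is 10.0.128.100 \
#     jechen-0121d-564rn-worker-b-78mbp.c.openshift-qe.internal
```

The timeout value of 120 seconds in the usage comment is an arbitrary choice for illustration.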


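6. From the test pods, curl the external ip-echo service again. The pod on the shut-down node is unreachable, and curl from the other two pods hangs until interrupted.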
$ oc rsh test-rc-qcdrx
Error from server: error dialing backend: dial tcp 10.0.128.4:10250: i/o timeout

$ oc rsh test-rc-kzlh7
~ $ curl 10.0.0.2:8888
^C
~ $  curl 10.0.0.2:8888
^C
~ $ exit
command terminated with exit code 130

$ oc rsh test-rc-tpktp
~ $ curl 10.0.0.2:8888
^C~ $ exit
command terminated with exit code 130
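
The first `oc rsh` above fails only because that pod sits on the node that was shut down, so it says nothing about the egressIP itself. A small sketch for picking the pods that are still reachable is shown here. `pods_off_node` is a hypothetical helper name, and it expects "pod node" pairs such as those printed by `oc get pod` with `-o custom-columns` and `--no-headers`.

```shell
# pods_off_node NODE: read "POD NODE" pairs on stdin and print the pods
# that are NOT scheduled on NODE (lines without two fields are skipped).
pods_off_node() {
  awk -v node="$1" 'NF >= 2 && $2 != node { print $1 }'
}

# Usage against a cluster (node name taken from this report):
#   oc get pod --no-headers \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName \
#     | pods_off_node jechen-0121d-564rn-worker-a-89mhb.c.openshift-qe.internal
```

Only the pods this prints need to be probed after the failover.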



Actual results:
The egressIP no longer works after it is switched to the second node.

Expected results:
The egressIP should continue working after it is switched to the second node, and curl to the external ip-echo service should return the egressIP as the source IP.

Additional info:

Comment 1 Patryk Diak 2022-01-24 09:32:39 UTC
Please share the must-gather.

Comment 6 jechen 2022-01-27 16:15:37 UTC
Verified in 4.10.0-0.nightly-2022-01-27-104747

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-27-104747   True        False         2m39s   Cluster version is 4.10.0-0.nightly-2022-01-27-104747

$ oc get node
NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         Ready    master   17m   v1.23.0+d30ebbc
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         Ready    master   17m   v1.23.0+d30ebbc
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         Ready    master   17m   v1.23.0+d30ebbc
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   Ready    worker   10m   v1.23.0+d30ebbc
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   Ready    worker   10m   v1.23.0+d30ebbc
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   Ready    worker   10m   v1.23.0+d30ebbc


$ oc patch hostsubnet jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal patched

$ oc patch hostsubnet jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         jechen-0127b-55qb8-master-0.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         jechen-0127b-55qb8-master-1.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23                       
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         jechen-0127b-55qb8-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23                       
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   10.0.128.2   10.129.2.0/23   ["10.0.128.0/17"]   
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]   
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   10.0.128.4   10.131.0.0/23                    


$ oc new-project test

$ oc patch netnamespace test --type=merge -p '{"egressIPs":["10.0.128.100"]}'
netnamespace.network.openshift.io/test patched


$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         jechen-0127b-55qb8-master-0.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         jechen-0127b-55qb8-master-1.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23                       
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         jechen-0127b-55qb8-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23                       
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   10.0.128.2   10.129.2.0/23   ["10.0.128.0/17"]   ["10.0.128.100"]
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]   
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   10.0.128.4   10.131.0.0/23           


$ oc create -f ./SDN-1332-test/list_for_pods.json 
replicationcontroller/test-rc created
service/test-service created
 
$ oc get pod -owide
NAME            READY   STATUS              RESTARTS   AGE   IP       NODE                                                        NOMINATED NODE   READINESS GATES
test-rc-48s46   0/1     ContainerCreating   0          5s    <none>   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   <none>           <none>
test-rc-5l6nc   0/1     ContainerCreating   0          5s    <none>   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   <none>           <none>
test-rc-fz5l6   0/1     ContainerCreating   0          5s    <none>   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   <none>           <none>


$ oc rsh test-rc-48s46
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ exit
$ oc rsh test-rc-5l6nc
~ $  curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $  curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $  curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ exit
$ oc rsh test-rc-fz5l6
~ $  curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $  curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $  curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ exit

$ oc debug node/jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal
Starting pod/jechen-0127b-55qb8-worker-a-8m784copenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.128.2
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# shutdown
Shutdown scheduled for Thu 2022-01-27 16:06:15 UTC, use 'shutdown -c' to cancel.
sh-4.4# 
Removing debug pod ...


$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         jechen-0127b-55qb8-master-0.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23                       
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         jechen-0127b-55qb8-master-1.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23                       
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         jechen-0127b-55qb8-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23                       
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   10.0.128.2   10.129.2.0/23   ["10.0.128.0/17"]   
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]   ["10.0.128.100"]
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   10.0.128.4   10.131.0.0/23                       

$ oc rsh test-rc-48s46
Error from server: error dialing backend: dial tcp 10.0.128.2:10250: i/o timeout

$ oc rsh test-rc-5l6nc
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ 
~ $ exit

$ oc rsh test-rc-fz5l6
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ curl 10.0.0.2:8888
10.0.128.100~ $ 
~ $ exit

Comment 9 errata-xmlrpc 2022-03-10 16:41:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056