Bug 2043802
| Summary: | EgressIP stopped working after single egressIP for a netnamespace is switched to the other node of HA pair after the first egress node is shutdown | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | jechen |
| Component: | Networking | Assignee: | Patryk Diak <pdiak> |
| Networking sub component: | openshift-sdn | QA Contact: | jechen |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | pdiak |
| Version: | 4.10 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-10 16:41:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
jechen
2022-01-22 02:33:37 UTC
Please share the must-gather.

Verified in 4.10.0-0.nightly-2022-01-27-104747

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-27-104747   True        False         2m39s   Cluster version is 4.10.0-0.nightly-2022-01-27-104747

$ oc get node
NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         Ready    master   17m   v1.23.0+d30ebbc
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         Ready    master   17m   v1.23.0+d30ebbc
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         Ready    master   17m   v1.23.0+d30ebbc
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   Ready    worker   10m   v1.23.0+d30ebbc
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   Ready    worker   10m   v1.23.0+d30ebbc
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   Ready    worker   10m   v1.23.0+d30ebbc

$ oc patch hostsubnet jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal patched

$ oc patch hostsubnet jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal --type=merge -p '{"egressCIDRs":["10.0.128.0/17"]}'
hostsubnet.network.openshift.io/jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         jechen-0127b-55qb8-master-0.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         jechen-0127b-55qb8-master-1.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         jechen-0127b-55qb8-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   10.0.128.2   10.129.2.0/23   ["10.0.128.0/17"]
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   10.0.128.4   10.131.0.0/23

$ oc new-project test

$ oc patch netnamespace test --type=merge -p '{"egressIPs":["10.0.128.100"]}'
netnamespace.network.openshift.io/test patched

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         jechen-0127b-55qb8-master-0.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         jechen-0127b-55qb8-master-1.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         jechen-0127b-55qb8-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   10.0.128.2   10.129.2.0/23   ["10.0.128.0/17"]   ["10.0.128.100"]
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   10.0.128.4   10.131.0.0/23

$ oc create -f ./SDN-1332-test/list_for_pods.json
replicationcontroller/test-rc created
service/test-service created

$ oc get pod -owide
NAME            READY   STATUS              RESTARTS   AGE   IP       NODE                                                        NOMINATED NODE   READINESS GATES
test-rc-48s46   0/1     ContainerCreating   0          5s    <none>   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   <none>           <none>
test-rc-5l6nc   0/1     ContainerCreating   0          5s    <none>   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   <none>           <none>
test-rc-fz5l6   0/1     ContainerCreating   0          5s    <none>   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   <none>           <none>

$ oc rsh test-rc-48s46
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ exit

$ oc rsh test-rc-5l6nc
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ exit

$ oc rsh test-rc-fz5l6
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ exit

$ oc debug node/jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal
Starting pod/jechen-0127b-55qb8-worker-a-8m784copenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.128.2
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# shutdown
Shutdown scheduled for Thu 2022-01-27 16:06:15 UTC, use 'shutdown -c' to cancel.
sh-4.4#
Removing debug pod ...

$ oc get hostsubnet
NAME                                                        HOST                                                        HOST IP      SUBNET          EGRESS CIDRS        EGRESS IPS
jechen-0127b-55qb8-master-0.c.openshift-qe.internal         jechen-0127b-55qb8-master-0.c.openshift-qe.internal         10.0.0.6     10.128.0.0/23
jechen-0127b-55qb8-master-1.c.openshift-qe.internal         jechen-0127b-55qb8-master-1.c.openshift-qe.internal         10.0.0.7     10.130.0.0/23
jechen-0127b-55qb8-master-2.c.openshift-qe.internal         jechen-0127b-55qb8-master-2.c.openshift-qe.internal         10.0.0.5     10.129.0.0/23
jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal   10.0.128.2   10.129.2.0/23   ["10.0.128.0/17"]
jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal   10.0.128.3   10.128.2.0/23   ["10.0.128.0/17"]   ["10.0.128.100"]
jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal   10.0.128.4   10.131.0.0/23

$ oc rsh test-rc-48s46
Error from server: error dialing backend: dial tcp 10.0.128.2:10250: i/o timeout

$ oc rsh test-rc-5l6nc
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $
~ $ exit

$ oc rsh test-rc-fz5l6
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ curl 10.0.0.2:8888
10.0.128.100~ $
~ $ exit

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
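
For anyone repeating this verification, comparing the `oc get hostsubnet` output by eye before and after the node shutdown can be scripted. The following is a minimal Python sketch, not part of the original verification. It assumes the JSON from `oc get hostsubnet -o json` is a list object whose `items` carry `host`, `egressCIDRs`, and `egressIPs` fields. The function names `egress_ip_holder` and `eligible_hosts` are hypothetical, and the sample data is transcribed from the worker rows of the tables above.

```python
import ipaddress


def egress_ip_holder(hostsubnets, egress_ip):
    """Return the host currently assigned egress_ip, or None if unassigned."""
    for item in hostsubnets.get("items", []):
        if egress_ip in (item.get("egressIPs") or []):
            return item.get("host")
    return None


def eligible_hosts(hostsubnets, egress_ip):
    """Return hosts whose egressCIDRs contain egress_ip (failover candidates)."""
    ip = ipaddress.ip_address(egress_ip)
    hosts = []
    for item in hostsubnets.get("items", []):
        cidrs = item.get("egressCIDRs") or []
        if any(ip in ipaddress.ip_network(cidr) for cidr in cidrs):
            hosts.append(item.get("host"))
    return hosts


# Assumed sample data: worker rows copied from the hostsubnet tables in this bug.
WORKER_A = "jechen-0127b-55qb8-worker-a-8m784.c.openshift-qe.internal"
WORKER_B = "jechen-0127b-55qb8-worker-b-hzrfx.c.openshift-qe.internal"
WORKER_C = "jechen-0127b-55qb8-worker-c-c89pw.c.openshift-qe.internal"

BEFORE_SHUTDOWN = {"items": [
    {"host": WORKER_A, "egressCIDRs": ["10.0.128.0/17"], "egressIPs": ["10.0.128.100"]},
    {"host": WORKER_B, "egressCIDRs": ["10.0.128.0/17"]},
    {"host": WORKER_C},
]}

AFTER_SHUTDOWN = {"items": [
    {"host": WORKER_A, "egressCIDRs": ["10.0.128.0/17"]},
    {"host": WORKER_B, "egressCIDRs": ["10.0.128.0/17"], "egressIPs": ["10.0.128.100"]},
    {"host": WORKER_C},
]}

if __name__ == "__main__":
    print("before:", egress_ip_holder(BEFORE_SHUTDOWN, "10.0.128.100"))
    print("after: ", egress_ip_holder(AFTER_SHUTDOWN, "10.0.128.100"))
    print("candidates:", eligible_hosts(AFTER_SHUTDOWN, "10.0.128.100"))
```

Against a live cluster, the same functions could be fed `json.load(sys.stdin)` with the output of `oc get hostsubnet -o json` piped in. The sketch only confirms that the egress IP moved to the other node of the pair; the `curl` checks from the pods are still needed to confirm that traffic actually leaves with that source address.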