Bug 2039099
Summary: | [OVN EgressIP GCP] After reboot egress node, egressip that was previously assigned got lost | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | jechen <jechen> |
Component: | Networking | Assignee: | Ben Bennett <bbennett> |
Networking sub component: | ovn-kubernetes | QA Contact: | jechen <jechen> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | | |
Priority: | high | CC: | bpickard, huirwang, zzhao |
Version: | 4.10 | | |
Target Milestone: | --- | | |
Target Release: | 4.10.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-03-10 16:38:34 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
jechen
2022-01-10 23:53:03 UTC
@jechen, assigning this bug to you for verification, thanks.

Verified in 4.10.0-0.nightly-2022-01-25-023600

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-25-023600   True        False         12m     Cluster version is 4.10.0-0.nightly-2022-01-25-023600

$ oc get node
NAME                                                        STATUS   ROLES    AGE   VERSION
jechen-0125c-r6m4q-master-0.c.openshift-qe.internal         Ready    master   68m   v1.23.0+06791f6
jechen-0125c-r6m4q-master-1.c.openshift-qe.internal         Ready    master   68m   v1.23.0+06791f6
jechen-0125c-r6m4q-master-2.c.openshift-qe.internal         Ready    master   68m   v1.23.0+06791f6
jechen-0125c-r6m4q-worker-a-6zf4c.c.openshift-qe.internal   Ready    worker   53m   v1.23.0+06791f6
jechen-0125c-r6m4q-worker-b-ppgq4.c.openshift-qe.internal   Ready    worker   53m   v1.23.0+06791f6
jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal   Ready    worker   53m   v1.23.0+06791f6

$ oc label node jechen-0125c-r6m4q-worker-a-6zf4c.c.openshift-qe.internal "k8s.ovn.org/egress-assignable"=""
node/jechen-0125c-r6m4q-worker-a-6zf4c.c.openshift-qe.internal labeled

$ oc label node jechen-0125c-r6m4q-worker-b-ppgq4.c.openshift-qe.internal "k8s.ovn.org/egress-assignable"=""
node/jechen-0125c-r6m4q-worker-b-ppgq4.c.openshift-qe.internal labeled

$ oc label node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal "k8s.ovn.org/egress-assignable"=""
node/jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal labeled

$ oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red.yaml
egressip.k8s.ovn.org/egressip1 created

$ oc get egressip
NAME        EGRESSIPS      ASSIGNED NODE                                               ASSIGNED EGRESSIPS
egressip1   10.0.128.101   jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal   10.0.128.103

$ oc get egressip -oyaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2022-01-26T02:44:14Z"
    generation: 4
    name: egressip1
    resourceVersion: "42056"
    uid: c0d0c881-d566-4c31-b984-1b26854447a1
  spec:
    egressIPs:
    - 10.0.128.101
    - 10.0.128.102
    - 10.0.128.103
    namespaceSelector:
      matchLabels:
        team: red
  status:
    items:
    - egressIP: 10.0.128.103
      node: jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal
    - egressIP: 10.0.128.102
      node: jechen-0125c-r6m4q-worker-a-6zf4c.c.openshift-qe.internal
    - egressIP: 10.0.128.101
      node: jechen-0125c-r6m4q-worker-b-ppgq4.c.openshift-qe.internal
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc new-project test

$ oc label ns test team=red
namespace/test labeled

$ oc create -f ./SDN-1332-test/list_for_pods.json
replicationcontroller/test-rc created
service/test-service created

$ oc get pod
NAME            READY   STATUS              RESTARTS   AGE
test-rc-749c9   0/1     ContainerCreating   0          2s
test-rc-99lxj   0/1     ContainerCreating   0          2s
test-rc-mx5zb   0/1     ContainerCreating   0          2s

$ oc rsh test-rc-749c9
~ $ curl 10.0.0.2:8888
10.0.128.101~ $
~ $ curl 10.0.0.2:8888
10.0.128.101~ $
~ $ curl 10.0.0.2:8888
10.0.128.101~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ exit

$ oc debug node/jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal
Starting pod/jechen-0125c-r6m4q-worker-c-b787bcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.128.4
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4#
sh-4.4# reboot
Terminated
sh-4.4#
Removing debug pod ...

### Wait until the node comes back

$ oc describe node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal
Events:
  Type     Reason                   Age                From     Message
  ----     ------                   ----               ----     -------
  Normal   Starting                 14s                kubelet  Starting kubelet.
  Normal   NodeHasSufficientMemory  14s (x2 over 14s)  kubelet  Node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    14s (x2 over 14s)  kubelet  Node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     14s (x2 over 14s)  kubelet  Node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal status is now: NodeHasSufficientPID
  Warning  Rebooted                 14s                kubelet  Node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal has been rebooted, boot id: d693e91e-272c-4420-9125-66931845e6e5
  Normal   NodeNotReady             14s                kubelet  Node jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal status is now: NodeNotReady
  Normal   NodeAllocatableEnforced  14s                kubelet  Updated Node Allocatable limit across pods

$ oc get egressip -oyaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2022-01-26T02:44:14Z"
    generation: 6
    name: egressip1
    resourceVersion: "44232"
    uid: c0d0c881-d566-4c31-b984-1b26854447a1
  spec:
    egressIPs:
    - 10.0.128.101
    - 10.0.128.102
    - 10.0.128.103
    namespaceSelector:
      matchLabels:
        team: red
  status:
    items:
    - egressIP: 10.0.128.102
      node: jechen-0125c-r6m4q-worker-a-6zf4c.c.openshift-qe.internal
    - egressIP: 10.0.128.101
      node: jechen-0125c-r6m4q-worker-b-ppgq4.c.openshift-qe.internal
    - egressIP: 10.0.128.103
      node: jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc rsh test-rc-749c9
~ $ curl 10.0.0.2:8888
10.0.128.102~ $
~ $ curl 10.0.0.2:8888
10.0.128.102~ $
~ $ curl 10.0.0.2:8888
10.0.128.101~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ curl 10.0.0.2:8888
10.0.128.102~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ curl 10.0.0.2:8888
10.0.128.102~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ curl 10.0.0.2:8888
10.0.128.101~ $
~ $ curl 10.0.0.2:8888
10.0.128.101~ $
~ $ curl 10.0.0.2:8888
10.0.128.102~ $
~ $ curl 10.0.0.2:8888
10.0.128.103~ $
~ $ exit

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
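The manual check in the verification above (every address in spec.egressIPs still has a node in status.items after the reboot) could be scripted roughly as follows. This is only a sketch, not part of the original verification: it assumes the output of `oc get egressip -o json` has been saved to a file, that the JSON field paths mirror the YAML shown in this report, and the helper names and the `AFTER_REBOOT` sample are made up for illustration.

```python
import json


def load_egressips(path):
    """Load a saved `oc get egressip -o json` dump (the path is the caller's choice)."""
    with open(path) as fh:
        return json.load(fh)


def assigned_ips(egressip_list):
    """Map each EgressIP object name to {egress IP: node}, read from status.items."""
    result = {}
    for obj in egressip_list.get("items", []):
        name = obj["metadata"]["name"]
        status_items = (obj.get("status") or {}).get("items") or []
        result[name] = {item["egressIP"]: item["node"] for item in status_items}
    return result


def unassigned_ips(egressip_list):
    """Return, per object, the spec.egressIPs that have no entry in status.items."""
    assigned = assigned_ips(egressip_list)
    missing = {}
    for obj in egressip_list.get("items", []):
        name = obj["metadata"]["name"]
        wanted = obj.get("spec", {}).get("egressIPs", [])
        lost = [ip for ip in wanted if ip not in assigned[name]]
        if lost:
            missing[name] = lost
    return missing


# Sample data transcribed from the post-reboot YAML in this report
# (trimmed to the fields the helpers read).
AFTER_REBOOT = {
    "kind": "List",
    "items": [
        {
            "kind": "EgressIP",
            "metadata": {"name": "egressip1"},
            "spec": {"egressIPs": ["10.0.128.101", "10.0.128.102", "10.0.128.103"]},
            "status": {
                "items": [
                    {"egressIP": "10.0.128.102",
                     "node": "jechen-0125c-r6m4q-worker-a-6zf4c.c.openshift-qe.internal"},
                    {"egressIP": "10.0.128.101",
                     "node": "jechen-0125c-r6m4q-worker-b-ppgq4.c.openshift-qe.internal"},
                    {"egressIP": "10.0.128.103",
                     "node": "jechen-0125c-r6m4q-worker-c-b787b.c.openshift-qe.internal"},
                ]
            },
        }
    ],
}
```

With the post-reboot data from this report, `unassigned_ips(AFTER_REBOOT)` returns an empty dict, which matches the verified result. An address that lost its assignment would instead appear under its EgressIP object name.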