Bug 2039656
| Summary: | [EgressIP] Configuring EgressIPs on master nodes caused etcd Degraded | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | huirwang |
| Component: | Networking | Assignee: | Patryk Diak <pdiak> |
| Networking sub component: | openshift-sdn | QA Contact: | huirwang |
| Status: | CLOSED DEFERRED | Docs Contact: | |
| Severity: | high | ||
| Priority: | unspecified | CC: | anusaxen, bpickard, dbrahane, jboxman, jechen, mifiedle, prubenda, wking |
| Version: | 4.10 | ||
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Last Closed: | 2022-02-21 08:58:45 UTC | Type: | Bug |

I've created a draft PR[0] that adds this known issue to the release notes.

[0] https://github.com/openshift/openshift-docs/pull/40711

Removing TestBlocker. @huirwang Let me know if you disagree with removing TestBlocker.

This is a documented limitation and should be addressed in a future release.

*** Bug 2050403 has been marked as a duplicate of this bug. ***

Description of problem:
Tested on an AWS SDN cluster: configuring EgressIPs on master nodes caused etcd to become Degraded.

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-01-11-065245

How reproducible:

Steps to Reproduce:

    $ oc get hostsubnet
    NAME                                         HOST                                         HOST IP        SUBNET          EGRESS CIDRS   EGRESS IPS
    ip-10-0-128-161.us-west-2.compute.internal   ip-10-0-128-161.us-west-2.compute.internal   10.0.128.161   10.128.0.0/23                  ["10.0.128.100"]
    ip-10-0-129-201.us-west-2.compute.internal   ip-10-0-129-201.us-west-2.compute.internal   10.0.129.201   10.129.2.0/23                  []
    ip-10-0-136-54.us-west-2.compute.internal    ip-10-0-136-54.us-west-2.compute.internal    10.0.136.54    10.131.0.0/23                  []
    ip-10-0-177-232.us-west-2.compute.internal   ip-10-0-177-232.us-west-2.compute.internal   10.0.177.232   10.129.0.0/23                  ["10.0.177.100"]
    ip-10-0-238-94.us-west-2.compute.internal    ip-10-0-238-94.us-west-2.compute.internal    10.0.238.94    10.130.0.0/23                  ["10.0.238.100"]
    ip-10-0-239-251.us-west-2.compute.internal   ip-10-0-239-251.us-west-2.compute.internal   10.0.239.251   10.128.2.0/23                  []

    $ oc get netnamespace test
    NAME   NETID     EGRESS IPS
    test   5166387   ["10.0.128.100","10.0.238.100","10.0.177.100"]

    $ oc get nodes
    NAME                                         STATUS   ROLES    AGE   VERSION
    ip-10-0-128-161.us-west-2.compute.internal   Ready    master   75m   v1.22.1+6859754
    ip-10-0-129-201.us-west-2.compute.internal   Ready    worker   65m   v1.22.1+6859754
    ip-10-0-136-54.us-west-2.compute.internal    Ready    worker   67m   v1.22.1+6859754
    ip-10-0-177-232.us-west-2.compute.internal   Ready    master   75m   v1.22.1+6859754
    ip-10-0-238-94.us-west-2.compute.internal    Ready    master   73m   v1.22.1+6859754
    ip-10-0-239-251.us-west-2.compute.internal   Ready    worker   65m   v1.22.1+6859754

    $ oc get co
    NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    authentication                             4.10.0-0.nightly-2022-01-11-065245   True        False         False      50m
    baremetal                                  4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    cloud-controller-manager                   4.10.0-0.nightly-2022-01-11-065245   True        False         False      68m
    cloud-credential                           4.10.0-0.nightly-2022-01-11-065245   True        False         False      68m
    cluster-autoscaler                         4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    config-operator                            4.10.0-0.nightly-2022-01-11-065245   True        False         False      67m
    console                                    4.10.0-0.nightly-2022-01-11-065245   True        False         False      54m
    csi-snapshot-controller                    4.10.0-0.nightly-2022-01-11-065245   True        False         False      67m
    dns                                        4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    etcd                                       4.10.0-0.nightly-2022-01-11-065245   True        False         True       65m     EtcdCertSignerControllerDegraded: [x509: certificate is valid for 10.0.128.161, not 10.0.128.100, x509: certificate is valid for ::1, 10.0.128.161, 127.0.0.1, ::1, not 10.0.128.100]
    image-registry                             4.10.0-0.nightly-2022-01-11-065245   True        False         False      59m
    ingress                                    4.10.0-0.nightly-2022-01-11-065245   True        False         False      58m
    insights                                   4.10.0-0.nightly-2022-01-11-065245   True        False         False      61m
    kube-apiserver                             4.10.0-0.nightly-2022-01-11-065245   True        False         False      61m
    kube-controller-manager                    4.10.0-0.nightly-2022-01-11-065245   True        False         False      65m
    kube-scheduler                             4.10.0-0.nightly-2022-01-11-065245   True        False         False      65m
    kube-storage-version-migrator              4.10.0-0.nightly-2022-01-11-065245   True        False         False      67m
    machine-api                                4.10.0-0.nightly-2022-01-11-065245   True        False         False      62m
    machine-approver                           4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    machine-config                             4.10.0-0.nightly-2022-01-11-065245   True        False         False      65m
    marketplace                                4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    monitoring                                 4.10.0-0.nightly-2022-01-11-065245   True        False         False      57m
    network                                    4.10.0-0.nightly-2022-01-11-065245   True        False         False      68m
    node-tuning                                4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    openshift-apiserver                        4.10.0-0.nightly-2022-01-11-065245   True        False         False      60m
    openshift-controller-manager               4.10.0-0.nightly-2022-01-11-065245   True        False         False      59m
    openshift-samples                          4.10.0-0.nightly-2022-01-11-065245   True        False         False      59m
    operator-lifecycle-manager                 4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    operator-lifecycle-manager-catalog         4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m
    operator-lifecycle-manager-packageserver   4.10.0-0.nightly-2022-01-11-065245   True        False         False      60m
    service-ca                                 4.10.0-0.nightly-2022-01-11-065245   True        False         False      67m
    storage                                    4.10.0-0.nightly-2022-01-11-065245   True        False         False      66m

    $ oc get co etcd -o yaml
    apiVersion: config.openshift.io/v1
    kind: ClusterOperator
    metadata:
      annotations:
        exclude.release.openshift.io/internal-openshift-hosted: "true"
        include.release.openshift.io/self-managed-high-availability: "true"
        include.release.openshift.io/single-node-developer: "true"
      creationTimestamp: "2022-01-12T06:18:07Z"
      generation: 1
      name: etcd
      ownerReferences:
      - apiVersion: config.openshift.io/v1
        kind: ClusterVersion
        name: version
        uid: 54a22c6d-0e41-4cb9-9e76-7c74e0a87ace
      resourceVersion: "46701"
      uid: 42d0b9d0-a75e-4c94-859d-0f944a02bbd9
    spec: {}
    status:
      conditions:
      - lastTransitionTime: "2022-01-12T06:20:40Z"
        reason: ControllerStarted
        status: Unknown
        type: RecentBackup
      - lastTransitionTime: "2022-01-12T07:25:51Z"
        message: 'EtcdCertSignerControllerDegraded: [x509: certificate is valid for 10.0.177.232, not 10.0.177.100, x509: certificate is valid for ::1, 10.0.177.232, 127.0.0.1, ::1, not 10.0.177.100, x509: certificate is valid for 10.0.128.161, not 10.0.128.100, x509: certificate is valid for ::1, 10.0.128.161, 127.0.0.1, ::1, not 10.0.128.100, x509: certificate is valid for 10.0.238.94, not 10.0.238.100, x509: certificate is valid for ::1, 10.0.238.94, 127.0.0.1, ::1, not 10.0.238.100]'
        reason: EtcdCertSignerController_Error
        status: "True"
        type: Degraded
      - lastTransitionTime: "2022-01-12T06:31:49Z"
        message: |-
          NodeInstallerProgressing: 3 nodes are at revision 6
          EtcdMembersProgressing: No unstarted etcd members found
        reason: AsExpected
        status: "False"
        type: Progressing
      - lastTransitionTime: "2022-01-12T06:22:41Z"
        message: |-
          StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 6
          EtcdMembersAvailable: 3 members are available
        reason: AsExpected
        status: "True"
        type: Available
      - lastTransitionTime: "2022-01-12T06:20:40Z"
        message: All is well
        reason: AsExpected
        status: "True"
        type: Upgradeable
      extension: null
      relatedObjects:
      - group: operator.openshift.io
        name: cluster
        resource: etcds
      - group: ""
        name: openshift-config
        resource: namespaces
      - group: ""
        name: openshift-config-managed
        resource: namespaces
      - group: ""
        name: openshift-etcd-operator
        resource: namespaces
      - group: ""
        name: openshift-etcd
        resource: namespaces
      versions:
      - name: raw-internal
        version: 4.10.0-0.nightly-2022-01-11-065245
      - name: etcd
        version: 4.10.0-0.nightly-2022-01-11-065245
      - name: operator
        version: 4.10.0-0.nightly-2022-01-11-065245

Actual results:

Expected results:
Should not cause etcd to become Degraded.

Additional info:
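
The pattern in the output above can be cross-checked with a small script. The following is a minimal sketch, not the etcd operator's logic: it lists which master nodes hold an egress IP and repeats the IP SAN comparison that the x509 error message reports. The data is transcribed by hand from the `oc get nodes`, `oc get hostsubnet`, and `oc get co` output in this report; the function names `masters_with_egress_ips` and `ip_in_sans` are hypothetical and exist only for illustration.

```python
from ipaddress import ip_address

# Node roles, transcribed from the `oc get nodes` output in this report.
NODE_ROLES = {
    "ip-10-0-128-161.us-west-2.compute.internal": "master",
    "ip-10-0-129-201.us-west-2.compute.internal": "worker",
    "ip-10-0-136-54.us-west-2.compute.internal": "worker",
    "ip-10-0-177-232.us-west-2.compute.internal": "master",
    "ip-10-0-238-94.us-west-2.compute.internal": "master",
    "ip-10-0-239-251.us-west-2.compute.internal": "worker",
}

# Host IP and egress IPs per node, transcribed from `oc get hostsubnet`.
HOSTSUBNETS = {
    "ip-10-0-128-161.us-west-2.compute.internal": ("10.0.128.161", ["10.0.128.100"]),
    "ip-10-0-129-201.us-west-2.compute.internal": ("10.0.129.201", []),
    "ip-10-0-136-54.us-west-2.compute.internal": ("10.0.136.54", []),
    "ip-10-0-177-232.us-west-2.compute.internal": ("10.0.177.232", ["10.0.177.100"]),
    "ip-10-0-238-94.us-west-2.compute.internal": ("10.0.238.94", ["10.0.238.100"]),
    "ip-10-0-239-251.us-west-2.compute.internal": ("10.0.239.251", []),
}

# IP SANs named in the error message for the first master node.
SANS_128_161 = ["::1", "10.0.128.161", "127.0.0.1"]


def masters_with_egress_ips(roles, hostsubnets):
    """Return {host: egress_ips} for master nodes that hold at least one egress IP."""
    return {
        host: egress_ips
        for host, (_host_ip, egress_ips) in hostsubnets.items()
        if roles.get(host) == "master" and egress_ips
    }


def ip_in_sans(candidate, sans):
    """Return True if the candidate IP equals one of the certificate's IP SANs."""
    wanted = ip_address(candidate)
    return any(wanted == ip_address(san) for san in sans)


if __name__ == "__main__":
    for host, egress_ips in masters_with_egress_ips(NODE_ROLES, HOSTSUBNETS).items():
        print(host, egress_ips)
    # The node IP is covered by the certificate; the egress IP is not.
    print(ip_in_sans("10.0.128.161", SANS_128_161))
    print(ip_in_sans("10.0.128.100", SANS_128_161))
```

In this data set every egress IP sits on a master node, and each of those egress IPs is exactly the address that the corresponding certificate is reported as not being valid for.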