Bug 2043738
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | SNO Installation on 4.10-fc.2 Does Not Complete | | |
| Product: | OpenShift Container Platform | Reporter: | Benjamin Schmaus <bschmaus> |
| Component: | Networking | Assignee: | Ben Nemec <bnemec> |
| Networking sub component: | runtime-cfg | QA Contact: | Victor Voronkov <vvoronko> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | medium | | |
| Priority: | unspecified | CC: | aos-bugs, calfonso, ercohen, htariq, rfreiman |
| Version: | 4.10 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-02-02 21:02:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | Log Bundle (attachment 1852619) | | |
kube-apiserver log:
W0121 20:46:42.177289 18 patch_genericapiserver.go:130] Request to "/apis/rbac.authorization.k8s.io/v1/namespaces/openshift-authentication-operator/rolebindings" (source IP 10.128.0.38:44152, user agent "openshift-controller-manager/v0.0.0 (linux/amd64) kubernetes/$Format/system:serviceaccount:openshift-infra:default-rolebindings-controller") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.186751 18 patch_genericapiserver.go:130] Request to "/apis/apps/v1/namespaces/openshift-monitoring/deployments/prometheus-adapter" (source IP 10.128.0.20:52770, user agent "Go-http-client/2.0") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.197858 18 patch_genericapiserver.go:130] Request to "/apis/monitoring.coreos.com/v1" (source IP 10.128.0.41:45058, user agent "ingress-operator/v0.0.0 (linux/amd64) kubernetes/$Format") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.199464 18 patch_genericapiserver.go:130] Request to "/api/v1/namespaces/openshift-kube-controller-manager/configmaps/config" (source IP 10.128.0.13:37526, user agent "Go-http-client/2.0") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.211012 18 patch_genericapiserver.go:130] Request to "/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations" (source IP 10.128.0.46:56426, user agent "olm/v0.0.0 (linux/amd64) kubernetes/$Format") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.212543 18 patch_genericapiserver.go:130] Request to "/apis/config.openshift.io/v1/clusteroperators/marketplace" (source IP 10.128.0.6:50544, user agent "marketplace-operator/v0.0.0 (linux/amd64) kubernetes/$Format") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.212837 18 patch_genericapiserver.go:130] Request to "/apis/operators.coreos.com/v1/namespaces/openshift-operator-lifecycle-manager/operatorgroups" (source IP 10.128.0.46:56426, user agent "olm/v0.0.0 (linux/amd64) kubernetes/$Format") before server is ready, possibly a sign for a broken load balancer setup.
W0121 20:46:42.212889 18 patch_genericapiserver.go:130] Request to "/apis/config.openshift.io/v1/clusteroperators/operator-lifecycle-manager-packageserver" (source IP 10.128.0.46:56426, user agent "olm/v0.0.0 (linux/amd64) kubernetes/$Format") before server is ready, possibly a sign for a broken load balancer setup.
I0121 20:46:42.213916 18 genericapiserver.go:812] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-r740", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'KubeAPIReadyz' readyz=true
W0121 20:46:42.549578 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
W0121 20:46:43.348519 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
W0121 20:46:43.408224 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
W0121 20:46:43.501701 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
W0121 20:46:44.382918 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
W0121 20:46:44.680172 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
W0121 20:46:45.436591 18 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {10.11.176.112:2379 10.11.176.112 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for ::1, 10.11.176.230, 127.0.0.1, ::1, not 10.11.176.112". Reconnecting...
It appears that the kube-apiserver is failing to communicate with etcd: the etcd serving certificate is valid for ::1, 10.11.176.230 and 127.0.0.1, but not for 10.11.176.112, the address the apiserver is dialing on port 2379.
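One way to double-check the mismatch (illustrative commands only, not from the original report): compare the addresses the node actually holds with the SANs in the etcd serving certificate. The certificate path below is an assumption based on the usual OpenShift 4.x etcd static-pod layout and the node name r740; adjust both as needed.

$ oc debug node/r740 -- chroot /host ip -brief addr
$ oc debug node/r740 -- chroot /host sh -c \
    'openssl x509 -noout -text \
     -in /etc/kubernetes/static-pod-resources/etcd-certs/secrets/etcd-all-certs/etcd-serving-r740.crt \
     | grep -A1 "Subject Alternative Name"'

If the node's current IP really is 10.11.176.112 while the certificate only covers 10.11.176.230, the node's address would have changed after the etcd certificates were generated, which would explain the repeated handshake failures above.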
Created attachment 1852619 [details]
Log Bundle

Version:
$ openshift-install version
4.10-fc.2

Platform:
Baremetal SNO

Please specify:
IPI

What happened?
Installation fails to complete; many cluster operators appear unable to progress. Log bundle attached.

What did you expect to happen?
Installation to complete.

How to reproduce it (as minimally and precisely as possible)?
Use AI or Platform None to create a bootable ISO that will create a SNO node on baremetal.

Anything else we need to know?

oc get co
NAME                                       VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-fc.2   False       False         True       43m     APIServerDeploymentAvailable: no apiserver.openshift-oauth-apiserver pods available on any node....
baremetal                                  4.10.0-fc.2   True        False         False      35m
cloud-controller-manager                   4.10.0-fc.2   True        False         False      35m
cloud-credential                           4.10.0-fc.2   True        False         False      43m
cluster-autoscaler                         4.10.0-fc.2   True        False         False      35m
config-operator                            4.10.0-fc.2   True        False         False      44m
console
csi-snapshot-controller                    4.10.0-fc.2   True        False         False      43m
dns                                        4.10.0-fc.2   True        False         False      35m
etcd                                       4.10.0-fc.2   True        False         True       33m     StaticPodsDegraded: pod/etcd-r740 container "etcd-health-monitor" is waiting: CrashLoopBackOff: back-off 5m0s restarting failed container=etcd-health-monitor pod=etcd-r740_openshift-etcd(226fae129393e3068efeb4a516de88f9)
image-registry
ingress                                                  Unknown     True          Unknown    35m     Not all ingress controllers are available.
insights                                   4.10.0-fc.2   True        False         False      30m
kube-apiserver                             4.10.0-fc.2   True        False         False      29m
kube-controller-manager                    4.10.0-fc.2   True        False         False      30m
kube-scheduler                             4.10.0-fc.2   True        False         False      33m
kube-storage-version-migrator              4.10.0-fc.2   True        False         False      43m
machine-api                                4.10.0-fc.2   True        False         False      35m
machine-approver
machine-config                                           True        True          True       33m     Unable to apply 4.10.0-fc.2: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: configuration status for pool master is empty: 0 (ready 0) out of 1 nodes are updating to latest configuration rendered-master-8096224d6931030961556cb3ce316af7, retrying
marketplace                                4.10.0-fc.2   True        False         False      43m
monitoring                                               False       False         True       42m     Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
network                                    4.10.0-fc.2   True        False         False      44m
node-tuning                                4.10.0-fc.2   True        False         False      43m
openshift-apiserver                        4.10.0-fc.2   False       False         True       43m     APIServerDeploymentAvailable: no apiserver.openshift-apiserver pods available on any node....
openshift-controller-manager               4.10.0-fc.2   True        False         False      29m
openshift-samples
operator-lifecycle-manager                 4.10.0-fc.2   True        False         False      35m
operator-lifecycle-manager-catalog         4.10.0-fc.2   True        False         False      35m
operator-lifecycle-manager-packageserver   4.10.0-fc.2   True        False         False      30m
service-ca                                 4.10.0-fc.2   True        False         False      43m
storage                                    4.10.0-fc.2   True        False         False      35m
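For completeness, a few illustrative commands (not part of the original report) that could be used to gather more detail on the degraded etcd operator and the crash-looping etcd-health-monitor container; the pod and container names are taken from the status output above:

$ oc -n openshift-etcd get pods -o wide
$ oc -n openshift-etcd logs etcd-r740 -c etcd-health-monitor --previous
$ oc -n openshift-kube-apiserver logs kube-apiserver-r740 | grep 2379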