Bug 1870183
Summary: | [vSphere]: Connection to server refused during installation of OCP 4.6 | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Vijay Avuthu <vavuthu> |
Component: | Installer | Assignee: | aos-install |
Installer sub component: | openshift-installer | QA Contact: | Gaoyun Pei <gpei> |
Status: | CLOSED DUPLICATE | Docs Contact: | |
Severity: | high | ||
Priority: | medium | CC: | adahiya, aos-bugs, kewang, mfojtik, sbatsche, sttts, vavuthu, xxia |
Version: | 4.6 | ||
Target Milestone: | --- | ||
Target Release: | 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Last Closed: | 2020-09-10 22:14:20 UTC | Type: | Bug |
Description
Vijay Avuthu
2020-08-19 13:38:17 UTC
[core@control-plane-0 ~]$ sudo crictl ps -a | grep kube-apiserver
5ef48d79eea3b e4d2c0a1679ffb86b584f3563ceb45d8ce5b4fe01af5faef3ac1bf0f4ce474c1 7 hours ago Running kube-apiserver-operator 2 fbcab7e08a07a
5cb454dd7dbf8 e4d2c0a1679ffb86b584f3563ceb45d8ce5b4fe01af5faef3ac1bf0f4ce474c1 7 hours ago Running kube-apiserver-check-endpoints 0 ceb051274a960
6c3189fde99f9 e4d2c0a1679ffb86b584f3563ceb45d8ce5b4fe01af5faef3ac1bf0f4ce474c1 7 hours ago Running kube-apiserver-insecure-readyz 0 ceb051274a960
7b587eaa35336 e4d2c0a1679ffb86b584f3563ceb45d8ce5b4fe01af5faef3ac1bf0f4ce474c1 7 hours ago Running kube-apiserver-cert-regeneration-controller 0 ceb051274a960
9eae13840c3f3 e4d2c0a1679ffb86b584f3563ceb45d8ce5b4fe01af5faef3ac1bf0f4ce474c1 7 hours ago Running kube-apiserver-cert-syncer 0 ceb051274a960
e857e4f32a133 805e2144af41b2f76f4c5fd8f8eac33a7cb16357cfddca7d3c6f6c23bd3bf9eb 7 hours ago Running kube-apiserver 0 ceb051274a960
2b7b912fe78a4 e4d2c0a1679ffb86b584f3563ceb45d8ce5b4fe01af5faef3ac1bf0f4ce474c1 7 hours ago Exited kube-apiserver-operator 1 fbcab7e08a07a
[core@control-plane-0 ~]$

> Errors in the kube-apiserver logs:

W0819 06:51:39.178529 18 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {https://10.1.160.27:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.1.160.27:2379: connect: connection refused". Reconnecting...
I0819 06:51:39.178570 18 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc0010e7b20, {TRANSIENT_FAILURE connection error: desc = "transport: Error while dialing dial tcp 10.1.160.27:2379: connect: connection refused"}
I0819 06:51:39.178720 18 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc000660980, {CONNECTING <nil>}
W0819 06:51:39.178824 18 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {https://localhost:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp [::1]:2379: connect: connection refused". Reconnecting...
I0819 06:51:39.178882 18 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc000660980, {TRANSIENT_FAILURE connection error: desc = "transport: Error while dialing dial tcp [::1]:2379: connect: connection refused"}
W0819 06:51:39.188268 18 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {https://10.1.160.27:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.1.160.27:2379: connect: connection refused". Reconnecting...

> The kube-apiserver and kube-apiserver-cert-syncer logs are uploaded to http://rhsqe-repo.lab.eng.blr.redhat.com/ocs4qe/vavuthu/bug1870183/

> After bootstrapping completes and the bootstrap node is removed, fetching CSRs fails with a connection refused error.
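For triage, it can help to reduce log output like the above to the list of endpoints that refused connections. The following is a minimal sketch, not part of the original report: the function name `refused_endpoints` is hypothetical, and it assumes the log lines use the `dial tcp <host:port>: connect: connection refused` wording seen in this bug.

```shell
#!/bin/sh
# Hypothetical helper: read kube-apiserver log text on stdin and print each
# endpoint that refused a connection, followed by how many times it appeared.
refused_endpoints() {
  grep -o 'dial tcp [^ ]*: connect: connection refused' \
    | sed -e 's/^dial tcp //' -e 's/: connect: connection refused$//' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{print $2, $1}'
}

# Example usage on a control-plane node (container ID taken from the
# crictl output in this report; yours will differ):
#   sudo crictl logs e857e4f32a133 2>&1 | refused_endpoints
```

If every reported endpoint is on port 2379, as in this report, the failures point at etcd rather than at the API server's own listener.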
After bootstrapping is finished, the installer is no longer involved in keeping the API running, so I am moving this to the API server team to triage why the API server is not running.
The connection refused part of the issue is addressed in https://github.com/openshift/installer/pull/4012. The root cause is most probably etcd, triggering the haproxy issue fixed in that PR.
> 4.6.0-0.nightly-2020-08-18-165040
We had some performance issues with the 4.6 CI nightlies around this time, which were resolved in more recent builds. Can you please try a more recent nightly and let us know if the problem still exists?
We will also need access to the cluster or a log bundle to debug.
$ openshift-install gather bootstrap --bootstrap $BOOTSTRAP_IP --master $MASTER0_IP --master $MASTER1_IP --master $MASTER2_IP
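Since `--master` is repeated once per control-plane node, the command line can be assembled from a list of addresses. This is a minimal sketch under stated assumptions: the function name `build_gather_cmd` is hypothetical, and the addresses in the usage comment are documentation-range placeholders, not values from this cluster. It only prints the command, so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: print an `openshift-install gather bootstrap` command
# for one bootstrap IP (first argument) and any number of master IPs.
build_gather_cmd() {
  bootstrap_ip=$1
  shift
  cmd="openshift-install gather bootstrap --bootstrap $bootstrap_ip"
  for master_ip in "$@"; do
    cmd="$cmd --master $master_ip"
  done
  printf '%s\n' "$cmd"
}

# Example usage (placeholder addresses):
#   build_gather_cmd 192.0.2.10 192.0.2.11 192.0.2.12 192.0.2.13
```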
Based on Comment 5, this looks like it will be fixed by moving the installer to /readyz for vSphere UPI.
*** This bug has been marked as a duplicate of bug 1836017 ***
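For anyone reproducing this, a manual probe of the /readyz endpoint shows whether the API server itself reports ready. The sketch below is an illustration, not the installer's actual health check: `readyz_state` and `API_URL` are hypothetical names. It assumes that the endpoint answers 200 when ready, and it relies on curl printing 000 for `%{http_code}` when no HTTP response was received (for example on connection refused).

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code from /readyz to a short state.
readyz_state() {
  case "$1" in
    200) echo ready ;;
    000) echo unreachable ;;   # curl got no HTTP response at all
    *)   echo not-ready ;;
  esac
}

# Example usage (API_URL is an assumption, e.g. https://api.<cluster>:6443):
#   code=$(curl -sk -o /dev/null -w '%{http_code}' "$API_URL/readyz")
#   readyz_state "$code"
```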