Description of problem:
Not able to bring up SDN or OVN clusters on vSphere. I suspect an installer issue: the installer now waits 20 minutes for the Kubernetes API to become ready, which might not be enough time compared with the 30 minutes allowed in earlier releases. Is this intentional in 4.4, or an error?

level=info msg="Waiting up to 20m0s for the Kubernetes API at https://api.qe-anusaxen-vs1.qe.devcluster.openshift.com:6443..."
level=error msg="Attempted to gather ClusterOperator status after wait failure: listing ClusterOperator objects: Get https://api.qe-anusaxen-vs1.qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusteroperators: dial tcp 139.178.76.10:6443: i/o timeout"
level=info msg="Use the following commands to gather logs from the cluster"
level=info msg="openshift-install gather bootstrap --help"
level=fatal msg="waiting for Kubernetes API: context deadline exceeded"
+ exit 3

Version-Release number of the following components:
4.4.0-0.nightly-2020-02-18-200822

How reproducible:
Always

Steps to Reproduce:
1. Bring up an OCP cluster on vSphere

Actual results:

Expected results:
Cluster should come up fine on the vSphere environment

Additional info:
> Suspecting installer issue as i noticed the installer now waits 20 minutes for kube api to get ready might not be giving enough time as opposed to earlier releases 30 min time. Is it intentional on 4.4 or an error?

This is intentional in 4.4.

> level=info msg="Use the following commands to gather logs from the cluster"
> level=info msg="openshift-install gather bootstrap --help"
> level=fatal msg="waiting for Kubernetes API: context deadline exceeded"

Debugging requires that you attach the log bundle as requested by the installer.
Thanks for confirming. I will try to gather more logs on this.
@Abhinav, I can share the bootstrap node IP with you to look at. Please ping me when you are in. Thanks
Thanks @Joseph for referencing the PR. I will discuss this with the installer QE team to find out more.
We hit this 2 days ago in QE's CI tests. The failure is caused by another known issue, https://bugzilla.redhat.com/show_bug.cgi?id=1804032. It is not an installer issue.
Depends on Bug 1798945, as discussed with Jainlin/Jia from the installer team. This is an etcd operator issue.
Joseph, should the target version be 4.4?
Was the close a mistake? What is the current status?

I have installed OCP 4.4 on vSphere with UPI with no problems (after the etcd operator issue was resolved). CI [0] has flakes (not installer related) and is passing at 50%.

[0] - https://prow.svc.ci.openshift.org/?job=*vsphere*4.4
(Yep, the close was a mistake.)

Joseph,
The root cause seems to be the broken RHCOS boot image (rhcos-44.81.202002071430-0), which is being tracked in Bug 1804032. If we use an older RHCOS boot image instead (such as rhcos-44.81.202001241431.0), we hit another known issue: https://bugzilla.redhat.com/show_bug.cgi?id=1798945#c8

So this is apparently an etcd + RHCOS component issue, not an installer issue.
Why do we have this BZ when the issue is with components other than the installer? Is there an issue with vSphere UPI that I can help with? If not this really should be closed.
Joseph, yes, we can change this to "closed Duplicate of 1798945". That's what I am hitting.
*** This bug has been marked as a duplicate of bug 1798945 ***