Created attachment 1675399 [details]
flexy-console.log

Description of problem:
I am using the flexy installer to deploy OCP-4.4 on RHOS. The cluster appears to be healthy and functional, but the installer fails on a timeout anyway. Log snippets are included below; full logs are attached and linked under Additional info.

11:29:15 level=info msg="API v1.17.1 up"
11:29:15 level=info msg="Waiting up to 40m0s for bootstrapping to complete..."
12:09:23 level=info msg="Cluster operator insights Disabled is False with : "
.... TRIMMED ....
12:09:57 level=debug msg="Log bundle written to /var/home/core/log-bundle-20200401100930.tar.gz"
12:09:57 level=info msg="Bootstrap gather logs captured here \"install-dir/log-bundle-20200401100930.tar.gz\""
12:09:57 level=fatal msg="Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition"
12:09:57 tools/launch_instance.rb:623:in `installation_task': shell command failed execution, see logs (RuntimeError)

[fedora@flexy-executor-2 private-flexy-example]$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-rc.4   True        False         90m     Cluster version is 4.4.0-rc.4

[fedora@flexy-executor-2 private-flexy-example]$ oc get clusteroperators
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.4.0-rc.4   True        False         False      90m
cloud-credential                           4.4.0-rc.4   True        False         False      108m
cluster-autoscaler                         4.4.0-rc.4   True        False         False      98m
console                                    4.4.0-rc.4   True        False         False      92m
csi-snapshot-controller                    4.4.0-rc.4   True        False         False      96m
dns                                        4.4.0-rc.4   True        False         False      103m
etcd                                       4.4.0-rc.4   True        False         False      103m
image-registry                             4.4.0-rc.4   True        False         False      96m
ingress                                    4.4.0-rc.4   True        False         False      96m
insights                                   4.4.0-rc.4   True        False         False      100m
kube-apiserver                             4.4.0-rc.4   True        False         False      102m
kube-controller-manager                    4.4.0-rc.4   True        False         False      101m
kube-scheduler                             4.4.0-rc.4   True        False         False      102m
kube-storage-version-migrator              4.4.0-rc.4   True        False         False      96m
machine-api                                4.4.0-rc.4   True        False         False      104m
machine-config                             4.4.0-rc.4   True        False         False      103m
marketplace                                4.4.0-rc.4   True        False         False      100m
monitoring                                 4.4.0-rc.4   True        False         False      94m
network                                    4.4.0-rc.4   True        False         False      103m
node-tuning                                4.4.0-rc.4   True        False         False      104m
openshift-apiserver                        4.4.0-rc.4   True        False         False      96m
openshift-controller-manager               4.4.0-rc.4   True        False         False      99m
openshift-samples                          4.4.0-rc.4   True        False         False      97m
operator-lifecycle-manager                 4.4.0-rc.4   True        False         False      104m
operator-lifecycle-manager-catalog         4.4.0-rc.4   True        False         False      104m
operator-lifecycle-manager-packageserver   4.4.0-rc.4   True        False         False      99m
service-ca                                 4.4.0-rc.4   True        False         False      104m
service-catalog-apiserver                  4.4.0-rc.4   True        False         False      104m
service-catalog-controller-manager         4.4.0-rc.4   True        False         False      104m
storage                                    4.4.0-rc.4   True        False         False      100m

Version-Release number of the following components:
4.4.0-rc.4

How reproducible:
100%

Additional info:
http://file.rdu.redhat.com/lbednar/log-bundle-20200401100930.tar.gz
http://file.rdu.redhat.com/lbednar/must-gather.local.9180392784441685071.tag.gz
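For what it's worth, after the installer exits on this timeout its wait phase can be re-entered manually instead of re-installing; a minimal sketch, assuming the install directory used above is named install-dir:

  # Re-run only the installer's wait logic against the existing cluster;
  # these subcommands poll the same conditions the failed run was waiting on.
  $ openshift-install wait-for bootstrap-complete --dir install-dir --log-level debug
  $ openshift-install wait-for install-complete --dir install-dir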
I was able to reproduce on 4.4.0-rc.8 as well.
The reported debug message "Cluster operator insights Disabled is False with :" does not seem related to the problem. Depending on infrastructure performance, the installation can take longer than the installer's global timeout; the cluster still converges to a healthy state independently of the installer itself.
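A quick way to confirm that the cluster did converge after the installer gave up is to wait on the operator conditions directly; a sketch (the 30m timeout here is an arbitrary illustration, not an installer default):

  # Block until every cluster operator reports Available=True, or time out.
  $ oc wait clusteroperators --all --for=condition=Available --timeout=30m
  # Likewise for the overall cluster version object.
  $ oc wait clusterversion/version --for=condition=Available --timeout=30m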
The GitHub PR adds a new "known issues" section dedicated to this problem.
Verified, as this is a documentation addition.
(In reply to szacks from comment #9)
> By adding it as a known problem and marking it as Verified, does that mean
> that you are not planning on fixing it by allowing a timeout parameter (for
> example)?

The timeout logic is defined at the orchestration level, a functional area that goes beyond the scope of OpenShift-on-OpenStack, and making changes at that level requires coordination and perseverance. However, we are aware of the timeout problem and plan to tackle it in the context of our upcoming "baremetal workers" epic, which is planned for 4.6.
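Until such a parameter exists, a stopgap on the orchestration side is to re-enter the installer's wait phase whenever an attempt times out, which effectively multiplies the timeout; a minimal sketch, again assuming an install directory named install-dir:

  # Stopgap wrapper: retry the install-complete wait up to three times,
  # roughly tripling the time the cluster is given to converge.
  for attempt in 1 2 3; do
    openshift-install wait-for install-complete --dir install-dir && break
  done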
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409