Cause: Due to the performance variability of the OpenStack clouds where OpenShift can be installed, the installation times can be unpredictable.
Consequence: The installer might time out even though the installation would eventually converge to a working state.
Workaround (if any): Wait after the installer reports a failure, then check the cluster. It might be perfectly healthy.
Result: The cluster might reach a perfectly healthy state even after the installer has timed out.
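As a rough illustration of the "check the cluster" step in the workaround, here is a minimal Python sketch. It assumes the `oc` client is on the PATH and that KUBECONFIG points at the new cluster. The function names and the choice of the Available/Progressing/Degraded conditions as the health criterion are assumptions made for this example, not something defined by the installer.

```python
import json
import subprocess

# Assumed health criterion: every ClusterOperator reports these condition values.
EXPECTED = {"Available": "True", "Progressing": "False", "Degraded": "False"}


def unsettled_operators(operator_list):
    """Return the names of operators whose conditions differ from EXPECTED."""
    unsettled = []
    for item in operator_list.get("items", []):
        conditions = {
            c["type"]: c["status"]
            for c in item.get("status", {}).get("conditions", [])
        }
        # A missing condition counts as "not settled yet".
        if any(conditions.get(k) != v for k, v in EXPECTED.items()):
            unsettled.append(item["metadata"]["name"])
    return unsettled


def fetch_operator_list():
    """Read all ClusterOperators as JSON through the `oc` CLI."""
    result = subprocess.run(
        ["oc", "get", "clusteroperators", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def report():
    """Print a one-line summary of the cluster state."""
    pending = unsettled_operators(fetch_operator_list())
    if pending:
        print("still settling:", ", ".join(pending))
    else:
        print("all cluster operators look healthy")
```

Calling `report()` a few times after the installer has given up shows whether the list of unsettled operators is shrinking. Conditions outside the three checked here, such as the "Disabled" condition of the insights operator, are ignored by this sketch.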
The reported debug message "Cluster operator insights Disabled is False with :" does not seem related to the problem.
Depending on the infrastructure performance, the installation can take longer than the installer's global timeout; the cluster still converges to a healthy state independently of the installer itself.
(In reply to szacks from comment #9)
> By adding it as a known problem and marking it as Verified, does that mean
> that you are not planning on fixing it by allowing a timeout parameter (for
> example)?
The timeout logic is defined at the orchestration level: a functional area that goes beyond the scope of OpenShift-on-OpenStack. Making changes at that level requires coordination and perseverance.
However, we recognize the timeout problem and we plan to tackle it in the context of our upcoming "baremetal workers" epic (which is planned for 4.6).
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:2409