|Summary:||Allow increasing the timeout value to accommodate slower bare-metal nodes in deployment|
|Product:||OpenShift Container Platform||Reporter:||Chris Janiszewski <cjanisze>|
|Component:||Installer||Assignee:||Pierre Prinetti <pprinett>|
|Installer sub component:||OpenShift on OpenStack||QA Contact:||David Sanz <dsanzmor>|
|Status:||CLOSED WONTFIX||Docs Contact:|
|Priority:||medium||CC:||akostadi, m.andre, nsatsia, pprinett, racedoro|
|Fixed In Version:||Doc Type:||If docs needed, set a value|
|Doc Text:||Story Points:||---|
|Last Closed:||2020-08-06 15:44:42 UTC||Type:||Bug|
Description Chris Janiszewski 2020-06-04 14:21:53 UTC
Comment 6 Martin André 2020-06-05 17:22:06 UTC
Just a note that bare-metal (BM) deployments use a 60-minute timeout instead of 30 to accommodate the longer boot time of BM nodes, and are *still* hitting the timeout occasionally. https://github.com/openshift/installer/blob/3d6f27a/cmd/openshift-install/create.go#L348
Comment 8 Aleksandar Kostadinov 2020-06-15 14:04:56 UTC
Hello, while on it, could you make the installation timeouts configurable through a configuration file, environment variables, or flags? I see that, depending on the installation type (even non-bare-metal) and the properties of the underlying infrastructure, the installation can time out. This was already discussed in bug 1819746, which resulted in a documentation change. But for QE, which performs many kinds of temporary installations on different infrastructure, having manual steps in the process is not a viable option. On the other hand, ignoring a failed installer execution and checking the cluster later in an automated fashion leaves room for false positives. Thank you.
Comment 10 Pierre Prinetti 2020-08-06 15:44:42 UTC
Unfortunately, we are not able to provide a solution in code at this stage for the installer's `wait-for install-complete`. We are documenting a workaround for attaching bare metal machines in this PR: https://github.com/openshift/installer/pull/3955 Day 2 operations should be covered by this patch, which increases the waiting time for CSRs to two hours: https://github.com/openshift/cluster-machine-approver/pull/37
Comment 11 Aleksandar Kostadinov 2020-08-06 18:58:22 UTC
> we are not able to provide a solution in code at this stage

Pierre, could you clarify? Does it mean we drop the feature for the current version, or do we plan to leave it as is for the foreseeable future?

In case we only drop the feature for the current version, how about keeping the issue open and changing the target version?
Comment 12 Pierre Prinetti 2020-08-07 10:26:25 UTC
(In reply to Aleksandar Kostadinov from comment #11)
> > we are not able to provide a solution in code at this stage
>
> Pierre, could you clarify? Does it mean we drop the feature for the current
> version, or do we plan to leave it as is for the foreseeable future?
>
> In case we only drop the feature for the current version, how about keeping
> the issue open and changing the target version?

The problem at hand ("the Installer timeout expires before installation is complete") has an easy workaround ("Just run `openshift-install wait-for install-complete` again").

I personally think that the best way forward would be to let the user customise the timeout duration, for example with a command-line flag. However, since this is a change to the Installer (as opposed to a platform-specific change), it requires a degree of coordination that is hard to obtain with a low-priority bug.

We can have a discussion with the Installer team by treating the change as a feature, rather than a bug, for an upcoming release. The first step in this direction may be to open an issue, or a pull request, in github.com/openshift/enhancements.