Bug 1955622

Summary: 4.8-e2e-metal-assisted jobs: Timeout of 360 seconds expired waiting for Cluster to be in status ['installing', 'error']
Product: OpenShift Container Platform
Reporter: Petr Muller <pmuller>
Component: assisted-installer
Assignee: Osher De Paz <odepaz>
assisted-installer sub component: Installer
QA Contact: Udi Kalifon <ukalifon>
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: unspecified
CC: aos-bugs, yobshans
Version: 4.8
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: AI-Team-Core
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The integrated oc client could not authenticate with an already authenticated registry.
Consequence: The assisted service did not proceed with the installation and regressed from the "preparing-installation" state to the "ready" state.
Fix: The change to the oc binary was reverted.
Result: Installation advances to successful completion.
Story Points: ---
Clone Of:
Environment:
job=periodic-ci-openshift-release-master-nightly-4.8-e2e-metal-assisted-ipv6=all
job=periodic-ci-openshift-release-master-nightly-4.8-e2e-metal-assisted=all
Last Closed: 2021-07-27 23:05:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Petr Muller 2021-04-30 14:42:44 UTC
Both metal-assisted jobs started timing out their installs on Apr 29:

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-nightly-4.8-e2e-metal-assisted-ipv6
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-nightly-4.8-e2e-metal-assisted

Example jobs:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.8-e2e-metal-assisted/1388098889189429248
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.8-e2e-metal-assisted-ipv6/1388098904997761024

waiting.exceptions.TimeoutExpired: Timeout of 360 seconds expired waiting for Cluster to be in status ['installing', 'error']
make: *** [Makefile:304: _deploy_nodes] Error 1
make: *** [Makefile:308: deploy_nodes_with_install] Error 2
{"component":"entrypoint","error":"wrapped process failed: exit status 2","file":"prow/entrypoint/run.go:80","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2021-04-30T12:28:28Z"}
error: failed to execute wrapped command: exit status 2
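
For readers unfamiliar with this failure mode, here is a minimal sketch of the kind of polling wait that produces the error above. It is an illustration only, not the test-infra code: the traceback suggests the real job uses the `waiting` package, while this version uses only the standard library, and `wait_for_status`, `make_status_source`, and the local `TimeoutExpired` class are hypothetical names. The simulated status sequence assumes the behavior described in the Doc Text, where the cluster drops from "preparing-installation" back to "ready" and therefore never reaches 'installing' or 'error'.

```python
import time
from typing import Callable, Iterable, List


class TimeoutExpired(Exception):
    """Hypothetical stand-in for waiting.exceptions.TimeoutExpired."""


def wait_for_status(
    get_status: Callable[[], str],
    wanted: Iterable[str],
    timeout_seconds: float = 360,
    sleep_seconds: float = 5,
) -> str:
    """Poll get_status() until it returns one of the wanted statuses."""
    wanted_list: List[str] = list(wanted)
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in wanted_list:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutExpired(
                f"Timeout of {timeout_seconds} seconds expired waiting for "
                f"Cluster to be in status {wanted_list}"
            )
        time.sleep(sleep_seconds)


def make_status_source(sequence: Iterable[str]) -> Callable[[], str]:
    """Return a callable that yields each status in turn, then repeats the last."""
    items = list(sequence)
    state = {"i": 0}

    def get_status() -> str:
        status = items[min(state["i"], len(items) - 1)]
        state["i"] += 1
        return status

    return get_status


if __name__ == "__main__":
    # Simulated regression: the cluster falls back to "ready" and stays there,
    # so the wait can only end by timing out (shortened here for the demo).
    stuck = make_status_source(["preparing-installation", "ready"])
    try:
        wait_for_status(stuck, ["installing", "error"],
                        timeout_seconds=0.05, sleep_seconds=0.01)
    except TimeoutExpired as exc:
        print(exc)
```

The point of the sketch is that the timeout is a symptom: the wait only accepts 'installing' or 'error', so a cluster that silently returns to "ready" is indistinguishable from a slow one until the deadline passes.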

Comment 2 Yuri Obshansky 2021-05-20 12:39:58 UTC
@Petr Muller
Hey, could you please review/verify the bug and change its status to VERIFIED?
Thank you

Comment 3 Petr Muller 2021-05-20 23:20:30 UTC
Isn't ON_QA state owned by QE?

Comment 4 Petr Muller 2021-05-20 23:21:13 UTC
As far as I'm concerned, this does not happen in CI jobs anymore

Comment 5 Yuri Obshansky 2021-05-21 12:18:22 UTC
Thank you @Petr Muller

Comment 8 errata-xmlrpc 2021-07-27 23:05:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438