Bug 1707478

Summary: failed to initialize the cluster: waiting on console: timed out waiting for the condition
Product: OpenShift Container Platform
Reporter: Samuel Padgett <spadgett>
Component: Management Console
Assignee: bpeterse
Status: CLOSED CURRENTRELEASE
QA Contact: Yadan Pei <yapei>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, jokerman, mmccomas, yapei
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-05-10 11:56:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Samuel Padgett 2019-05-07 15:25:31 UTC
Console does not have the ClusterOperator resource last in its manifests, which can cause console install to occasionally fail.

Comment 1 Samuel Padgett 2019-05-07 15:26:37 UTC
https://github.com/openshift/console-operator/pull/226

Comment 3 Yadan Pei 2019-05-08 07:33:14 UTC
$ oc adm release info registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-08-012425 --commits  | grep console-operator
  console-operator                              https://github.com/openshift/console-operator                              324d5e1d70307aea834af6c8d691c134f4b6e58a
$ git log 324d5e1d70307aea834af6c8d691c134f4b6e58a | grep '#226'   # PR #226 is included in this commit
    Merge pull request #226 from benjaminapetersen/manifests/ordering
$ oc adm release extract --from=registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-08-012425 --to=payload-0508012425
Extracted release payload from digest sha256:bcdd49cffd3c1c1029336d4eda2feabf0e21e1a97a77a8b24911fb9d720b7d6b created at 2019-05-08T01:26:51Z
$ cd payload-0508012425/
$ ls -l 0000_50_console-operator*
-rw-r-----. 1 yapei yapei 1272 May  8 02:59 0000_50_console-operator_00-crd-operator-config.yaml
-rw-r-----. 1 yapei yapei  187 May  8 02:59 0000_50_console-operator_01-oauth.yaml
-rw-r-----. 1 yapei yapei  170 May  8 02:59 0000_50_console-operator_01-operator-config.yaml
-rw-r-----. 1 yapei yapei  247 May  8 02:59 0000_50_console-operator_02-namespace.yaml
-rw-r-----. 1 yapei yapei  704 May  8 02:59 0000_50_console-operator_03-rbac-role-cluster.yaml
-rw-r-----. 1 yapei yapei  923 May  8 02:59 0000_50_console-operator_03-rbac-role-ns-console.yaml
-rw-r-----. 1 yapei yapei  555 May  8 02:59 0000_50_console-operator_03-rbac-role-ns-openshift-config-managed.yaml
-rw-r-----. 1 yapei yapei  453 May  8 02:59 0000_50_console-operator_03-rbac-role-ns-operator.yaml
-rw-r-----. 1 yapei yapei 1523 May  8 02:59 0000_50_console-operator_04-rbac-rolebinding.yaml
-rw-r-----. 1 yapei yapei  164 May  8 02:59 0000_50_console-operator_05-config-public.yaml
-rw-r-----. 1 yapei yapei  371 May  8 02:59 0000_50_console-operator_05-config.yaml
-rw-r-----. 1 yapei yapei  111 May  8 02:59 0000_50_console-operator_06-sa.yaml
-rw-r-----. 1 yapei yapei 3265 May  8 02:59 0000_50_console-operator_07-downloads-deployment.yaml
-rw-r-----. 1 yapei yapei  236 May  8 02:59 0000_50_console-operator_07-downloads-route.yaml
-rw-r-----. 1 yapei yapei  224 May  8 02:59 0000_50_console-operator_07-downloads-service.yaml
-rw-r-----. 1 yapei yapei 2015 May  8 02:59 0000_50_console-operator_07-operator.yaml
-rw-r-----. 1 yapei yapei  184 May  8 02:59 0000_50_console-operator_08-clusteroperator.yaml

0000_50_console-operator_08-clusteroperator.yaml is the last resource to be created. 
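The same check can be scripted rather than read off the listing. This is a minimal sketch, assuming manifests are applied in lexical filename order within the payload; the helper names `last_manifest` and `clusteroperator_is_last` are hypothetical and not part of `oc` or the console-operator repository.

```shell
# Hypothetical helper: print the manifest that sorts last for a given prefix.
# Assumption: manifests are applied in lexical filename order.
last_manifest() {
  # $1 = extracted payload directory, $2 = filename prefix
  for f in "$1"/"$2"*; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] && printf '%s\n' "${f##*/}"
  done | LC_ALL=C sort | tail -n 1   # byte-wise sort, independent of locale
}

# Hypothetical check: succeed only if the last manifest is the ClusterOperator.
clusteroperator_is_last() {
  case "$(last_manifest "$1" "$2")" in
    *clusteroperator.yaml) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `clusteroperator_is_last payload-0508012425 0000_50_console-operator && echo ok` would print `ok` for the listing above. Like the manual check, it confirms only the ordering in the payload, not that the intermittent install failure is gone.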

Since the failure is intermittent, I can't say it's fixed even if a cluster is created successfully, so I checked the new order of the resources defined in the payload instead. Sam, is this enough to verify the fix?

Comment 4 Yadan Pei 2019-05-10 03:29:10 UTC
Moving to VERIFIED based on QE's assessment. Feel free to ask for further verification if needed.

Comment 5 Samuel Padgett 2019-05-10 11:56:13 UTC
(In reply to Yadan Pei from comment #3)
> 
> Since the failure is intermittent, I can't say it's fixed even if a
> cluster is created successfully, so I checked the new order of the
> resources defined in the payload instead. Sam, is this enough to verify the fix?

I think that's reasonable. We also have not seen reports of this in CI since the fix merged.