1966488 – [master] Assisted Service operator's controllers are starting before the base service is ready

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	assisted-installer
Sub Component:
Version:	4.8
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Rom Freiman
QA Contact:	Omri Hochman
Docs Contact:
URL:
Whiteboard:	AI-Team-Hive
Depends On:
Blocks:	1968455 1969410 1971300

Reported:	2021-06-01 09:23 UTC by Osher De Paz
Modified:	2022-08-28 08:45 UTC
CC List:	3 users
Fixed In Version:	OCP-Metal-v1.0.22.1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1968455 1969410
Environment:
Last Closed:	2022-08-28 08:45:59 UTC
Target Upstream Version:
Embargoed:

Description Osher De Paz 2021-06-01 09:23:47 UTC

Description of problem:
The process of starting controllers in the assisted-service operator doesn't take into account the current state of the backend REST API service.
As a result, the controllers can start before we've uploaded the RHCOS base image, or before we've validated the pull-secret for the auxiliary images (controller, agent, installer).

How reproducible:
Timing-dependent: REST API preparation might be slow or fast, and the operator might be used at different points while it is in progress.
Pull-secret validation, however, fails consistently, so it is a good candidate for reproducing this bug.


Steps to Reproduce:
1. Start installation of the operator, using images that are not from the ordinary registries (not from quay.io, registry.redhat.io, etc.)
2. Note when the assisted-service pod actually becomes ready
3. Apply the cluster deployment with all the objects
4. Look at the agentclusterinstall conditions. They will not be resolved yet

Actual results:
The kube-api interface is ready, but the REST API is not.

Expected results:
The kube-api interface should not be ready until the REST API is.

Additional info:
This failure is an example:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_assisted-service/1811/pull-ci-openshift-assisted-service-master-e2e-metal-assisted-operator-ztp/1399582336479137792

Comment 2 Michael Filanov 2021-06-01 09:28:41 UTC

In main, we have `ApiEnabler`, which enables the API after we finish uploading the RHCOS images. I suggest adding another function to it, IsEnabled, and starting the operators only after the API is ready.
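
To illustrate the suggestion, here is a minimal sketch of an enabler that exposes an IsEnabled check. The type name `apiEnabler`, its field, and its methods are hypothetical stand-ins and are not taken from the assisted-service code; the real `ApiEnabler` may be structured differently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// apiEnabler is a hypothetical stand-in for the ApiEnabler mentioned above.
// It only tracks whether the REST API has been enabled.
type apiEnabler struct {
	enabled int32 // 0 = not enabled, 1 = enabled; accessed atomically
}

// Enable marks the REST API as ready, e.g. once the base images are uploaded.
func (a *apiEnabler) Enable() {
	atomic.StoreInt32(&a.enabled, 1)
}

// IsEnabled reports whether Enable has been called.
// It is safe to call from other goroutines.
func (a *apiEnabler) IsEnabled() bool {
	return atomic.LoadInt32(&a.enabled) == 1
}

func main() {
	enabler := &apiEnabler{}
	fmt.Println("before upload, enabled:", enabler.IsEnabled())

	// Assumption: the upload and validation steps would run here.
	enabler.Enable()
	fmt.Println("after upload, enabled:", enabler.IsEnabled())
}
```

With a check like this, the code that starts the controllers could consult IsEnabled instead of assuming that the REST API is already up.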

Comment 3 Raz Regev 2021-06-09 09:56:19 UTC

The controller will now wait for the REST API to be ready.
Merged to master, commit 7cb0e2eb5a30fee6a221f22ccba0a082492e578e
