Bug 2055415 - Possible race condition in console operator managed cluster sync
Summary: Possible race condition in console operator managed cluster sync
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Jon Jackson
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-16 21:14 UTC by Jon Jackson
Modified: 2022-11-22 18:28 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-22 18:28:24 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Issue Tracker OCPBUGS-4008 (Last Updated: 2022-11-22 18:28:23 UTC)

Description Jon Jackson 2022-02-16 21:14:09 UTC
Description of problem:
There is a possible race condition in the console operator: the managed cluster config can be updated after the console deployment has rolled out, and the update does not trigger a new rollout.

Version-Release number of selected component (if applicable):
4.10+

How reproducible:
Rarely

Steps to Reproduce:
1. Enable the multicluster tech preview by adding the TechPreviewNoUpgrade featureSet to the FeatureGate config, as in the sketch after this list. (NOTE: THIS ACTION IS IRREVERSIBLE AND WILL MAKE THE CLUSTER UNUPGRADEABLE AND UNSUPPORTED)
2. Install ACM 2.5+
3. Import a managed cluster using either the ACM console or the CLI
4. Once the first managed cluster appears in the cluster dropdown, import a second managed cluster
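
For reference, a minimal Go sketch of step 1 using the OpenShift config clientset (module paths and error handling are illustrative; patching the FeatureGate with oc works just as well):

package main

import (
	"context"

	configv1 "github.com/openshift/api/config/v1"
	configclient "github.com/openshift/client-go/config/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	restConfig, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := configclient.NewForConfigOrDie(restConfig)

	// WARNING: enabling TechPreviewNoUpgrade is irreversible and makes the
	// cluster unupgradeable and unsupported, as noted in step 1 above.
	fg, err := client.ConfigV1().FeatureGates().Get(context.TODO(), "cluster", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	fg.Spec.FeatureSet = configv1.TechPreviewNoUpgrade
	if _, err := client.ConfigV1().FeatureGates().Update(context.TODO(), fg, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}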

Actual results:
Sometimes the second managed cluster will never show up in the cluster dropdown

Expected results:
The second managed cluster eventually shows up in the cluster dropdown after a page refresh


Additional info:
The workaround is to delete the console pod to force a rollout. I suspect the problem is that sometimes the deployment rolls out a new pod before the managed cluster config has been updated, so the new cluster config doesn't get parsed. The subsequent update to the managed cluster config map doesn't trigger another rollout, so the new config is never consumed until a rollout is forced.
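
For illustration, a common way to close this kind of window is to stamp a checksum of the managed cluster config onto the console deployment's pod template, so that any ConfigMap change forces a new rollout. A minimal Go sketch of that pattern (the annotation key and function are hypothetical, not the operator's actual code):

package main

import (
	"crypto/sha256"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// annotateWithConfigHash stamps a checksum of the managed cluster config onto
// the console deployment's pod template. If the ConfigMap content changes,
// the annotation changes with it, the pod template no longer matches the
// running ReplicaSet, and the Deployment controller rolls out new pods.
func annotateWithConfigHash(d *appsv1.Deployment, cm *corev1.ConfigMap) {
	sum := sha256.Sum256([]byte(cm.Data["managed-clusters.yaml"]))
	if d.Spec.Template.Annotations == nil {
		d.Spec.Template.Annotations = map[string]string{}
	}
	// Hypothetical annotation key, for illustration only.
	d.Spec.Template.Annotations["console.openshift.io/managed-cluster-config-hash"] = fmt.Sprintf("%x", sum)
}

func main() {
	cm := &corev1.ConfigMap{Data: map[string]string{
		"managed-clusters.yaml": "clusters:\n- name: cluster-a\n",
	}}
	d := &appsv1.Deployment{}
	annotateWithConfigHash(d, cm)
	fmt.Println(d.Spec.Template.Annotations)
}

With something like this in place, a config update that lands after a rollout still changes the pod template, so the deployment rolls out again and the new config gets consumed.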

Comment 1 Jakub Hadvig 2022-02-17 11:34:38 UTC
Jon, I have a feeling that we are not picking up the configmap informer's event when the managed-clusters.yaml ConfigMap is updated.
I see that we are filtering based on a label: https://github.com/openshift/console-operator/blob/master/pkg/console/operator/operator.go#L152
I wonder if that filtering is causing the race.
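
To make the suspicion concrete, here is roughly how a label-filtered ConfigMap informer behaves (a minimal client-go sketch; the namespace and selector values are assumptions, the real filter is at the operator.go link above). Any ConfigMap that does not match the selector at watch time never reaches the handler:

package main

import (
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (setup is illustrative).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// A label-filtered informer only lists/watches objects matching the
	// selector, so an update to a ConfigMap outside the selector is silently
	// dropped. The selector value below is hypothetical.
	factory := informers.NewSharedInformerFactoryWithOptions(
		client, 10*time.Minute,
		informers.WithNamespace("openshift-console"),
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.LabelSelector = "console.openshift.io/managed-cluster-config=true"
		}),
	)

	factory.Core().V1().ConfigMaps().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			// Fires only for ConfigMaps matching the selector.
			fmt.Println("managed cluster ConfigMap updated; trigger a sync here")
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // run until the process is killed
}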

Comment 2 Kevin Cormier 2022-03-25 21:34:43 UTC
I encountered a problem today where a managed cluster was not added to the cluster switcher. @jonjacks believes it was an instance of this issue, and he was able to resolve it by forcing a rollout of new console pods.

Comment 3 Jon Jackson 2022-11-22 18:28:24 UTC
Migrated to Jira: https://issues.redhat.com/browse/OCPBUGS-4008

