Created attachment 1875388 [details]
ManagedCluster, ManagedClusterInfo, ClusterCurator

Description of the problem:
I don't see a cluster update option for the local-cluster, and when I apply the update manually from the OpenShift console, the status gets stuck in the ACM GUI.

Release version: 2.4.3
Operator snapshot version:
OCP version: 4.10.11
Browser Info: FF latest on F35

Steps to reproduce:
1. Go to ACM clusters
2. Monitor the status of the local cluster
3.

Actual results:
While an update was available in the cluster console, ACM doesn't show it. ACM shows an available update for the remote cluster, which was also at the same version, 4.10.9.

Expected results:
There should be an update button. And when monitoring the manually triggered update, the state should say up to date instead of "progressing 84%".

Additional info:
Chat here internally: https://chat.google.com/room/AAAAWskU424/njKNFzKsDko with screenshots.
Also see stuff.yml for the requested resources.
Logs from cluster-curator-controller:

bin/sh: ./cluster-curator-controller: No such file or directory
I0426 21:36:32.633030 1 request.go:665] Waited for 1.600608866s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operators.coreos.com/v1alpha2?timeout=32s
2022-04-26T21:36:35.436Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
2022-04-26T21:36:36.232Z INFO setup starting manager
2022-04-26T21:36:36.531Z INFO starting metrics server {"path": "/metrics"}
I0426 21:36:36.532049 1 leaderelection.go:248] attempting to acquire leader lease open-cluster-management/d362c584.cluster.open-cluster-management.io...
I0426 23:01:06.832334 1 leaderelection.go:258] successfully acquired lease open-cluster-management/d362c584.cluster.open-cluster-management.io
2022-04-26T23:01:06.832Z DEBUG events Normal {"object": {"kind":"ConfigMap","namespace":"open-cluster-management","name":"d362c584.cluster.open-cluster-management.io","uid":"aa8d8ae1-5f68-4506-ba94-1c823584dee4","apiVersion":"v1","resourceVersion":"45350134"}, "reason": "LeaderElection", "message": "cluster-curator-controller-76bc4968b5-zbsfn_d2f9927a-e079-4a75-acfb-240d7cda7ac8 became leader"}
2022-04-26T23:01:06.832Z DEBUG events Normal {"object": {"kind":"Lease","namespace":"open-cluster-management","name":"d362c584.cluster.open-cluster-management.io","uid":"42d06700-a958-4c44-8e39-709217f798d2","apiVersion":"coordination.k8s.io/v1","resourceVersion":"45350139"}, "reason": "LeaderElection", "message": "cluster-curator-controller-76bc4968b5-zbsfn_d2f9927a-e079-4a75-acfb-240d7cda7ac8 became leader"}
2022-04-26T23:01:07.133Z INFO controller.clustercurator Starting EventSource {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ClusterCurator", "source": "kind source: /, Kind="}
2022-04-26T23:01:07.133Z INFO controller.clustercurator Starting Controller {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ClusterCurator"}
2022-04-26T23:01:07.632Z INFO controller.clustercurator Starting workers {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ClusterCurator", "worker count": 1}
B2Gsync
Seems similar to https://bugzilla.redhat.com/show_bug.cgi?id=2005759
It looks like we did not deliver a fix for 2.4.z, but something was improved for 2.5.
When a managed cluster is upgraded, cluster-curator-controller creates a job that monitors the progress of the upgrade and updates the status of the ClusterCurator CR accordingly. If the job fails for some reason, the status of the ClusterCurator CR gets stuck in a stale state, even though the upgrade may already have completed. The ACM console reads the upgrade status from the ClusterCurator CR. If it finds that the cluster is in an upgrading state, it does not show upgrade options in the UI. That is the root cause of this issue. It has been fixed in the 2.5 release, and the fix should be backported to 2.4 as well.
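To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the ACM console or cluster-curator-controller code, and it is not the fix that shipped. The `CuratorStatus` type, its fields, both function names, and the version strings are hypothetical. The first function models the reported behavior (trusting the CR status unconditionally). The second shows one possible guard, which treats the status as stale once the cluster already reports the targeted version.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CuratorStatus:
    # Hypothetical, simplified view of a ClusterCurator CR status.
    upgrading: bool                 # CR still reports an upgrade in progress
    desired_version: Optional[str]  # version the upgrade was targeting


def show_upgrade_naive(status: CuratorStatus, available_updates: List[str]) -> bool:
    """Reported behavior: hide upgrade options whenever the CR says 'upgrading'."""
    return bool(available_updates) and not status.upgrading


def show_upgrade_guarded(
    status: CuratorStatus, current_version: str, available_updates: List[str]
) -> bool:
    """One possible guard (an assumption, not the shipped fix): ignore an
    'upgrading' status if the cluster already runs the targeted version."""
    stale = status.upgrading and status.desired_version == current_version
    in_progress = status.upgrading and not stale
    return bool(available_updates) and not in_progress


# Hypothetical scenario: the monitoring job failed, so the CR never left the
# 'upgrading' state, although the cluster already reached the target version.
stuck = CuratorStatus(upgrading=True, desired_version="4.10.11")
naive_result = show_upgrade_naive(stuck, ["4.10.12"])
guarded_result = show_upgrade_guarded(stuck, "4.10.11", ["4.10.12"])
```

In this scenario `naive_result` is `False` (the update option stays hidden, as in the report), while `guarded_result` is `True`.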
The fix has been merged. It will be available in ACM 2.4.6.
Verified on 2.4.6-DOWNSTREAM-2022-09-07-18-40-46.
Cluster update status was visible in both the UI and the cluster curator YAML.
Upgrade was completed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Critical: Red Hat Advanced Cluster Management 2.4.6 security update and bug fixes), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:6696