Bug 2044344

Summary: ZTP: ACM ManagedCluster shows upgrading forever when executing ZTP platform upgrades by using ClusterVersion source cr
Product: OpenShift Container Platform
Reporter: Alberto Losada <alosadag>
Component: Telco Edge
Assignee: Wei Liu <wliu1>
Telco Edge sub component: ZTP
QA Contact: yliu1
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
CC: imiller
Version: 4.10   
Last Closed: 2022-08-26 16:43:57 UTC
Type: Bug

Description Alberto Losada 2022-01-24 12:32:17 UTC
Description of problem:
When executing platform upgrades using the ClusterVersion source CR, we found that even after the spoke cluster has been upgraded successfully, the ACM UI keeps showing the cluster as upgrading indefinitely.

The ManagedCluster manifest still reports the previous version, even though the platform upgrade completed without problems and the spoke cluster is healthy and running the desired version.

We found that the version.openshift.io clusterClaim in the spoke cluster is not updated to the current version.

Version-Release number of selected component (if applicable):
4.10

How reproducible:
Always

Steps to Reproduce:
1. Perform a platform upgrade by using the ClusterVersion source CR (https://github.com/openshift-kni/cnf-features-deploy/blob/master/ztp/source-crs/ClusterVersion.yaml)

An example of a PolicyGenTemplate (PGT) that performs a platform upgrade can be found here: https://gitlab.cee.redhat.com/sysdeseng/5g-ericsson/-/blob/master/demos/ztp-policygen/site-policies/cluster-specific-policies/cnfdb1-day2.yaml

2. Verify that the policy is applied to the hub cluster and that the spoke cluster or clusters have started the upgrade. Be careful with https://bugzilla.redhat.com/show_bug.cgi?id=2044339 (do not apply multiple changes to the ClusterVersion).

3. Check that the upgrade finished successfully.

4. Check ACM UI -> Cluster and see that the cluster is still shown as upgrading.

5. Check that the version clusterClaim in the spoke cluster still shows the previous version:

$ oc get clusterclaim version.openshift.io -ojsonpath='{.spec}'
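
Steps 3 and 5 can be checked together from the spoke. The following is a minimal sketch, not part of the original report: the helper names claim_value and claim_is_current are hypothetical, and it assumes the claim's spec prints as a single-line JSON object such as {"value":"4.10.5"}. Reading the target version from .status.desired.version of the ClusterVersion object is also an assumption about where the expected version should come from.

```shell
#!/bin/sh
# Hypothetical helpers (not from the bug report).

# Print the "value" field of a single-line JSON object such as the
# output of: oc get clusterclaim version.openshift.io -ojsonpath='{.spec}'
claim_value() {
  printf '%s\n' "$1" |
    sed -n 's/.*"value"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed (exit status 0) only when the claim carries the expected version.
# $1: claim spec as JSON, $2: expected OCP version
claim_is_current() {
  [ "$(claim_value "$1")" = "$2" ]
}

# Assumed usage on the spoke cluster (requires a logged-in oc):
#   claim=$(oc get clusterclaim version.openshift.io -ojsonpath='{.spec}')
#   want=$(oc get clusterversion version -ojsonpath='{.status.desired.version}')
#   claim_is_current "$claim" "$want" || echo "clusterClaim is stale: $claim"
```

With this bug present, the check is expected to fail after a successful upgrade, because the claim keeps the pre-upgrade version.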

Actual results:
The version clusterClaim still shows the previous OCP version.
The ACM UI shows the cluster as still upgrading.

Expected results:
The version clusterClaim must show the current OCP version of the managed cluster.
The ACM UI must show the current version of the managed cluster.

Additional info:
Manually setting the version.openshift.io clusterClaim to the correct version resolves the issue, including the incorrect status in the ACM UI.
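
The report does not say how the claim was changed. One possible way, shown here only as a sketch under that assumption, is a merge patch on the claim's spec.value field. The helper name claim_patch is hypothetical, and whether an agent on the spoke later rewrites the claim is not covered by this report.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): build a merge patch that
# sets spec.value of a ClusterClaim to the given version string.
# $1: OCP version the spoke is actually running, e.g. 4.10.5
claim_patch() {
  printf '{"spec":{"value":"%s"}}\n' "$1"
}

# Assumed usage on the spoke cluster (requires a logged-in oc):
#   oc patch clusterclaim version.openshift.io --type merge \
#     -p "$(claim_patch 4.10.5)"
```

This is a workaround only; the fix referenced in the comments on this bug is on the ACM side.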

Comment 6 Ian Miller 2022-03-07 13:30:05 UTC
The root cause is that the upgrade status was derived from the console operator, which is disabled for far-edge SNO. A patch to ACM fixes this in ACM version 2.5 and will be backported to 2.4.3 around the end of March.

https://github.com/stolostron/multicloud-operators-foundation/pull/444

Moving this to ON_QA based on that fix.

Comment 7 yliu1 2022-03-21 21:22:15 UTC
Verified with ACM 2.4.3.

After spoke upgrade:

# from spoke
oc get clusterclaim version.openshift.io -ojsonpath='{.spec}'
{"value":"4.10.5"}

# from the ACM GUI, the version is displayed correctly.