Bug 2038954

Summary: platform upgrade policy cannot determine whether a cluster is compliant or not
Product: OpenShift Container Platform Reporter: yliu1
Component: Telco Edge Assignee: Nishant Parekh <nparekh>
Telco Edge sub component: RAN QA Contact: yliu1
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: unspecified CC: alosadag, grajaiya, keyoung, mcornea
Version: 4.10   
Target Milestone: ---   
Target Release: 4.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2059359 Environment:
Last Closed: 2022-03-21 12:40:05 UTC Type: Bug
Bug Depends On: 2059359    
Bug Blocks:    

Description yliu1 2022-01-10 16:03:51 UTC
Description of problem:
Currently we rely on a ClusterVersion policy to determine whether a spoke cluster is upgraded (compliant) or not. But the ClusterVersion CR itself does not have a field for the existing cluster version (other than searching for the latest Completed version/image in the history field), so we are setting the "desiredUpdate" spec to determine compliance, and this doesn't work in various scenarios:
1. After a fresh deployment, desiredUpdate is always null, meaning the policy will always be NonCompliant even if we set desiredUpdate to the existing cluster version.
2. The policy compliance only indicates whether the CR is patched or not. The CR could be patched but the cluster may fail to upgrade for many reasons (e.g., non-existent version number, unsigned image, etc.), and in that case the policy will be Compliant even though the cluster is NOT upgraded to the desired version.
3. Even in an ideal scenario, where the policy is applied successfully, the policy becomes Compliant right after the ClusterVersion CR is updated, meaning the cluster upgrade has just started (as opposed to completed).
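
For illustration, the history-based check mentioned above could look roughly like the sketch below. It is not the fix that shipped for this bug. It assumes the ClusterVersion CR has already been fetched and parsed into a dict, and that status.history is ordered newest first with a "state" of "Completed" or "Partial" per entry. The helper names and the sample data are hypothetical.

```python
from typing import Optional


def current_version(cluster_version: dict) -> Optional[str]:
    """Return the version of the newest history entry whose state is Completed."""
    history = cluster_version.get("status", {}).get("history") or []
    # Assumption: history is ordered newest first, so the first Completed
    # entry is the last version that was fully applied.
    for entry in history:
        if entry.get("state") == "Completed":
            return entry.get("version")
    return None


def is_upgraded(cluster_version: dict, desired: str) -> bool:
    """True only when the desired version has finished applying."""
    return current_version(cluster_version) == desired


# Hypothetical sample: desiredUpdate is patched to 4.9.13, but the upgrade
# is still in progress, so the last Completed version is 4.9.12.
sample = {
    "spec": {"desiredUpdate": {"version": "4.9.13"}},
    "status": {
        "history": [
            {"state": "Partial", "version": "4.9.13"},
            {"state": "Completed", "version": "4.9.12"},
        ]
    },
}
```

With this sample, a check on spec.desiredUpdate alone would report Compliant (scenarios 2 and 3), while is_upgraded(sample, "4.9.13") stays False until the 4.9.13 entry reaches Completed.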

Version-Release number of selected component (if applicable):
4.10

How reproducible:
100%

Steps to Reproduce:
1. Try to start a platform upgrade using this policy (https://gitlab.cee.redhat.com/ran/lab-ztp/-/commit/aa0d7be5d2f3e7059391a66bee09712dd4c07b7d) from the following source-cr: https://github.com/openshift-kni/cnf-features-deploy/blob/master/ztp/source-crs/ClusterVersion.yaml

Actual results:
- candidate clusters and upgrade status/results cannot be properly determined

Expected results:
- this policy should correctly indicate whether the cluster's existing version matches the desired version

 
Additional info:

Comment 1 yliu1 2022-01-10 16:57:15 UTC
Maybe use the Available condition to determine the current version:
  conditions:
  - lastTransitionTime: "2022-01-07T23:25:43Z"
    message: Done applying 4.9.12
    status: "True"
    type: Available
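
A rough sketch of that idea, assuming the CR is already loaded as a dict and that the Available message always has the form "Done applying <version>" as in the snippet above. That message format is an assumption taken from this one example, not a documented contract, and the helper name is hypothetical.

```python
import re
from typing import Optional

# Assumed message shape, taken from the condition quoted in this comment.
_DONE_APPLYING = re.compile(r"^Done applying (\S+)$")


def version_from_available(cluster_version: dict) -> Optional[str]:
    """Extract the version from an Available=True condition, if present."""
    conditions = cluster_version.get("status", {}).get("conditions") or []
    for cond in conditions:
        if cond.get("type") == "Available" and cond.get("status") == "True":
            match = _DONE_APPLYING.match(cond.get("message") or "")
            return match.group(1) if match else None
    return None


# Sample built from the condition quoted in this comment.
cv = {
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2022-01-07T23:25:43Z",
                "message": "Done applying 4.9.12",
                "status": "True",
                "type": "Available",
            }
        ]
    }
}
```

Parsing a human-readable message is brittle, so this is probably only useful as a cross-check against the history field.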

Comment 7 errata-xmlrpc 2022-03-21 12:40:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.5 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0928