Bug 2038954 - platform upgrade policy cannot determine whether a cluster is compliant or not
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Telco Edge
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.10.0
Assignee: Nishant Parekh
QA Contact: yliu1
URL:
Whiteboard:
Depends On: 2059359
Blocks:
Reported: 2022-01-10 16:03 UTC by yliu1
Modified: 2022-03-21 12:40 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2059359
Environment:
Last Closed: 2022-03-21 12:40:05 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift-kni cnf-features-deploy pull 992 | 0 | None | Merged | Bug 2038954: ztp: PGT supports status updates or overlays | 2022-03-06 18:00:05 UTC
Github | openshift-kni cnf-features-deploy pull 996 | 0 | None | open | Bug 2038954: ztp: ClusterVersion changes for platform upgrade | 2022-03-06 17:57:34 UTC
Red Hat Product Errata | RHBA-2022:0928 | 0 | None | None | None | 2022-03-21 12:40:22 UTC

Description yliu1 2022-01-10 16:03:51 UTC
Description of problem:
Currently we rely on a ClusterVersion policy to determine whether a spoke cluster is upgraded (compliant) or not. But the ClusterVersion CR itself does not have a field for the existing cluster version (other than searching for the latest Completed version/image in the history field), so we are setting the "desiredUpdate" spec to determine compliance, and this doesn't work in various scenarios:
1. After a fresh deployment, desiredUpdate is always null, meaning the policy will always be NonCompliant even if we set desiredUpdate to the existing cluster version.
2. The policy compliance only indicates whether the CR is patched or not. The CR could be patched but the upgrade could still fail for many reasons (e.g., nonexistent version number, unsigned image, etc.), and in this case, even though the cluster is NOT upgraded to the desired version, the policy will be Compliant.
3. Even in an ideal scenario, where the policy is applied successfully, the policy becomes Compliant right after the ClusterVersion CR is updated, meaning the cluster upgrade has just started (as opposed to completed).
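
To illustrate the alternative mentioned above (checking the history field rather than desiredUpdate), here is a minimal Python sketch. It assumes the ClusterVersion CR has already been loaded as a plain dict, that status.history is ordered newest first, and that each entry carries "state" and "version" keys. The function name is_upgraded is hypothetical and is not part of the ZTP tooling or of the eventual fix.

```python
def is_upgraded(cluster_version: dict, desired: str) -> bool:
    """Return True only if the newest history entry is Completed at `desired`.

    Assumption: status.history is ordered newest first, so history[0]
    describes the most recent update attempt.
    """
    history = cluster_version.get("status", {}).get("history", [])
    if not history:
        return False
    latest = history[0]
    return latest.get("state") == "Completed" and latest.get("version") == desired


# Scenario 1: fresh deployment, desiredUpdate is null but 4.9.12 is installed.
fresh = {
    "spec": {"desiredUpdate": None},
    "status": {"history": [{"state": "Completed", "version": "4.9.12"}]},
}

# Scenario 3: CR was patched, upgrade has started but not finished.
in_progress = {
    "spec": {"desiredUpdate": {"version": "4.10.0"}},
    "status": {
        "history": [
            {"state": "Partial", "version": "4.10.0"},
            {"state": "Completed", "version": "4.9.12"},
        ]
    },
}

print(is_upgraded(fresh, "4.9.12"))
print(is_upgraded(in_progress, "4.10.0"))
```

Unlike a desiredUpdate comparison, this check reports the fresh cluster as already at 4.9.12 and does not report the in-progress cluster as upgraded to 4.10.0.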

Version-Release number of selected component (if applicable):
4.10

How reproducible:
100%

Steps to Reproduce:
1. Try to start a platform upgrade using this policy (https://gitlab.cee.redhat.com/ran/lab-ztp/-/commit/aa0d7be5d2f3e7059391a66bee09712dd4c07b7d) from the following source-cr: https://github.com/openshift-kni/cnf-features-deploy/blob/master/ztp/source-crs/ClusterVersion.yaml

Actual results:
- candidate clusters and upgrade status/results cannot be properly determined

Expected results:
- this policy should correctly indicate whether the cluster's existing version matches the desired version

 
Additional info:

Comment 1 yliu1 2022-01-10 16:57:15 UTC
Maybe use the Available condition to determine the current version:
  conditions:
  - lastTransitionTime: "2022-01-07T23:25:43Z"
    message: Done applying 4.9.12
    status: "True"
    type: Available
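
As a rough sketch of this suggestion, the Python below pulls the version out of the Available condition. It assumes the message always has the form "Done applying <version>", as in the sample above; that format is inferred from this one example, not from a documented contract. The helper name version_from_available is hypothetical.

```python
import re
from typing import Optional

# Assumed message format, taken from the single sample in this comment.
_DONE_APPLYING = re.compile(r"^Done applying (\S+)$")


def version_from_available(conditions: list) -> Optional[str]:
    """Return the version named by a True Available condition, else None."""
    for cond in conditions:
        if cond.get("type") == "Available" and cond.get("status") == "True":
            match = _DONE_APPLYING.match(cond.get("message", ""))
            if match:
                return match.group(1)
    return None


conditions = [
    {
        "lastTransitionTime": "2022-01-07T23:25:43Z",
        "message": "Done applying 4.9.12",
        "status": "True",
        "type": "Available",
    }
]
print(version_from_available(conditions))
```

Parsing a human-readable message is fragile, so this is best treated as a cross-check rather than the primary compliance signal.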

Comment 7 errata-xmlrpc 2022-03-21 12:40:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.5 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0928

