Bug 1927515

Summary: 4.5.15 and later cluster-version operator does not sync ClusterVersion status before exiting, leaving 'verified: false' even for verified updates
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Cluster Version Operator
Assignee: W. Trevor King <wking>
Status: CLOSED ERRATA
QA Contact: Yang Yang <yanyang>
Severity: high
Docs Contact:
Priority: high
Version: 4.5
CC: aos-bugs, jokerman, lmohanty, wking, yanyang
Target Milestone: ---
Keywords: Regression, Reopened, Upgrades
Target Release: 4.6.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The cluster-version operator was not syncing ClusterVersion during graceful shutdowns.
Consequence: During updates, the outgoing cluster-version operator was likely to exit after verifying the incoming release, but before pushing the 'verified: true' value into ClusterVersion history.
Fix: The cluster-version operator now allows some additional time to perform a final ClusterVersion status synchronization during graceful shutdowns.
Result: The ClusterVersion 'verified' values are again consistently 'true' for releases which were verified before being applied, returning to the behavior we had before 4.5.15 and 4.6.0.
Story Points: ---
Clone Of:
: 1931025
Environment:
Last Closed: 2021-03-02 04:48:10 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1916384
Bug Blocks: 1931025
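
The Doc Text above describes a final status synchronization during graceful shutdown. As a rough illustration of that pattern only (this is not the cluster-version operator's code, which is written in Go), the Python sketch below uses a hypothetical `Operator` class and a hypothetical `shutdown` helper with an assumed `final_sync_timeout` parameter. It shows a locally recorded 'verified' value reaching the pushed status because shutdown waits, up to a bound, for one last sync.

```python
import threading


class Operator:
    """Toy stand-in (hypothetical) for an operator that records status
    locally and only publishes it when sync_status() runs."""

    def __init__(self):
        self.local_status = {"verified": False}
        # What an API server would see; only updated by sync_status().
        self.pushed_status = {"verified": False}

    def verify_release(self):
        # Verification succeeded, but the result is only known locally so far.
        self.local_status["verified"] = True

    def sync_status(self):
        # Publish a copy of the local status.
        self.pushed_status = dict(self.local_status)


def shutdown(op, final_sync_timeout=5.0):
    """Graceful shutdown: attempt one last status sync, bounded by a timeout.

    Returns True if the final sync finished within the timeout.
    """
    worker = threading.Thread(target=op.sync_status, daemon=True)
    worker.start()
    worker.join(final_sync_timeout)
    return not worker.is_alive()


op = Operator()
op.verify_release()    # the incoming release is verified...
synced = shutdown(op)  # ...and shutdown arrives before any periodic sync
print(synced, op.pushed_status)
```

Without the final sync in `shutdown`, `pushed_status` would still hold 'verified: false' at exit, which is the symptom this bug reports.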

Comment 1 W. Trevor King 2021-02-11 14:54:28 UTC
This fixes a regression introduced in 4.5.15 [1].  There have been subsequent 4.5.z releases since then, so this is not a regression since the most recent 4.5.z.  Setting blocker-.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1872906#c10

Comment 3 Yang Yang 2021-02-20 05:33:45 UTC
Verified with 4.6.0-0.nightly-2021-02-19-171718.

Steps to verify it are:

1) Install a cluster with 4.6.0-0.nightly-2021-02-19-171718
2) Create a dummy cincy server with 4.6.0-0.nightly-2021-02-19-171718 and 4.7.0-rc.0
3) Upgrade the cluster
4) Check that the 'verified' state of the new history entry is true

# oc get clusterversion -oyaml
    history:
    - completionTime: "2021-02-20T05:10:53Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:497fa748b89619aba312a926a0be0ad155c4b894ca3e2824a00167421e3441b0
      startedTime: "2021-02-20T04:01:00Z"
      state: Completed
      verified: true   <----- The state is changed to True.
      version: 4.7.0-rc.0
    - completionTime: "2021-02-20T02:34:24Z"
      image: registry.ci.openshift.org/ocp/release@sha256:3a503b0ea95f8bcd529d8255fa3170de75689c93c49b53e45cb47751e9e27742
      startedTime: "2021-02-20T02:01:38Z"
      state: Completed
      verified: false
      version: 4.6.0-0.nightly-2021-02-19-171718
    observedGeneration: 3
    versionHash: Q_5mmldRekY=

Moving it to verified state.
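
The check in step 4 could also be scripted. The following is a minimal Python sketch, assuming the ClusterVersion object has been fetched as JSON (for example with `oc get clusterversion version -o json`) and loaded into a dict; the `unverified_entries` helper is hypothetical, and the sample data is trimmed from the history shown above.

```python
def unverified_entries(cluster_version):
    """Return the versions in status.history whose 'verified' flag is not true."""
    history = cluster_version.get("status", {}).get("history", [])
    return [
        entry.get("version", "<unknown>")
        for entry in history
        if not entry.get("verified", False)
    ]


# Sample trimmed from the ClusterVersion history above (newest entry first).
sample = {
    "status": {
        "history": [
            {"state": "Completed", "verified": True,
             "version": "4.7.0-rc.0"},
            {"state": "Completed", "verified": False,
             "version": "4.6.0-0.nightly-2021-02-19-171718"},
        ]
    }
}

latest = sample["status"]["history"][0]
print(latest["version"], "verified:", latest["verified"])
print("unverified:", unverified_entries(sample))
```

Only the newest entry matters for this bug. The older entry is the initial install from a nightly build and remains 'verified: false'.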

Comment 6 errata-xmlrpc 2021-03-02 04:48:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.19 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0634