Bug 1970150

Summary: master pool is still upgrading when machine config reports level / restarts on osimageurl change
Product: OpenShift Container Platform
Reporter: Kirsten Garrison <kgarriso>
Component: Machine Config Operator
Assignee: Kirsten Garrison <kgarriso>
Status: CLOSED ERRATA
QA Contact: Michael Nguyen <mnguyen>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 4.8
CC: ccoleman, rioliu
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 23:12:27 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1970154    
Attachments:
Description             Flags
upgrade progression 1   none
upgrade progression 2   none
upgrade progression 3   none
upgrade progression 4   none
upgrade progression 5   none

Description Kirsten Garrison 2021-06-09 22:31:57 UTC
When there is no new MCO commit but the osImageURL changes, the master pool is still upgrading when the MCO reports level to the CVO.
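
The expected ordering can be sketched as a small gate. This is only an illustration, not the actual MCO change: it assumes a MachineConfigPool-shaped dict such as the output of `oc get mcp master -o json`, with `status.conditions` entries of type `Updated` and `Updating`. The function names `condition_is_true` and `may_report_new_version` are hypothetical.

```python
def condition_is_true(pool: dict, cond_type: str) -> bool:
    """Return True if the pool has a condition of cond_type with status "True"."""
    for cond in pool.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond.get("status") == "True"
    return False


def may_report_new_version(master_pool: dict) -> bool:
    """Hypothetical gate: report level only once the master pool has finished."""
    return (
        condition_is_true(master_pool, "Updated")
        and not condition_is_true(master_pool, "Updating")
    )


# Illustrative pool states (not taken from a real cluster).
still_updating = {"status": {"conditions": [
    {"type": "Updated", "status": "False"},
    {"type": "Updating", "status": "True"},
    {"type": "Degraded", "status": "False"},
]}}
finished = {"status": {"conditions": [
    {"type": "Updated", "status": "True"},
    {"type": "Updating", "status": "False"},
    {"type": "Degraded", "status": "False"},
]}}

print(may_report_new_version(still_updating))
print(may_report_new_version(finished))
```

With a gate like this, the state described in this bug (master pool still updating) would hold back the version report.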

This is a copy of Bug #1955929. The fix there seems to address the issue, but some runs are still failing because of https://bugzilla.redhat.com/show_bug.cgi?id=1968754. This BZ carries the fix, which drastically reduced failures; the other BZ stays open for an audit once the new metal-ipi bug is fixed.


This bug was initially created as a copy of Bug #1955929

May  1 01:39:28.369: INFO: cluster upgrade is Progressing: Working towards 4.8.0-0.nightly-2021-05-01-000412: 652 of 675 done (96% complete)
May  1 01:39:38.369: INFO: Completed upgrade to registry.build01.ci.openshift.org/ci-op-ns22yv9h/release@sha256:1aeba3cfeb93d5912390fbffafaa3d024ae8db26489b01b2fa034d421f69b5db
May  1 01:39:38.460: INFO: Waiting on pools to be upgraded
May  1 01:39:38.632: INFO: Pool master is still reporting (Updated: false, Updating: true, Degraded: false)
May  1 01:39:38.632: INFO: Invariant violation detected: the "master" pool should be updated before the CVO reports available at the new version
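
For triaging similar job logs, the "still reporting" line above can be parsed mechanically. The following is a rough sketch that assumes the log line keeps exactly the format shown in this excerpt; the helper names `parse_pool_status` and `violates_invariant` are hypothetical and not part of the test suite.

```python
import re

# Matches the "Pool <name> is still reporting (...)" line from the excerpt.
POOL_LINE = re.compile(
    r"Pool (?P<name>\S+) is still reporting "
    r"\(Updated: (?P<updated>true|false), "
    r"Updating: (?P<updating>true|false), "
    r"Degraded: (?P<degraded>true|false)\)"
)


def parse_pool_status(line: str):
    """Return the pool name and condition flags, or None if the line does not match."""
    m = POOL_LINE.search(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "updated": m.group("updated") == "true",
        "updating": m.group("updating") == "true",
        "degraded": m.group("degraded") == "true",
    }


def violates_invariant(status: dict) -> bool:
    """True if the pool is not fully updated at the time of the check.

    The check in the log runs after the CVO reported completion, so a pool
    that is not updated (or is still updating) breaks the invariant.
    """
    return (not status["updated"]) or status["updating"]


line = ("May  1 01:39:38.632: INFO: Pool master is still reporting "
        "(Updated: false, Updating: true, Degraded: false)")
status = parse_pool_status(line)
print(status, violates_invariant(status))
```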

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-upgrade/1388283995501891584

Urgent because it has happened in 38% of the last 16 nightly upgrade jobs.

https://search.ci.openshift.org/?search=Pool+master+is+still+reporting&maxAge=48h&context=1&type=build-log&name=upgrade&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Comment 1 Michael Nguyen 2021-06-10 13:47:20 UTC
Created attachment 1789861 [details]
upgrade progression 1

Comment 2 Michael Nguyen 2021-06-10 13:47:55 UTC
Created attachment 1789863 [details]
upgrade progression 2

Comment 3 Michael Nguyen 2021-06-10 13:48:47 UTC
Created attachment 1789864 [details]
upgrade progression 3

Comment 4 Michael Nguyen 2021-06-10 13:49:11 UTC
Created attachment 1789865 [details]
upgrade progression 4

Comment 5 Michael Nguyen 2021-06-10 13:50:23 UTC
Created attachment 1789866 [details]
upgrade progression 5

Comment 6 Michael Nguyen 2021-06-10 13:52:28 UTC
Verified on registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-06-10-014052. Upgraded to registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-06-10-045932, which has no new MCO commit and a new osImageURL. Watched `oc get co/machine-config`, `oc get clusterversion`, and `oc get mcp`. Verified that `co/machine-config` did not transition to the new version until the master pool completed updating. See attachments.
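
The manual check above could also be expressed over a series of sampled observations. This is a minimal sketch under stated assumptions: each sample pairs the version reported by `co/machine-config` with whether the master pool reports Updated. The `Observation` type, the function name, and the sample timelines are hypothetical illustrations, not data from the attachments.

```python
from typing import List, NamedTuple, Optional


class Observation(NamedTuple):
    operator_version: str  # version reported by co/machine-config
    master_updated: bool   # whether the master pool reports Updated


def first_early_transition(
    observations: List[Observation], target_version: str
) -> Optional[int]:
    """Return the index of the first sample where the operator already reports
    the target version while the master pool is not updated, else None."""
    for i, obs in enumerate(observations):
        if obs.operator_version == target_version and not obs.master_updated:
            return i
    return None


OLD = "4.8.0-0.nightly-2021-06-10-014052"
NEW = "4.8.0-0.nightly-2021-06-10-045932"

# Illustrative timelines only.
expected = [
    Observation(OLD, True),   # before the upgrade
    Observation(OLD, False),  # master pool rolling out the new osImageURL
    Observation(NEW, True),   # operator reports level after the pool finished
]
buggy = [
    Observation(OLD, False),
    Observation(NEW, False),  # operator reports level too early
    Observation(NEW, True),
]

print(first_early_transition(expected, NEW))
print(first_early_transition(buggy, NEW))
```

A result of `None` corresponds to the verified behavior; an index points at the first sample where the operator reported the new version too early.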

Comment 8 Kirsten Garrison 2021-07-02 00:42:14 UTC
*** Bug 1955929 has been marked as a duplicate of this bug. ***

Comment 10 errata-xmlrpc 2021-07-27 23:12:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438