Description of problem:

While attempting to upgrade cluster from 4.6.17 to 4.10.9 through the upgrade path detailed below, the final upgrade from 4.9.28 to 4.10.9 was blocked with message:

Cluster minor level upgrades are not allowed while resource deletions are in progress; resources=PrometheusRule "openshift-authentication-operator/authentication-operator",rolebinding "openshift-machine-api/machine-api-termination-handler",PrometheusRule "openshift-kube-apiserver/kube-apiserver",role "openshift-machine-api/machine-api-termination-handler"

The issue was eventually resolved by resetting the cluster upgrade using the command:

$ oc adm upgrade --clear

The upgrade from 4.9.28 to 4.10.9 was later reattempted and completed successfully.

Upgrade path taken:
4.6.17 -> 4.6.41 -> 4.7.43 -> 4.8.36 -> 4.9.28 -> 4.10.9

Version-Release number of the following components:
OCP 4.9.28
VSphere

How reproducible:
N/A

Steps to Reproduce:
Customer upgraded cluster as per the upgrade path described above

Actual results:
See description above

Expected results:
Cluster upgrade to progress normally

Additional info:
Must-gathers from before and after upgrade available from the case
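
The blocking message packs every pending deletion into a single comma-separated `resources=` list, which is hard to read at a glance. The following is a minimal sketch, not part of any OpenShift tooling, that splits such a message into (kind, namespace, name) entries. The function name `parse_pending_deletions` is hypothetical, and the parsing assumes the message keeps the `kind "namespace/name"` shape seen in this report.

```python
import re
from typing import List, Tuple


def parse_pending_deletions(message: str) -> List[Tuple[str, str, str]]:
    """Split an Upgradeable=False message into (kind, namespace, name) entries.

    Assumes the format seen in this report:
        ...; resources=Kind "namespace/name",Kind "namespace/name"
    """
    # Everything after "resources=" is the list; no marker means nothing to parse.
    _, marker, tail = message.partition("resources=")
    if not marker:
        return []

    entries = []
    for kind, ref in re.findall(r'(\w+) "([^"]+)"', tail):
        # Cluster-scoped resources would have no namespace part; keep it empty.
        namespace, _, name = ref.rpartition("/")
        entries.append((kind, namespace, name))
    return entries
```

Applied to the message above, this yields four entries: two PrometheusRule objects plus the role and rolebinding named machine-api-termination-handler.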
(In reply to Paul Webster from comment #0)
> Description of problem:
> While attempting to upgrade cluster from 4.6.17 to 4.10.9 through the
> upgrade path detailed below, the final upgrade from 4.9.28 to 4.10.9 was
> blocked with message:
>
> Cluster minor level upgrades are not allowed while resource deletions are in
> progress; resources=PrometheusRule
> "openshift-authentication-operator/authentication-operator",rolebinding
> "openshift-machine-api/machine-api-termination-handler",PrometheusRule
> "openshift-kube-apiserver/kube-apiserver",role
> "openshift-machine-api/machine-api-termination-handler"
>
> The issue was eventually resolved by resetting the cluster upgrade using the
> command:
>
> $ oc adm upgrade --clear
>

Any idea how long they waited on the first upgrade request before "clear"ing it?
This is a known issue; fixing it will require a backport of https://bugzilla.redhat.com/show_bug.cgi?id=1822752.
Reproduced on path 4.8.36 -> 4.9.28 -> 4.10.9

1. Trigger an upgrade from 4.8.36 to 4.9.28.

2. Monitor that upgrade. Once it finishes, immediately trigger a new upgrade to 4.10 (without --force) while the Upgradeable=False condition is still present. (This is a very short period before it runs into the ResourceDeletesInProgress status; if the upgrade is not triggered in this period, the issue does not occur.)

3. After triggering the upgrade without --force while Upgradeable=False, no upgrade happens, as expected, and the `it may not be safe to apply this update` error is reported because of Upgradeable=False.

# ./oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.28    True        True          23m     Unable to apply 4.10.23: it may not be safe to apply this update

4. Do nothing and wait for ResourceDeletes (>30min). The ResourceDeletes do not complete and the cluster stays stuck in the above status (unexpected).

# ./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.10.9: it may not be safe to apply this update

Upgradeable=False

  Reason: ResourceDeletesInProgress
  Message: Cluster minor level upgrades are not allowed while resource deletions are in progress; resources=PrometheusRule "openshift-kube-apiserver/kube-apiserver"

5. Run `oc adm upgrade --clear` to cancel the update to 4.10.9 that was blocked by Upgradeable=False, then re-trigger the update. The upgrade starts successfully.
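
The reproduction hinges on requesting the minor update while Upgradeable=False is still set. As a rough illustration of a pre-check that avoids that window, the sketch below inspects a ClusterVersion object (for example, the parsed output of `oc get clusterversion version -o json`) and reports any Upgradeable=False condition. The function name `upgradeable_blocker` is hypothetical, and treating a missing Upgradeable condition as "no blocker" is an assumption of this sketch rather than documented behavior.

```python
from typing import Optional, Tuple


def upgradeable_blocker(cluster_version: dict) -> Optional[Tuple[str, str]]:
    """Return (reason, message) if the ClusterVersion reports Upgradeable=False.

    Returns None when no such condition is found. A missing Upgradeable
    condition is treated as "not blocked", which is an assumption here.
    """
    conditions = cluster_version.get("status", {}).get("conditions", [])
    for cond in conditions:
        # Condition status values are the strings "True", "False" or "Unknown".
        if cond.get("type") == "Upgradeable" and cond.get("status") == "False":
            return cond.get("reason", ""), cond.get("message", "")
    return None
```

A caller could load the JSON with `json.load`, call `upgradeable_blocker`, and hold off on `oc adm upgrade` until it returns None. If an update has already been requested and is stuck, `oc adm upgrade --clear` followed by a new request was the workaround used in this report.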
Verified on 4.8.46 -> 4.9.0-0.nightly-2022-07-21-221241 -> 4.10.24

At the beginning, the error is still reported because the upgrade was triggered while Upgradeable=False.

# ./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.10.24: it may not be safe to apply this update

Upgradeable=False

  Reason: ResourceDeletesInProgress
  Message: Cluster minor level upgrades are not allowed while resource deletions are in progress; resources=PrometheusRule "openshift-kube-apiserver/kube-apiserver"

Do nothing and wait for ResourceDeletes. After several minutes, the ResourceDeletes complete, and the upgrade starts and eventually succeeds.

# ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2022-07-21-221241   True        True          4m57s   Working towards 4.10.24: 95 of 773 done (12% complete)
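
The verification above amounts to waiting until ResourceDeletesInProgress clears on its own. A minimal polling sketch of that wait is shown below. Everything in it is illustrative: `wait_for_deletes` and the `get_reason` callback are hypothetical names, the caller supplies the callback (for instance, one that returns the Reason of the Upgradeable=False condition, or None when there is none), and the 30-minute default timeout simply mirrors the ">30min" wait from the reproduction.

```python
import time
from typing import Callable, Optional


def wait_for_deletes(
    get_reason: Callable[[], Optional[str]],
    timeout_s: float = 1800.0,   # assumption: ~30 min, as in the reproduction
    interval_s: float = 30.0,    # assumption: arbitrary polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until get_reason() no longer returns "ResourceDeletesInProgress".

    Returns True if the reason cleared before the timeout, False if it was
    still present once the deadline passed.
    """
    deadline = clock() + timeout_s
    while True:
        if get_reason() != "ResourceDeletesInProgress":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

On an unfixed cluster this would be expected to return False (the stuck case from the reproduction), while on a fixed build it should return True after a few polls.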
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.9.45 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5879