Bug 2098655 - gcp cluster rollback fails due to storage failure
Summary: gcp cluster rollback fails due to storage failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.z
Assignee: Roman Bednář
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On: 2057495
Blocks:
Reported: 2022-06-20 09:16 UTC by Evgeni Vakhonin
Modified: 2022-08-01 11:36 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-01 11:34:48 UTC
Target Upstream Version:
Embargoed:


Attachments
Must-gather (15.86 MB, application/gzip)
2022-06-20 09:16 UTC, Evgeni Vakhonin


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-storage-operator pull 300 | 0 | None | open | Bug 2098655: gcp cluster rollback fails due to storage failure | 2022-07-18 06:39:41 UTC
Github | openshift gcp-pd-csi-driver-operator pull 49 | 0 | None | Merged | Bug 2098655: gcp cluster rollback fails due to storage failure | 2022-06-27 07:05:05 UTC
Red Hat Product Errata | RHSA-2022:5730 | 0 | None | None | None | 2022-08-01 11:36:13 UTC

Description Evgeni Vakhonin 2022-06-20 09:16:58 UTC
Created attachment 1891261
Must-gather

Upgrading a cluster from 4.10.18 to 4.11.0-0.nightly-2022-06-15-222801 and then rolling back to 4.10.18 left the rollback stuck at:
> info: An upgrade is in progress. Working towards 4.10.18: 611 of 771 done (79% complete), waiting on openshift-controller-manager
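
The status line above has a regular shape, so the stuck point can be compared across runs programmatically. The following is a minimal sketch, assuming the message keeps the "Working towards X: N of M done (P% complete), waiting on ..." wording shown here; `parse_progress` and the regular expression are illustrative names, not part of any OpenShift tooling.

```python
import re

# Matches the progress fragment of the status message quoted in this report.
# The wording is taken from the report; other releases may phrase it differently.
PROGRESS_RE = re.compile(
    r"Working towards (?P<target>\S+): "
    r"(?P<done>\d+) of (?P<total>\d+) done \((?P<pct>\d+)% complete\)"
    r"(?:, waiting on (?P<waiting>.+))?"
)


def parse_progress(status):
    """Return target, counts and waited-on operators, or None if no match."""
    m = PROGRESS_RE.search(status)
    if m is None:
        return None
    waiting = m.group("waiting")
    return {
        "target": m.group("target"),
        "done": int(m.group("done")),
        "total": int(m.group("total")),
        "pct": int(m.group("pct")),
        # The operator list is comma-separated when more than one is pending.
        "waiting": [w.strip() for w in waiting.split(",")] if waiting else [],
    }


line = ("info: An upgrade is in progress. Working towards 4.10.18: "
        "611 of 771 done (79% complete), waiting on openshift-controller-manager")
print(parse_progress(line))
```

Note that the operator named after "waiting on" is not necessarily the one that is failing; in this report the blocking error comes from the storage operator.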

Checking the cluster operators showed the storage operator reporting the following error:
> storage    4.10.18    False       True          True       97m     DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.

In the CVO log:
> I0616 22:37:31.259361       1 sync_worker.go:1001] Update error 471 of 771: ClusterOperatorNotAvailable Cluster operator storage is not available (*errors.errorString: cluster operator storage is Available=False: DefaultStorageClassController_SyncError: DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.)

In the storage operator log:
> 2022-06-16T22:27:39.303206345Z I0616 22:27:39.296310       1 event.go:285] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator", UID:"42162595-237b-4488-9802-b1823ab6cf2e", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'StorageClassUpdateFailed' Failed to update StorageClass.storage.k8s.io/standard: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.

> 2022-06-16T22:27:39.390217370Z I0616 22:27:39.388552       1 status_controller.go:211] clusteroperator/storage diff {"status":{"conditions":[{"lastTransitionTime":"2022-06-16T20:21:04Z","message":"GCPPDCSIDriverOperatorCRDegraded: All is well","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2022-06-16T22:27:39Z","message":"DefaultStorageClassControllerProgressing: StorageClass.storage.k8s.io \"standard\" is invalid: parameters: Forbidden: updates to parameters are forbidden.","reason":"DefaultStorageClassController_SyncError","status":"True","type":"Progressing"},{"lastTransitionTime":"2022-06-16T22:27:39Z","message":"DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io \"standard\" is invalid: parameters: Forbidden: updates to parameters are forbidden.","reason":"DefaultStorageClassController_SyncError","status":"False","type":"Available"},{"lastTransitionTime":"2022-06-16T20:21:05Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
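
The status diff above is plain JSON, so the unhealthy conditions can be pulled out without reading the whole line. This is a minimal sketch, assuming the usual convention that a healthy ClusterOperator reports Available=True, Progressing=False and Degraded=False; the function name and the abbreviated sample are illustrative, and the sample drops timestamps and messages from the logged diff.

```python
import json

# Assumed healthy values per condition type (convention, not taken from this report).
HEALTHY = {"Available": "True", "Progressing": "False", "Degraded": "False"}


def unhealthy_conditions(diff_json):
    """Return (type, status, reason) for each condition that is not healthy."""
    conditions = json.loads(diff_json).get("status", {}).get("conditions", [])
    by_type = {c["type"]: c for c in conditions}
    problems = []
    for ctype, want in HEALTHY.items():
        cond = by_type.get(ctype)
        if cond is not None and cond["status"] != want:
            problems.append((ctype, cond["status"], cond.get("reason", "")))
    return problems


# Abbreviated copy of the diff logged above (messages and timestamps omitted).
SAMPLE_DIFF = """
{"status": {"conditions": [
  {"type": "Degraded", "status": "False", "reason": "AsExpected"},
  {"type": "Progressing", "status": "True", "reason": "DefaultStorageClassController_SyncError"},
  {"type": "Available", "status": "False", "reason": "DefaultStorageClassController_SyncError"},
  {"type": "Upgradeable", "status": "True", "reason": "AsExpected"}
]}}
"""

for ctype, status, reason in unhealthy_conditions(SAMPLE_DIFF):
    print(f"{ctype}={status} ({reason})")
```

For the sample, the sketch flags Available=False and Progressing=True, both with reason DefaultStorageClassController_SyncError, which matches the error reported by the storage operator.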

How reproducible:
1/1

Steps to Reproduce:
Upgrade from 4.10.18 to 4.11, then roll back to 4.10.18.

Additional info:
Must-gather attached

Comment 3 Chao Yang 2022-06-27 06:58:46 UTC
oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-25-081133   True        True          49m     Unable to apply 4.10.0-0.nightly-2022-06-08-150219: an unknown error has occurred: MultipleErrors

storage                                    4.10.0-0.nightly-2022-06-08-150219   False       True          True       15m     DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.

I0627 06:38:47.826705       1 event.go:285] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator", UID:"ee3edf35-999a-442a-81cf-9db73bff7909", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/storage changed: Progressing message changed from "DefaultStorageClassControllerProgressing: StorageClass.storage.k8s.io \"standard\" is invalid: parameters: Forbidden: updates to parameters are forbidden.\nGCPPDCSIDriverOperatorCRProgressing: GCPPDDriverNodeServiceControllerProgressing: Waiting for DaemonSet to deploy node pods" to "DefaultStorageClassControllerProgressing: StorageClass.storage.k8s.io \"standard\" is invalid: parameters: Forbidden: updates to parameters are forbidden."
I0627 06:43:42.541768       1 controller.go:174] Existing StorageClass standard found, reconciling
E0627 06:43:42.555146       1 base_controller.go:272] DefaultStorageClassController reconciliation failed: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.
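
The last two log lines show the controller finding the existing StorageClass and then failing to reconcile it. The Kubernetes API rejects any update that changes a StorageClass's `parameters`, so a reconcile that tries to rewrite them fails on every retry. The toy model below only illustrates that failure mode. It is not the operator's code and does not describe the merged fix; the class names are made up, and the parameter keys and values are hypothetical, since the report does not say which parameters differ between 4.10 and 4.11.

```python
class Forbidden(Exception):
    """Raised by the toy API when an immutable field would change."""


class ToyStorageClassAPI:
    """In-memory stand-in for the API server; NOT a real Kubernetes client."""

    def __init__(self):
        self._objects = {}

    def create(self, name, parameters):
        self._objects[name] = dict(parameters)

    def update(self, name, parameters):
        # Mimics the validation rule quoted in the logs: parameters are immutable.
        if self._objects[name] != parameters:
            raise Forbidden(
                f'StorageClass.storage.k8s.io "{name}" is invalid: '
                "parameters: Forbidden: updates to parameters are forbidden."
            )


def reconcile(api, name, desired):
    """Try to bring the stored parameters to the desired ones via update."""
    try:
        api.update(name, desired)
        return "in sync"
    except Forbidden as err:
        return f"reconciliation failed: {err}"


# Hypothetical parameter sets: the newer release created the object,
# the older release now wants a different set.
newer = {"type": "pd-standard", "example-key": "example-value"}
older = {"type": "pd-standard"}

api = ToyStorageClassAPI()
api.create("standard", newer)
print(reconcile(api, "standard", older))
```

Because the stored object never changes, retrying the same update cannot succeed, which is consistent with the operator staying at Available=False for the whole rollback.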

Comment 4 Roman Bednář 2022-06-27 12:26:30 UTC
The patch is not included in the last accepted nightly build, please wait for a more recent accepted one and try again.

Comment 6 Chao Yang 2022-07-01 09:19:33 UTC
oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-28-160049   True        True          43m     Working towards 4.10.0-0.nightly-2022-06-28-220435: 611 of 771 done (79% complete), waiting on openshift-controller-manager

oc get co/storage
NAME      VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
storage   4.10.0-0.nightly-2022-06-28-220435   False       True          True       18m     DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.

@rbednar I have marked this bug as failed verification. Please correct me if anything is wrong.

Comment 9 Chao Yang 2022-07-25 06:04:56 UTC
oc get co/storage
NAME      VERSION                         AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
storage   4.10.0-0.ci-2022-07-24-093552   True        False         False      4h38m   

oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-07-24-151159   True        True          165m    Working towards 4.10.0-0.ci-2022-07-24-093552: 615 of 774 done (79% complete), waiting on openshift-controller-manager

co/storage can roll back now.
The co/openshift-controller-manager issue is tracked at https://bugzilla.redhat.com/show_bug.cgi?id=2090274

Comment 12 errata-xmlrpc 2022-08-01 11:34:48 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.25 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5730

