Bug 2077050
| Summary: | OCP should default to pd-ssd disk type on GCP | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Hemant Kumar <hekumar> |
| Component: | Storage | Assignee: | Roman Bednář <rbednar> |
| Storage sub component: | Operators | QA Contact: | Chao Yang <chaoyang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | aos-bugs, jsafrane, stbenjam |
| Version: | 4.11 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.11.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-08-10 11:07:44 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
Hemant Kumar
2022-04-20 14:44:12 UTC
This applies both to in-tree (in CSO) and CSI (in gcp-pd-csi-driver-operator).

Bump of library-go with https://github.com/openshift/library-go/pull/1348 may be needed.

Please test upgrade from 4.10, just to make sure the operators have permissions to re-create the storage classes.

We discussed this on a meeting:
- in-tree SC named "standard" should in fact provide pd-ssd volumes (and we should re-create it during upgrade 4.10->4.11)
- CSI driver operator should create "standard-csi" SC as it is + in addition it should create "ssd-csi". ssd-csi will be the future default storage class for new installs when CSI migration is GA.

oc get sc
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
ssd-csi              pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   4h54m
standard (default)   kubernetes.io/gce-pd    Delete          WaitForFirstConsumer   true                   4h55m
standard-csi         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   4h54m

oc get sc/ssd-csi -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2022-05-12T01:09:06Z"
  name: ssd-csi
  resourceVersion: "6129"
  uid: c35824ad-dcf8-486e-8967-e79d6a449cbf
parameters:
  replication-type: none
  type: pd-ssd
provisioner: pd.csi.storage.gke.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Looks like this broke upgrades. GCP upgrade jobs are failing since this was merged with:
```
{Failed to upgrade storage, operator was not available (DefaultStorageClassController_SyncError): DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden. Failed to upgrade storage, operator was not available (DefaultStorageClassController_SyncError): DefaultStorageClassControllerAvailable: StorageClass.storage.k8s.io "standard" is invalid: parameters: Forbidden: updates to parameters are forbidden.}
```
Example run https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-upgrade/1525003893585481728
Moving this back to assigned. Please merge the revert and then build any fix on unreverting it: https://github.com/openshift/cluster-storage-operator/pull/283
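
The failure above comes from the API server rejecting an in-place change to the `parameters` of an existing StorageClass, which is why the original plan talks about re-creating the class. The following is only a minimal sketch of that decision, not the code in cluster-storage-operator: the function name `plan_storage_class_sync`, the action names, and the exact list of immutable fields are assumptions made for illustration (only `parameters` is confirmed by the error message).

```python
# Fields assumed to be immutable on an existing StorageClass.
# Only "parameters" is confirmed by the error in this bug; the rest are assumptions.
IMMUTABLE_FIELDS = ("provisioner", "parameters", "reclaimPolicy", "volumeBindingMode")


def plan_storage_class_sync(existing, desired):
    """Return the action needed to bring `existing` in line with `desired`.

    Both arguments are StorageClass manifests as plain dicts; `existing`
    is None when the object is not present in the cluster.
    Metadata (annotations, resourceVersion, ...) is deliberately ignored here.
    """
    if existing is None:
        return "create"
    # A difference in an immutable field cannot be applied with an update,
    # so the object has to be deleted and created again.
    if any(existing.get(f) != desired.get(f) for f in IMMUTABLE_FIELDS):
        return "recreate"
    # Any other top-level difference (for example allowVolumeExpansion)
    # can be applied in place.
    mutable_keys = (set(existing) | set(desired)) - set(IMMUTABLE_FIELDS) - {"metadata"}
    if any(existing.get(k) != desired.get(k) for k in mutable_keys):
        return "update"
    return "noop"


old_sc = {
    "provisioner": "kubernetes.io/gce-pd",
    "parameters": {"replication-type": "none", "type": "pd-standard"},
    "reclaimPolicy": "Delete",
    "volumeBindingMode": "WaitForFirstConsumer",
    "allowVolumeExpansion": True,
}
new_sc = dict(old_sc, parameters={"replication-type": "none", "type": "pd-ssd"})

print(plan_storage_class_sync(old_sc, new_sc))  # recreate
```

A real operator would also have to preserve the default-class annotation across the delete and create, which this sketch leaves out.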
Upgrade from 4.10 to 4.11.0-0.nightly-2022-05-14-193620

oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-05-13-201737   True        True          52m     Working towards 4.11.0-0.nightly-2022-05-14-193620: 677 of 801 done (84% complete)

oc get sc
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
ssd-csi              pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   40m
standard (default)   kubernetes.io/gce-pd    Delete          WaitForFirstConsumer   true                   121m
standard-csi         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   121m

But after upgrading to 4.11, the in-tree sc/standard still provides volumes of type pd-standard:
oc get sc/standard -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2022-05-16T00:58:57Z"
  name: standard
  resourceVersion: "4499"
  uid: 54d3127b-7607-439f-8ee9-ba8704c1ec37
parameters:
  replication-type: none
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Will the default in-tree storage class provide pd-ssd volumes after an upgrade from 4.10 to 4.11?

Yes, CSO will change the existing in-tree SC to the pd-ssd type when upgrading to 4.11. But we need to merge the corrected patch first: https://github.com/openshift/cluster-storage-operator/pull/284

oc get sc/standard -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2022-06-02T02:22:05Z"
  name: standard
  resourceVersion: "50672"
  uid: 936c2258-7d4c-4ed9-a8b6-34f2ede7d642
parameters:
  replication-type: none
  type: pd-ssd
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release:4.11.0-0.nightly-2022-06-01-200905 --force=true --allow-explicit-upgrade=true
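
The manual check used in this verification (read `parameters.type` on the default storage class after the upgrade) could be scripted roughly as follows. This is a sketch under stated assumptions: it expects the object as JSON, for example from `oc get sc/standard -o json`, and the helper names `disk_type` and `is_default` are hypothetical. The sample data mirrors the post-upgrade output shown above.

```python
import json

DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"


def disk_type(storage_class_json):
    """Return parameters.type from a StorageClass given as a JSON string."""
    sc = json.loads(storage_class_json)
    return (sc.get("parameters") or {}).get("type")


def is_default(storage_class_json):
    """Return True if the StorageClass carries the default-class annotation."""
    sc = json.loads(storage_class_json)
    annotations = (sc.get("metadata") or {}).get("annotations") or {}
    return annotations.get(DEFAULT_ANNOTATION) == "true"


# Sample object mirroring the post-upgrade "standard" class from this report.
sample = json.dumps({
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {
        "name": "standard",
        "annotations": {DEFAULT_ANNOTATION: "true"},
    },
    "parameters": {"replication-type": "none", "type": "pd-ssd"},
    "provisioner": "kubernetes.io/gce-pd",
})

print(disk_type(sample), is_default(sample))  # pd-ssd True
```

Against a live cluster, the same functions could be fed the captured output of `oc get sc/standard -o json` instead of the sample string.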
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069 |