Created attachment 1679456 [details]
Screenshot of creation UI showing missing storage class input entry

Description of problem (please be detailed as possible and provide log snippets):
Installed OCS 4.3 on OCP 4.4 and created a storage cluster. On AWS and GCP, if I select the nodes and hit Create, I get a working cluster using the default storage class. On Azure, doing the same uses storage class "null", and I have to go back and edit the YAML to specify the default storage class. Going back to the cluster creation UI, it looks like there is supposed to be a dialog box for Storage Class, but it is missing (see attached screenshot). Accepting the defaults results in no cluster creation and the following in the ocs-operator log:

{"level":"error","ts":"2020-04-16T18:00:46.824Z","logger":"controller-runtime.controller","msg":"Reconciler error","controller":"storagecluster-controller","request":"openshift-storage/ocs-storagecluster","error":"failed to validate StorageDeviceSet 0: no StorageClass specified","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker-fm\n\t/go/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Version of all relevant components (if applicable):
4.4

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Default storage cluster creation fails on Azure.

Is there any workaround available to the best of your knowledge?
Manually specify the storage class in the YAML after custom resource creation.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes. The scenario works OK on AWS and GCP, fails on Azure.
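For reference, the workaround above amounts to setting the storage class explicitly on the device sets in the StorageCluster CR. A rough illustration of where the field lives (field names per my reading of the OCS StorageCluster API; the device-set name, count, and size below are placeholders, not values taken from this cluster):

```yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  storageDeviceSets:
  - name: ocs-deviceset          # placeholder name
    count: 3                     # placeholder count
    dataPVCTemplate:
      spec:
        # This is the value the UI left as "null" on Azure;
        # setting it to the cluster's default SC unblocks creation.
        storageClassName: managed-premium
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 512Gi       # placeholder size
```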
Here's the YAML for the default storage class on Azure:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2020-04-16T16:38:07Z"
  name: managed-premium
  ownerReferences:
  - apiVersion: v1
    kind: clusteroperator
    name: storage
    uid: ec4bdb0f-1116-4a34-b8f4-8364cc0f2d31
  resourceVersion: "10031"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/managed-premium
  uid: 0b67762f-d797-45e7-b976-5890c4b975bf
parameters:
  kind: Managed
  storageaccounttype: Premium_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
We don't currently support OCS on Azure. We're looking into it and it should be on the roadmap soon. For now, moving this to OCS 4.5.
Pulkit, can you check this? Isn't the storage class selected as managed-premium by default for Azure?
Nishanth, is this a known issue in OCP 4.4?
This is also seen while deploying OCS on GCP.
Sahina, this is fixed in 4.5 and was never on track to be backported to 4.4. From an engineering point of view, the fix was done as part of a refactoring; backporting that PR is not possible, as it is a much bigger change and could break other parts.
Tried installing OCS 4.4 on OCP 4.4.6. Sometimes the storage cluster is created without any issues, and sometimes I hit the "no StorageClass specified" issue; it appears to be arbitrary. When "no StorageClass specified" occurs, deleting the storage cluster and creating it again resolves it.
The UI behavior is a bit unpredictable: sometimes it passes the default storage class and sometimes it passes null. The behavior is nondeterministic because we filter storage classes based on supported infrastructures (AWS, vSphere). If the default storage class is passed, the installation flow proceeds as expected. If not (the UI sets the storage class to null), the best workaround is: go to the Storage Cluster CR -> Edit YAML -> change the StorageClass name in the YAML and save it. This should fix the issue. A possible fix on the OCS Operator side would be to select the default storage class itself when null is passed.
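The operator-side fallback suggested above could look roughly like this. A minimal sketch, not the actual ocs-operator code: `StorageClass` here is a trimmed-down stand-in for the Kubernetes API type, and `resolveStorageClass` is a hypothetical helper name.

```go
package main

import "fmt"

// StorageClass is a minimal stand-in for the Kubernetes
// storagev1.StorageClass type, carrying only what this sketch needs.
type StorageClass struct {
	Name        string
	Annotations map[string]string
}

// The standard annotation marking a cluster's default storage class.
const defaultClassAnnotation = "storageclass.kubernetes.io/is-default-class"

// resolveStorageClass returns the requested class name, falling back to
// the cluster default (marked via the is-default-class annotation) when
// the UI passed an empty or "null" value.
func resolveStorageClass(requested string, classes []StorageClass) (string, error) {
	if requested != "" && requested != "null" {
		return requested, nil
	}
	for _, sc := range classes {
		if sc.Annotations[defaultClassAnnotation] == "true" {
			return sc.Name, nil
		}
	}
	return "", fmt.Errorf("no StorageClass specified and no default StorageClass found")
}

func main() {
	classes := []StorageClass{
		{Name: "managed-premium", Annotations: map[string]string{defaultClassAnnotation: "true"}},
	}
	name, err := resolveStorageClass("", classes)
	fmt.Println(name, err) // managed-premium <nil>
}
```

With this kind of fallback in the reconciler's validation path, the "no StorageClass specified" error would only occur when the cluster genuinely has no default storage class.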
Azure is tech preview in 4.4, and https://bugzilla.redhat.com/show_bug.cgi?id=1824962#c13 explains the workaround if a user hits this issue. In any case, this issue is already fixed in 4.5, and a backport is tough as it was fixed as part of a refactoring. Moving this out to 4.5 for verification.
As per comment 14, the fix is available in OCP 4.5. Azure is going to be TP with OCP 4.5, so moving this to ON_QA. Nishanth, also acking this for 4.5, unless there's any change needed in OCS.
Testing with OCS 4.5.0-526.ci on OCP 4.5.0-0.nightly-2020-08-20-113617.

When creating the StorageCluster CR via the OCP Console UI, the managed-premium storage class is selected and is actually used after the Storage Cluster installation finishes.

```
$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                            STORAGECLASS                  REASON   AGE
pvc-2c805c37-dc13-4fd2-a66b-1944500bb6de   512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-0-data-0-hrwkn   managed-premium                        4m17s
pvc-59213859-e4bf-45b4-aa14-2d3d245fff65   10Gi       RWO            Delete           Bound    openshift-storage/rook-ceph-mon-b                managed-premium                        8m
pvc-67a92dfe-455a-46bc-8a9c-693a44499a26   512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-1-data-0-25ghr   managed-premium                        4m16s
pvc-bad65e27-ce7e-477d-ba36-e6aad7ae2df3   10Gi       RWO            Delete           Bound    openshift-storage/rook-ceph-mon-c                managed-premium                        7m45s
pvc-d8159a8a-2484-4362-b45b-7ca2681bd077   10Gi       RWO            Delete           Bound    openshift-storage/rook-ceph-mon-a                managed-premium                        8m16s
pvc-f533e582-6d1d-4733-81ef-550cca895981   512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-2-data-0-hj7f4   managed-premium                        4m15s
pvc-f94ae550-ef1d-4077-8e34-728f5bc603f1   50Gi       RWO            Delete           Bound    openshift-storage/db-noobaa-db-0                 ocs-storagecluster-ceph-rbd            3m30s
```

OCS 4.5.0-526.ci full version report
====================================
cluster channel: stable-4.5
cluster version: 4.5.0-0.nightly-2020-08-20-113617
cluster image: registry.svc.ci.openshift.org/ocp/release@sha256:8fc4e9c6e998f9b1f990e064af6de9d07bc2719c545c6adccb21ee83d40df9dd

storage namespace openshift-cluster-storage-operator
image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:55527efb25dc71aa392b59f269afc5fed6a03af1bb0c2fa78a90cc67ac40342b
 * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:55527efb25dc71aa392b59f269afc5fed6a03af1bb0c2fa78a90cc67ac40342b
image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2d0505764aab80d4cc297727f5baea31efd4d8627b5e6f3ebcb6e3c0b82af19b
 * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2d0505764aab80d4cc297727f5baea31efd4d8627b5e6f3ebcb6e3c0b82af19b
image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:874c7266607cdf9cd6996d1a3345a493fd13b7f719263bfae3c10ddaf0ae1132
 * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:874c7266607cdf9cd6996d1a3345a493fd13b7f719263bfae3c10ddaf0ae1132

storage namespace openshift-kube-storage-version-migrator
image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:df263c82ee7da6142f4cd633b590468005f23e72f61427db3783d0c7b6120b3c
 * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:df263c82ee7da6142f4cd633b590468005f23e72f61427db3783d0c7b6120b3c

storage namespace openshift-kube-storage-version-migrator-operator
image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:552c2a0af54aa522e4e7545ce3d6813d7b103aea4a983387bca50a0a1178dc18
 * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:552c2a0af54aa522e4e7545ce3d6813d7b103aea4a983387bca50a0a1178dc18

storage namespace openshift-storage
image quay.io/rhceph-dev/cephcsi@sha256:6f873f8aaa4367ef835f43c35850d7bb86cc971fac7d0949d4079c58cb6728fc
 * quay.io/rhceph-dev/cephcsi@sha256:540c0b93f6d2c76845ebbfa96a728b5eb58f08fd4ec78641ba3d23aaadbfcc0c
image registry.redhat.io/openshift4/ose-csi-driver-registrar@sha256:39930a20d518455a9776fdae1f70945564fec4acd4f028a66ba9f24ee31bf1dc
 * registry.redhat.io/openshift4/ose-csi-driver-registrar@sha256:39930a20d518455a9776fdae1f70945564fec4acd4f028a66ba9f24ee31bf1dc
image registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:74504ef79d8bb8ec3d517bf47ef5513fcd183190915ef55b7e1ddaca1e98d2cc
 * registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:74504ef79d8bb8ec3d517bf47ef5513fcd183190915ef55b7e1ddaca1e98d2cc
image registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:c237b0349c7aba8b3f32f27392f90ad07e1ca4bede000ff3a6dea34253b2278e
 * registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:bbdf56eec860aeeead082f54c7a7685a63d54f230df83216493af5623c1d6498
image registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:12f6ed87b8b71443da15faa1c521cfac8fd5defeaf2734fb88c3305d8bd71a3d
 * registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:12f6ed87b8b71443da15faa1c521cfac8fd5defeaf2734fb88c3305d8bd71a3d
image quay.io/rhceph-dev/mcg-core@sha256:d2e4edc717533ae0bdede3d8ada917cec06a946e0662b560ffd4493fa1b51f27
 * quay.io/rhceph-dev/mcg-core@sha256:6a511b8d44d9ced96db9156a0b672f85f2424a671c8a2c978e6f52c1d37fe9e2
image registry.redhat.io/rhscl/mongodb-36-rhel7@sha256:ba74027bb4b244df0b0823ee29aa927d729da33edaa20ebdf51a2430cc6b4e95
 * registry.redhat.io/rhscl/mongodb-36-rhel7@sha256:ba74027bb4b244df0b0823ee29aa927d729da33edaa20ebdf51a2430cc6b4e95
image quay.io/rhceph-dev/mcg-operator@sha256:7883296b72541ce63d127cdfa0f92fcdd7d5e977add678365401ac668489c805
 * quay.io/rhceph-dev/mcg-operator@sha256:7883296b72541ce63d127cdfa0f92fcdd7d5e977add678365401ac668489c805
image quay.io/rhceph-dev/ocs-operator@sha256:a25b99a86f0fcabf2289c04495a75788e79f5e750425b8b54c056cfae958900c
 * quay.io/rhceph-dev/ocs-operator@sha256:2987b6300a63a155e8f20637b28f921804bf74bd34c6dbe1202890268a4a8a95
image quay.io/rhceph-dev/rhceph@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8
 * quay.io/rhceph-dev/rhceph@sha256:3def885ad9e8440c5bd6d5c830dafdd59edf9c9e8cce0042b0f44a5396b5b0f6
image quay.io/rhceph-dev/rook-ceph@sha256:d2a38f84f0c92d5427b41b9ff2b20db69c765291789e3419909d80255b1bbd7b
 * quay.io/rhceph-dev/rook-ceph@sha256:38e5d6daaaef3a933b6e2328efeaf79130011d74a77bc0451429e51d7aeaf3ff
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754