Bug 1824882 - [GSS][vsphere] Scaling OCS4 on a different StorageClass is not reflected correctly [NEEDINFO]
Keywords:
Status: CLOSED DUPLICATE of bug 1791532
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: management-console
Version: 4.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nishanth Thomas
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-04-16 15:37 UTC by Feras Al Taher
Modified: 2023-09-07 22:50 UTC
CC: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-17 06:46:40 UTC
Embargoed:
faltahe: needinfo? (jefbrown)



Description Feras Al Taher 2020-04-16 15:37:21 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
We are trying to scale OCS on vSphere using a different StorageClass. According to the OCS GUI and the Red Hat documentation [1], you can choose a different StorageClass when scaling the OCS cluster. However, the resulting YAML does not create another storageDeviceSet with the different StorageClass; it only increases the count of the original storageDeviceSet.


[1] https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.2/html-single/managing_openshift_container_storage/index#proc_scaling-up-storage-by-adding-capacity-to-your-openshift-container-storage-nodes_rhocs


Version of all relevant components (if applicable):
Tested so far in OCS 4.2


Does this issue impact your ability to continue to work with the product?
There is a workaround, but we are not sure whether it is supported.

Is there any workaround available to the best of your knowledge?
We can successfully scale OCS storage with a different StorageClass by editing the StorageCluster CR and adding a new storageDeviceSet that uses that StorageClass.
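
For illustration, the CR edit in this workaround looks roughly like the fragment below. This is only a sketch: the second device-set name 'ocs-deviceset-nfs' is made up for this example, and whether running device sets on two different StorageClasses is a supported configuration is exactly the open question above.

spec:
  storageDeviceSets:
  # (1) the original 'ocs-deviceset' entry with storageClassName: thin stays unchanged
  # (2) a second device set referencing the new StorageClass is appended:
  - count: 1
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Ti
        storageClassName: new-thin-sc  # the StorageClass backed by the other datastore
        volumeMode: Block
    name: ocs-deviceset-nfs            # hypothetical name, chosen for this example only
    placement: {}
    portable: true
    replica: 3
    resources: {}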


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
It is very simple to reproduce this bug.


Is this issue reproducible?
YES

Can this issue be reproduced from the UI?
YES


Steps to Reproduce:

1. Deployed the OCS cluster with OSDs using the 'thin' storage class.
2. While scaling up the storage cluster, chose the new storage class 'new-thin-sc', which uses a different datastore.
3. The storage cluster was scaled up, _but_ storage was consumed from the 'thin' SC and not from 'new-thin-sc'.
4. Further, checking the StorageCluster CR showed that the scale operation had only increased the count from 1 to 2 on the original storageDeviceSet.
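
A quick way to confirm steps 3 and 4 (a suggested check, not part of the original capture) is to list the OSD PVCs and see which StorageClass each one bound to; the STORAGECLASS column for the ocs-deviceset-* PVCs shows where the new capacity actually came from:

$ oc get pvc -n openshift-storage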

** Data Captured:

+ Default Storage Class:
$ oc get sc thin -oyaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2020-04-14T07:42:02Z"
  name: thin
  ownerReferences:
  - apiVersion: v1
    kind: clusteroperator
    name: storage
    uid: a029e6a6-0a47-4d4a-9088-7a027252b398
  resourceVersion: "9847"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/thin
  uid: a3e32ab4-21e5-4261-a811-d669eb0dfd97
parameters:
  diskformat: thin
provisioner: kubernetes.io/vsphere-volume
reclaimPolicy: Delete
volumeBindingMode: Immediate


+ New Storage Class using a different datastore:
$ oc get sc new-thin-sc -oyaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
  creationTimestamp: "2020-04-14T11:17:37Z"
  name: new-thin-sc
  ownerReferences:
  - apiVersion: v1
    kind: clusteroperator
    name: storage
    uid: a029e6a6-0a47-4d4a-9088-7a027252b398
  resourceVersion: "96067"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/new-thin-sc
  uid: 7d921c90-e1d9-4c07-b063-3e01eb8d7c02
parameters:
  datastore: NFSDatastore1
  diskformat: thin
provisioner: kubernetes.io/vsphere-volume
reclaimPolicy: Delete
volumeBindingMode: Immediate


+ StorageCluster CR after initial deployment:
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  creationTimestamp: "2020-04-14T08:07:33Z"
  generation: 1
  name: ocs-storagecluster
  namespace: openshift-storage
  resourceVersion: "100000"
  selfLink: /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
  uid: eb058c82-6c99-4152-a19b-d60ca6d99b6e
spec:
  manageNodes: false
  storageDeviceSets:
  - count: 1
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Ti
        storageClassName: thin
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: true
    replica: 3
    resources: {}
status:
  cephBlockPoolsCreated: true
  cephFilesystemsCreated: true
  cephObjectStoreUsersCreated: true
  cephObjectStoresCreated: true
  conditions:
  - lastHeartbeatTime: "2020-04-14T11:27:57Z"
    lastTransitionTime: "2020-04-14T08:07:36Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "True"
    type: ReconcileComplete
  - lastHeartbeatTime: "2020-04-14T11:27:57Z"
    lastTransitionTime: "2020-04-14T08:22:17Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "True"
    type: Available
  - lastHeartbeatTime: "2020-04-14T11:27:57Z"
    lastTransitionTime: "2020-04-14T09:49:03Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "False"
    type: Progressing
  - lastHeartbeatTime: "2020-04-14T11:27:57Z"
    lastTransitionTime: "2020-04-14T08:07:33Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "False"
    type: Degraded
  - lastHeartbeatTime: "2020-04-14T11:27:57Z"
    lastTransitionTime: "2020-04-14T09:49:03Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "True"
    type: Upgradeable
  failureDomain: rack
  nodeTopologies:
    labels:
      topology.rook.io/rack:
      - rack0
      - rack1
      - rack2
  phase: Ready
  relatedObjects:
  - apiVersion: ceph.rook.io/v1
    kind: CephCluster
    name: ocs-storagecluster-cephcluster
    namespace: openshift-storage
    resourceVersion: "99998"
    uid: ce9c3b9e-29ae-455c-88a2-8318afe69667
  - apiVersion: noobaa.io/v1alpha1
    kind: NooBaa
    name: noobaa
    namespace: openshift-storage
    resourceVersion: "29784"
    uid: 03907609-6011-495d-a0d6-1db9ddcaa077
  storageClassesCreated: true


+ StorageCluster CR after scaling up using different SC:
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  creationTimestamp: "2020-04-14T08:07:33Z"
  generation: 2
  name: ocs-storagecluster
  namespace: openshift-storage
  resourceVersion: "102226"
  selfLink: /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
  uid: eb058c82-6c99-4152-a19b-d60ca6d99b6e
spec:
  manageNodes: false
  storageDeviceSets:
  - count: 2
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Ti
        storageClassName: thin
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: true
    replica: 3
    resources: {}
status:
  cephBlockPoolsCreated: true
  cephFilesystemsCreated: true
  cephObjectStoreUsersCreated: true
  cephObjectStoresCreated: true
  conditions:
  - lastHeartbeatTime: "2020-04-14T11:33:02Z"
    lastTransitionTime: "2020-04-14T08:07:36Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "True"
    type: ReconcileComplete
  - lastHeartbeatTime: "2020-04-14T11:33:02Z"
    lastTransitionTime: "2020-04-14T08:22:17Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "True"
    type: Available
  - lastHeartbeatTime: "2020-04-14T11:33:02Z"
    lastTransitionTime: "2020-04-14T11:31:04Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "False"
    type: Progressing
  - lastHeartbeatTime: "2020-04-14T11:33:02Z"
    lastTransitionTime: "2020-04-14T08:07:33Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "False"
    type: Degraded
  - lastHeartbeatTime: "2020-04-14T11:33:02Z"
    lastTransitionTime: "2020-04-14T11:31:04Z"
    message: Reconcile completed successfully
    reason: ReconcileCompleted
    status: "True"
    type: Upgradeable
  failureDomain: rack
  nodeTopologies:
    labels:
      topology.rook.io/rack:
      - rack0
      - rack1
      - rack2
  phase: Ready
  relatedObjects:
  - apiVersion: ceph.rook.io/v1
    kind: CephCluster
    name: ocs-storagecluster-cephcluster
    namespace: openshift-storage
    resourceVersion: "102217"
    uid: ce9c3b9e-29ae-455c-88a2-8318afe69667
  - apiVersion: noobaa.io/v1alpha1
    kind: NooBaa
    name: noobaa
    namespace: openshift-storage
    resourceVersion: "29784"
    uid: 03907609-6011-495d-a0d6-1db9ddcaa077
  storageClassesCreated: true
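
+ As an additional check (suggested here, not included in the original capture), the datastore that actually backs each OSD volume can be read from the PV objects; for the in-tree vsphere-volume provisioner the datastore name appears in the volume path:

$ oc get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.storageClassName}{"\t"}{.spec.vsphereVolume.volumePath}{"\n"}{end}'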


+ must-gather location:
  https://drive.google.com/open?id=1BlkuaXpAGiVh8_ZsRKwGEzC0331rBKwe

Actual results:


Expected results:


Additional info:

Comment 4 Feras Al Taher 2020-04-17 10:21:04 UTC
Hey Bipin,

I am surprised that you closed this bug. First of all, the documentation and the public statements about OCS 4.3 explain that you can choose a different StorageClass. Can you please confirm that OCS 4.3 works as expected and that this issue is fixed in OCS 4.3? This is a BUG because the functionality does not work in OCS 4.2.

I advise you to first track whether this bug is fixed in OCS 4.3, because if it is not, then you should at least block users from choosing a different StorageClass in the OCS portal.

Comment 8 Raz Tamir 2020-04-17 17:25:07 UTC
The documentation (https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.2/html-single/managing_openshift_container_storage/index#proc_scaling-up-storage-by-adding-capacity-to-your-openshift-container-storage-nodes_rhocs)
clearly says: "From this dialog box, you can set the requested additional capacity and the storage class. The size should always be set in multiples of 2TiB. On AWS, the storage class should be set to gp2. On VMWare, the storage class should be set to thin."

Based on this, selecting a storage class other than thin or gp2 is not recommended.

