Description of problem (please be detailed as possible and provide log snippests): Version of all relevant components (if applicable): Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)? Is there any workaround available to the best of your knowledge? Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)? Can this issue reproducible? Can this issue reproduce from the UI? If this is a regression, please provide more details to justify this: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Sorry for the blank description above

Description of problem (please be detailed as possible and provide log snippets):
With an OCS 4.3 cluster deployed over vSphere, when attempting to add capacity from the console and picking the 'thin' storage class, the Raw Capacity field is grayed out and shows 'NaN TB'.

Version of all relevant components (if applicable):
OpenShift Version 4.3.0-0.nightly-2020-03-20-053743
OCS operator 4.3.0-377.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
I can proceed with add capacity, but there is no way to know how much will be added to the storage cluster. See attached screenshot.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Add capacity for a cluster deployed over vSphere and check 'Raw Capacity'
Created attachment 1672693 [details] Screenshot of Add Capacity prompt
(In reply to Elad from comment #1)
> Description of problem (please be detailed as possible and provide log
> snippets):
> With an OCS 4.3 cluster deployed over vSphere, when attempting to add
> capacity from the console, and picking the 'thin' storage class, the Raw
> Capacity is grayed out with 'NaN TB' mentioned.

How did you add the initial capacity?

Thanks - Michael
Very important detail that is missing from the description (comment #1) - this cluster was deployed with an unsupported OSD size:

[Elad@localhost ocs-ci]$ oc get pvc -n openshift-storage
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
db-noobaa-db-0            Bound    pvc-eb44ee61-3064-4054-bb2d-745223301885   50Gi       RWO            ocs-storagecluster-ceph-rbd   2d20h
ocs-deviceset-0-0-wwqh8   Bound    pvc-4bd17a20-1ce1-402e-ba94-37c8b29d2a11   340Gi      RWO            thin                          2d20h
ocs-deviceset-0-1-9tfgf   Bound    pvc-20b003e5-d657-4c7a-b823-44005f8a5054   340Gi      RWO            thin                          41m
ocs-deviceset-1-0-cwdk8   Bound    pvc-9615e2a5-dc33-48f6-8d1b-98db56364eee   340Gi      RWO            thin                          2d20h
ocs-deviceset-1-1-dpn2x   Bound    pvc-ee85b8ee-4d01-4b71-9c82-12a7ba2948c8   340Gi      RWO            thin                          41m
ocs-deviceset-2-0-2phw9   Bound    pvc-aab897e3-ed1f-4aff-aa9b-c0d77cd80ec7   340Gi      RWO            thin                          2d20h
ocs-deviceset-2-1-4dbzf   Bound    pvc-04070c9d-2df0-4da7-bdc2-d07e4448d9bf   340Gi      RWO            thin                          41m
rook-ceph-mon-a           Bound    pvc-b8859e62-eeba-4d36-b563-819dfb8a2312   10Gi       RWO            thin                          2d20h
rook-ceph-mon-b           Bound    pvc-6774f65d-3912-4dce-a8eb-0bf2f922e7a7   10Gi       RWO            thin                          2d20h
rook-ceph-mon-c           Bound    pvc-52546f3e-1f50-4621-9783-d8b2835e1d11   10Gi       RWO            thin                          2d20h
rook-ceph-mon-d           Bound    pvc-774abe36-e0b0-4a79-aacb-7c7b2d133feb   10Gi       RWO            thin                          2d18h
rook-ceph-mon-e           Bound    pvc-fdffcd13-8b50-4f99-a763-926515144aff   10Gi       RWO            thin                          2d5h

So, I will check with one of the supported sizes (0.5/2/4).
With that said, we should fix this too so that the calculation is generic for any OSD size. That can be fixed after 4.3.
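For illustration only, a size-agnostic calculation along the lines suggested above might look like the Python sketch below. This is not the console's actual code: the function names (parse_quantity, raw_capacity_added, format_binary) are hypothetical, the suffix table is a simplified subset of the Kubernetes quantity format, and the replica count of 3 is an assumption based on the three device-set groups in the PVC listing.

```python
import re

# Simplified subset of Kubernetes quantity suffixes (assumption: enough for PVC sizes).
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}
_QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def parse_quantity(text):
    """Convert a size string such as '340Gi' into a number of bytes."""
    match = _QUANTITY.match(text.strip())
    if match is None or match.group(2) not in _SUFFIXES:
        # Fail loudly instead of letting a NaN reach the UI.
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _SUFFIXES[match.group(2)]


def raw_capacity_added(osd_size, replica=3):
    """Raw bytes added by one expansion: one OSD of osd_size per replica.

    replica=3 is an assumption, not a value taken from the product.
    """
    return parse_quantity(osd_size) * replica


def format_binary(size_bytes):
    """Render a byte count using the largest binary unit that fits."""
    for unit, factor in (("TiB", 2**40), ("GiB", 2**30), ("MiB", 2**20), ("KiB", 2**10)):
        if size_bytes >= factor:
            return f"{size_bytes / factor:.2f} {unit}"
    return f"{size_bytes:.0f} B"


if __name__ == "__main__":
    for size in ("340Gi", "512Gi"):
        print(size, "->", format_binary(raw_capacity_added(size)))
```

Because the size is parsed rather than looked up in a fixed list, an arbitrary value such as 340Gi yields a number, and a malformed value raises an error that the caller can turn into a fallback message.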
Created attachment 1672744 [details] Add capacity screenshot

With the OCS and OCP versions below, no NaN value was observed for the cluster capacity while adding capacity to the cluster. Please refer to the attached image for more details.

(venv) [rperiyas@localhost ocs-ci]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-03-20-053743   True        False         8h      Cluster version is 4.3.0-0.nightly-2020-03-20-053743

(venv) [rperiyas@localhost ocs-ci]$ oc get csv -n openshift-storage
NAME                            DISPLAY                       VERSION        REPLACES   PHASE
lib-bucket-provisioner.v1.0.0   lib-bucket-provisioner        1.0.0                     Succeeded
ocs-operator.v4.3.0-377.ci      OpenShift Container Storage   4.3.0-377.ci              Succeeded
More info: the cluster was deployed with 0.5TB OSD size.

(venv) [rperiyas@localhost ocs-ci]$ oc get pvc -n openshift-storage
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
db-noobaa-db-0            Bound    pvc-023796fa-aec9-425a-8885-20e66cb8d937   50Gi       RWO            ocs-storagecluster-ceph-rbd   7h58m
ocs-deviceset-0-0-pjm5g   Bound    pvc-4b6affbf-2166-4e9b-998c-a3f01c48bb53   512Gi      RWO            thin                          7h59m
ocs-deviceset-0-1-lk7jp   Bound    pvc-fe7b9b5a-805f-4b12-9f4c-e6153625111a   512Gi      RWO            thin                          7h55m
ocs-deviceset-0-2-944c2   Bound    pvc-5f541d18-d687-4729-a6c8-98d9813af35d   512Gi      RWO            thin                          7h52m
ocs-deviceset-1-0-svcm5   Bound    pvc-496f5078-33fd-4518-9f35-0cbda774bfc7   512Gi      RWO            thin                          7h59m
ocs-deviceset-1-1-bz7hk   Bound    pvc-ad02b6b9-259b-480f-99e3-0a5e83cbb886   512Gi      RWO            thin                          7h55m
ocs-deviceset-1-2-859cx   Bound    pvc-4750135b-793e-405c-a724-b6051755647a   512Gi      RWO            thin                          7h52m
ocs-deviceset-2-0-wk7b4   Bound    pvc-050b84d3-5fb3-4d0a-b1a2-804e163e7484   512Gi      RWO            thin                          7h59m
ocs-deviceset-2-1-rqqj4   Bound    pvc-e805dee5-b5ea-4979-afea-908a6aa80938   512Gi      RWO            thin                          7h55m
ocs-deviceset-2-2-85qft   Bound    pvc-f5af5fc0-2cd6-48da-8776-cd0803e20150   512Gi      RWO            thin                          7h52m
rook-ceph-mon-a           Bound    pvc-6965347e-00b2-496a-b4d8-1be3b8b92bc4   10Gi       RWO            thin                          8h
rook-ceph-mon-b           Bound    pvc-6f508b0e-19fe-408e-ace2-ed7960d0b828   10Gi       RWO            thin                          8h
rook-ceph-mon-c           Bound    pvc-e5b555e4-e213-4dbe-81db-38368950e839   10Gi       RWO            thin                          8h
Thanks Ram. Per your findings, moving to 4.4
Nishanth, how can we proceed with the fix (of the calculation to be generic for any OSD size)? Should we move this BZ to OCP?
Hi Elad,

This looks like a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1798164

Additionally, making the add-capacity modal generic does not sound good, since it must comply with what is supported and with the install flow.

If we do want to address this issue, we would rather add a fallback UI in 4.5+. We skipped that at the time because:
1. We wanted to push minimal changes so that this feature could go into OCP 4.3 as a bug fix.
2. This scenario is highly unlikely to be encountered.

I am not sure whether a customer would ever apply a custom storage cluster; if that can happen, we would treat this as high severity.
@Elad, as we discussed yesterday, could you please verify this with supported OSD sizes and see if it is reproducible?
@Nishanth - We already did. See comments 6 and 7. My question in comment 9 was answered by Afreen in comment 10.
I am not sure why we need to keep this open. From https://bugzilla.redhat.com/show_bug.cgi?id=1816169#c6 it's clear that this issue is not observed if the cluster is created/expanded with a supported OSD size. This will remain as it is unless there is a change in requirements. I am closing this bug.
The main reason for tracking this issue is LSO-based deployments, where the deployment is done via the CLI with a storagecluster YAML created by the user, using the OSD size that fits their HW/instance.