+++ This bug was initially created as a clone of Bug #2212333 +++

+++ This bug was initially created as a clone of Bug #2211866 +++

Description of problem:
=================================
I have a 4TiB cluster in Fusion aaS (agent-based install) to which I have attached a single consumer cluster. When I run tests creating a 1.2 TiB dataset from that consumer, my test fails. Here are my observations:

>> 1. ceph health shows the blockpool quota is full

ceph health detail
HEALTH_WARN 1 pool(s) full
[WRN] POOL_FULL: 1 pool(s) full
    pool 'cephblockpool-storageconsumer-54294405-cfae-4867-810d-1ff7290acf83-b5b8eee9' is full (running out of quota)

>> 2. On checking the storageconsumer, I see a granted capacity of 1TiB. Wasn't it supposed to be unlimited (approx. 1PB)?

$ oc get storageconsumer -n fusion-storage -o yaml
  spec:
>>  capacity: 1T
    enable: true
  status:
    cephResources:
    - kind: CephClient
      name: 74b7f702286c4ecf6c62197982adedfd
      status: Ready
>>  grantedCapacity: 1T
    lastHeartbeat: "2023-06-02T10:39:04Z"
    state: Ready

The quota per consumer was removed even from deployer-based installs of the current RH ODF MS, so the ocs-client-operator setting it to a default of 1TB is incorrect: the rest of the provider's capacity stays unutilized. By the same logic, on a 20TB cluster each consumer being able to use only 1TB is not the expected behavior.

Version-Release number of selected component (if applicable):
===============================================================
Consumer
========
oc get csv -n fusion-storage
NAME                                      DISPLAY                            VERSION        REPLACES                                PHASE
managed-fusion-agent.v2.0.11              Managed Fusion Agent               2.0.11                                                 Succeeded
observability-operator.v0.0.21            Observability Operator             0.0.21         observability-operator.v0.0.20          Succeeded
ocs-client-operator.v4.12.3-rhodf         OpenShift Data Foundation Client   4.12.3-rhodf                                           Succeeded
odf-csi-addons-operator.v4.12.3-rhodf     CSI Addons                         4.12.3-rhodf   odf-csi-addons-operator.v4.12.2-rhodf   Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator                4.10.0                                                 Succeeded

Provider
========
oc get csv -n fusion-storage
NAME                                      DISPLAY                            VERSION           REPLACES                                  PHASE
managed-fusion-agent.v2.0.11              Managed Fusion Agent               2.0.11                                                      Succeeded
observability-operator.v0.0.21            Observability Operator             0.0.21            observability-operator.v0.0.20            Succeeded
ocs-operator.v4.12.3-rhodf                OpenShift Container Storage        4.12.3-rhodf      ocs-operator.v4.12.2-rhodf                Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator                4.10.0                                                      Succeeded
route-monitor-operator.v0.1.500-6152b76   Route Monitor Operator             0.1.500-6152b76   route-monitor-operator.v0.1.498-e33e391   Succeeded

How reproducible:
=====================
Always

Steps to Reproduce:
======================
1. Create a provider and consumer cluster in Fusion aaS following the document [1]
   [1] https://docs.google.com/document/d/1Jdx8czlMjbumvilw8nZ6LtvWOMAx3H4TfwoVwiBs0nE/edit#
2. Check that requestedCapacity is incorrectly set to 1TB for the storageconsumer via the ocs-client-operator (see the verification sketch after these steps)
3.
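Verification sketch for step 2: the capacity granted to the consumer and the byte quota applied to its blockpool can be cross-checked with the commands below. The namespace and pool name are the ones from this report (adjust for other clusters), and the ceph command assumes access to the provider's Ceph toolbox/CLI.

# Capacity granted to the storage consumer (shows the incorrect 1T default described above)
$ oc get storageconsumer -n fusion-storage -o jsonpath='{.items[*].status.grantedCapacity}{"\n"}'

# Byte quota applied to the consumer's dedicated blockpool (run on the provider side)
$ ceph osd pool get-quota cephblockpool-storageconsumer-54294405-cfae-4867-810d-1ff7290acf83-b5b8eee9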
Actual results:
==================
Each consumer (ocs-storageclient) is able to use only 1TB of the provider's usable space.

Expected results:
===================
No quota should be set per consumer/ocs-storage-client.

Additional info:
=======================
apiVersion: misf.ibm.com/v1alpha1
kind: ManagedFusionOffering
metadata:
  name: managedfusionoffering-sample
  namespace: fusion-storage
spec:
  kind: DFC
  release: "4.12"
  config: |
    onboardingTicket: <ticket>
    providerEndpoint: XXXXX:31659

provider:

  cluster:
    id:     3aad2a98-c8ff-433a-86e8-f78f1bdb98be
    health: HEALTH_WARN
            1 pool(s) full

  services:
    mon: 3 daemons, quorum a,b,c (age 30h)
    mgr: a(active, since 30h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 30h), 3 in (since 30h)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 577 pgs
    objects: 239.46k objects, 934 GiB
    usage:   2.7 TiB used, 9.3 TiB / 12 TiB avail
    pgs:     577 active+clean

  io:
    client: 1.2 KiB/s rd, 2 op/s rd, 0 op/s wr

ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE    RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 2  ssd    4.00000  1.00000   4 TiB   934 GiB  932 GiB  39 KiB   1.9 GiB  3.1 TiB  22.81  1.00  577  up
 0  ssd    4.00000  1.00000   4 TiB   934 GiB  932 GiB  39 KiB   1.9 GiB  3.1 TiB  22.81  1.00  577  up
 1  ssd    4.00000  1.00000   4 TiB   934 GiB  932 GiB  39 KiB   1.9 GiB  3.1 TiB  22.81  1.00  577  up
                    TOTAL     12 TiB  2.7 TiB  2.7 TiB  118 KiB  5.7 GiB  9.3 TiB  22.81
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

$ ceph df
--- RAW STORAGE ---
CLASS  SIZE    AVAIL    USED     RAW USED  %RAW USED
ssd    12 TiB  9.3 TiB  2.7 TiB  2.7 TiB   22.81
TOTAL  12 TiB  9.3 TiB  2.7 TiB  2.7 TiB   22.81

--- POOLS ---
POOL                                                                          ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics                                                          1    1  15 KiB         6  46 KiB       0  2.5 TiB
ocs-storagecluster-cephfilesystem-metadata                                     2   32  16 KiB        22  131 KiB      0  2.5 TiB
ocs-storagecluster-cephfilesystem-ssd                                          3  512  0 B            0  0 B          0  2.5 TiB
cephblockpool-storageconsumer-54294405-cfae-4867-810d-1ff7290acf83-b5b8eee9    4   32  932 GiB  239.43k  2.7 TiB  26.79  2.5 TiB

--- Additional comment from Shekhar Berry on 2023-06-02 11:50:27 UTC ---

Must-gather not added as this is a known bug to the dev engineering team:
https://github.com/red-hat-storage/ocs-client-operator/blob/main/controllers/storageclient_controller.go#L272-L279

Let me know otherwise.

--- Additional comment from RHEL Program Management on 2023-06-02 13:09:06 UTC ---

This bug, having no release flag set previously, now has release flag 'odf-4.13.0' set to '?', and so is being proposed to be fixed in the ODF 4.13.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any were previously set while the release flag was missing, have now been reset, since the Acks are to be set against a release flag.

--- Additional comment from RHEL Program Management on 2023-06-02 13:09:06 UTC ---

Since this bug has severity set to 'urgent', it is being proposed as a blocker for the currently set release flag. Please resolve ASAP.

--- Additional comment from RHEL Program Management on 2023-06-05 10:32:33 UTC ---

This bug, having no release flag set previously, now has release flag 'odf-4.13.0' set to '?', and so is being proposed to be fixed in the ODF 4.13.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any were previously set while the release flag was missing, have now been reset, since the Acks are to be set against a release flag.

--- Additional comment from RHEL Program Management on 2023-06-05 10:32:33 UTC ---

Since this bug has severity set to 'urgent', it is being proposed as a blocker for the currently set release flag. Please resolve ASAP.
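Workaround note (not part of the fix; only for unblocking tests on an affected provider): the byte quota on the consumer's blockpool can be lifted manually from the Ceph side, since a max_bytes value of 0 disables the quota. The pool name below is the one from this report, and the operator may re-apply a quota on a later reconcile, so treat this strictly as a temporary measure.

# Run from the provider's Ceph toolbox
$ ceph osd pool set-quota cephblockpool-storageconsumer-54294405-cfae-4867-810d-1ff7290acf83-b5b8eee9 max_bytes 0
# Confirm the quota is gone
$ ceph osd pool get-quota cephblockpool-storageconsumer-54294405-cfae-4867-810d-1ff7290acf83-b5b8eee9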
--- Additional comment from Jilju Joy on 2023-06-07 07:09:03 UTC ---

Logs (from a different cluster):

Provider: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-jn6-pr/jijoy-jn6-pr_20230607T030415/logs/failed_testcase_ocs_logs_1686119226/test_deployment_ocs_logs/jijoy-jn6-pr/

Consumer: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-jn6-c1/jijoy-jn6-c1_20230607T030410/logs/testcases_1686119185/jijoy-jn6-c1/

--- Additional comment from RHEL Program Management on 2023-06-28 13:25:49 UTC ---

This BZ is being approved for an ODF 4.12.z z-stream update, upon receipt of the 3 ACKs (PM, Devel, QA) for the release flag 'odf-4.12.z', having been marked for an approved z-stream update.

--- Additional comment from RHEL Program Management on 2023-06-28 13:25:49 UTC ---

Since this bug has been approved for the ODF 4.12.5 release, through release flag 'odf-4.12.z+' and the appropriate update number entry in the 'Internal Whiteboard', the Target Release is being set to 'ODF 4.12.5'.

--- Additional comment from Sunil Kumar Acharya on 2023-07-05 03:51:55 UTC ---

Please backport the fix to ODF-4.12 and update the RDT flag/text appropriately.

--- Additional comment from Ritesh Chikatwar on 2023-07-06 09:29:43 UTC ---

Development of the fix is still in progress, hence moving the bug back to ASSIGNED.

--- Additional comment from Red Hat Bugzilla on 2023-08-03 08:29:58 UTC ---

Account disabled by LDAP Audit