Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
This project is now read-only. Starting Monday, February 2, please use https://ibm-ceph.atlassian.net/ for all bug tracking.

Bug 2096194

Summary: The data is not rebalanced between the disks after Ceph Cluster is Full
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Mudit Agarwal <muagarwa>
Component: RADOS
Assignee: Neha Ojha <nojha>
Status: CLOSED NOTABUG
QA Contact: Pawan <pdhiran>
Severity: high
Priority: unspecified
Version: 5.2
CC: akupczyk, amathuri, bhubbard, bniver, ceph-eng-bugs, choffman, ksirivad, lflores, madam, muagarwa, nberry, nojha, ocs-bugs, oviner, owasserm, pdhange, rfriedma, rzarzyns, skanta, sostapov, sseshasa, tdesala, vumrao
Target Milestone: ---
Target Release: 6.1
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Clone Of: 2090338
Last Closed: 2022-07-11 18:04:15 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Depends On: 2090338

Comment 3 Oded 2022-07-11 07:47:21 UTC
Vikhyat, I tested your procedure and it works as expected.

SetUp:
ODF Version: 4.11.0-110
OCP Version: 4.11.0-0.nightly-2022-07-06-062815
LSO Version: local-storage-operator.4.11.0-202206250809
Provider: Vmware

Test Process:
1. Fill the cluster to capacity with the benchmark operator:
https://github.com/cloud-bulldozer/benchmark-operator
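For reference, benchmark-operator drives such a fill workload through a Benchmark custom resource. A sketch along the lines of its FIO examples is shown below; the field values (name, namespace, sizes, storage class) are illustrative assumptions, not the CR actually used in this test:

```yaml
# Hypothetical Benchmark CR for an FIO write-fill workload
# (shape follows the benchmark-operator fio_distributed examples;
# all values here are illustrative only).
apiVersion: ripsaw.cloudbulldozer.io/v1alpha1
kind: Benchmark
metadata:
  name: fio-fill
  namespace: benchmark-operator
spec:
  workload:
    name: fio_distributed
    args:
      servers: 3
      jobs:
        - write
      filesize: 25GiB
      storageclass: ocs-storagecluster-ceph-rbd
      storagesize: 30Gi
```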

2. Check Ceph Status and ceph df:
sh-4.4$ ceph df
--- RAW STORAGE ---
CLASS     SIZE   AVAIL     USED  RAW USED  %RAW USED
hdd    300 GiB  46 GiB  254 GiB   254 GiB      84.62
TOTAL  300 GiB  46 GiB  254 GiB   254 GiB      84.62
 
--- POOLS ---
POOL                                                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
ocs-storagecluster-cephblockpool                        1   32   84 GiB   21.92k  253 GiB  99.58    366 MiB
device_health_metrics                                   2    1      0 B        0      0 B      0    366 MiB
ocs-storagecluster-cephobjectstore.rgw.control          3   32      0 B        8      0 B      0    366 MiB
ocs-storagecluster-cephobjectstore.rgw.buckets.index    4   32      0 B       22      0 B      0    366 MiB
ocs-storagecluster-cephobjectstore.rgw.log              5   32   34 KiB      308  1.9 MiB   0.18    366 MiB
ocs-storagecluster-cephobjectstore.rgw.meta             6   32  7.7 KiB       16  180 KiB   0.02    366 MiB
.rgw.root                                               7   32  4.8 KiB       16  180 KiB   0.02    366 MiB
ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec   8   32      0 B        0      0 B      0    366 MiB
ocs-storagecluster-cephfilesystem-metadata              9   32   23 KiB       22  159 KiB   0.01    366 MiB
ocs-storagecluster-cephfilesystem-data0                10   32      0 B        0      0 B      0    366 MiB
ocs-storagecluster-cephobjectstore.rgw.buckets.data    11   32    1 KiB        1   12 KiB      0    366 MiB


sh-4.4$ ceph status
  cluster:
    id:     49534fd9-abd6-4892-b493-72c20614d854
    health: HEALTH_WARN
            3 backfillfull osd(s)
            11 pool(s) backfillfull
 
  services:
    mon: 3 daemons, quorum a,b,c (age 4h)
    mgr: a(active, since 4h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 4h), 3 in (since 4h)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   11 pools, 321 pgs
    objects: 22.32k objects, 84 GiB
    usage:   254 GiB used, 46 GiB / 300 GiB avail
    pgs:     321 active+clean
 
  io:
    client:   1.2 KiB/s rd, 6.3 KiB/s wr, 2 op/s rd, 0 op/s wr
  
 sh-4.4$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL   %USE   VAR   PGS  STATUS
 0    hdd  0.09769   1.00000  100 GiB   85 GiB   84 GiB   83 KiB  402 MiB  15 GiB  84.60  1.00  321      up
 1    hdd  0.09769   1.00000  100 GiB   85 GiB   84 GiB   83 KiB  452 MiB  15 GiB  84.65  1.00  321      up
 2    hdd  0.09769   1.00000  100 GiB   85 GiB   84 GiB   83 KiB  455 MiB  15 GiB  84.65  1.00  321      up
                       TOTAL  300 GiB  254 GiB  253 GiB  251 KiB  1.3 GiB  46 GiB  84.63                  
MIN/MAX VAR: 1.00/1.00  STDDEV: 0.02
 
 3. Check backfillfull_ratio:
sh-4.4$ ceph osd dump | grep backfillfull_ratio
backfillfull_ratio 0.8
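The HEALTH_WARN in step 2 follows directly from this ratio: each OSD sits at about 84.6% used, above the 0.80 backfillfull threshold but below the full ratio, so backfill is blocked while client I/O continues. A minimal sketch of that comparison, using the numbers captured above rather than a live cluster (full_ratio was not captured; 0.85, the value typically set on ODF clusters, is an assumption):

```shell
# Compare %RAW USED from `ceph df` against the thresholds from `ceph osd dump`.
raw_used=84.62        # %RAW USED from `ceph df` above
backfillfull=0.80     # backfillfull_ratio from `ceph osd dump` above
full=0.85             # assumed full_ratio (not shown in the captured output)

state=$(awk -v u="$raw_used" -v b="$backfillfull" -v f="$full" 'BEGIN {
  if (u / 100 >= f)      print "full: client writes are blocked";
  else if (u / 100 >= b) print "backfillfull: backfill is blocked";
  else                   print "below both thresholds";
}')
echo "$state"
```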

4. Change backfillfull_ratio to 0.95:
sh-4.4$ ceph osd set-backfillfull-ratio 0.95
osd set-backfillfull-ratio 0.95
sh-4.4$ ceph osd dump | grep backfillfull_ratio
backfillfull_ratio 0.95
 
 5. Delete the benchmark data to free space
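The data removed here is what the benchmark wrote in step 1. A cleanup sketch with hypothetical resource names and a dry-run guard (DRY_RUN=1, the default, only prints the commands instead of running them):

```shell
# Remove the benchmark workload so Ceph can reclaim the space.
# The namespace and resource selectors are assumptions, not taken from this cluster.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run oc delete benchmark --all -n benchmark-operator   # hypothetical namespace
run oc delete pvc --all -n benchmark-operator         # reclaim the backing RBD images
```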
 
 6. Check ceph status:
 sh-4.4$ ceph status
  cluster:
    id:     49534fd9-abd6-4892-b493-72c20614d854
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 4h)
    mgr: a(active, since 4h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 4h), 3 in (since 4h)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    
 7. Change backfillfull_ratio back to 0.8:
sh-4.4$ ceph osd set-backfillfull-ratio 0.80
osd set-backfillfull-ratio 0.8
sh-4.4$ ceph osd dump | grep backfillfull_ratio
backfillfull_ratio 0.8
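Taken together, steps 4 through 7 amount to a short recovery procedure: temporarily raise the threshold, delete data, then restore the default once the cluster is healthy. A condensed sketch, again with a dry-run guard (DRY_RUN=1, the default, prints the ceph commands; set DRY_RUN=0 inside the toolbox pod to execute them):

```shell
# Condensed recovery procedure from steps 4-7 above.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run ceph osd set-backfillfull-ratio 0.95   # step 4: lift the threshold temporarily
# step 5: delete data here, then wait for `ceph status` to report HEALTH_OK
run ceph osd set-backfillfull-ratio 0.80   # step 7: restore the original ratio
```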

Comment 4 Vikhyat Umrao 2022-07-11 18:04:15 UTC
Okay, cool! Closing this one as NOTABUG.