Bug 2090338 - [Tracker for Ceph BZ #2096194] The data is not rebalanced between the disks after Ceph Cluster is Full
Summary: [Tracker for Ceph BZ #2096194] The data is not rebalanced between the disks after Ceph Cluster is Full
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Neha Ojha
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks: 2096194
 
Reported: 2022-05-25 14:32 UTC by Oded
Modified: 2023-08-09 16:37 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2096194
Environment:
Last Closed: 2022-07-11 18:05:16 UTC
Embargoed:



Description Oded 2022-05-25 14:32:00 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

1. I was able to delete the PVCs when the pool usage was 100% and the used capacity was 85%.
2. The PVs moved to the Released state although the reclaim policy of the ceph-rbd storage class is Delete.
3. After deleting the PVs manually, the data was not deleted.
4. Following KCS https://access.redhat.com/solutions/3001761, I changed set-full-ratio from 0.85 to 0.97.
5. The relevant data was then deleted; however, the data was not rebalanced between the disks
[on disk2 the used capacity is 83%, while on disk1 and disk0 it is 30%]

sh-4.4$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 2    hdd  0.50000   1.00000  512 GiB  429 GiB  428 GiB  113 KiB  918 MiB   83 GiB  83.70  1.75  145      up
 1    hdd  0.50000   1.00000  512 GiB  154 GiB  152 GiB  141 KiB  1.6 GiB  358 GiB  30.02  0.63  177      up
 0    hdd  0.50000   1.00000  512 GiB  154 GiB  152 GiB  141 KiB  1.6 GiB  358 GiB  30.02  0.63  177      up
                       TOTAL  1.5 TiB  736 GiB  732 GiB  397 KiB  4.1 GiB  800 GiB  47.91    
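
For reference, the sh-4.4$ commands in this report are run from the Ceph toolbox pod. A minimal sketch of getting that shell, assuming a default ODF install in the openshift-storage namespace and the documented OCSInitialization toggle:

# Enable the toolbox (no-op if already enabled), then open a shell in it.
$ oc patch OCSInitialization ocsinit -n openshift-storage --type json \
    --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
$ oc -n openshift-storage rsh deploy/rook-ceph-tools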

Version of all relevant components (if applicable):
ODF Version: 4.11.0-69
OCP Version: 4.11.0-0.nightly-2022-05-11-054135
OSD Size: 512G
Number of disks: 3
Provider: VMware
sh-4.4$ ceph versions
{
    "mon": {
        "ceph version 16.2.7-112.el8cp (e18db2ff03ac60c64a18f3315c032b9d5a0a3b8f) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-112.el8cp (e18db2ff03ac60c64a18f3315c032b9d5a0a3b8f) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-112.el8cp (e18db2ff03ac60c64a18f3315c032b9d5a0a3b8f) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-112.el8cp (e18db2ff03ac60c64a18f3315c032b9d5a0a3b8f) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.7-112.el8cp (e18db2ff03ac60c64a18f3315c032b9d5a0a3b8f) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-112.el8cp (e18db2ff03ac60c64a18f3315c032b9d5a0a3b8f) pacific (stable)": 10
    }
}


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Fill the disks to 85% with the Benchmark Operator [10 FIO pods + 10 PVCs]:
https://github.com/Oded1990/odf-scripts/blob/main/interactive_scripts/run_benchmark_fio.py
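
The script above drives the Benchmark Operator. Purely as an illustration of the same fill, a hypothetical manual equivalent using a single PVC and a dd writer pod (names and sizes are made up for the example; dd is used instead of fio to keep it self-contained):

$ cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fill-test-pvc          # hypothetical name
  namespace: benchmark-operator
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 40Gi
  storageClassName: ocs-storagecluster-ceph-rbd
---
apiVersion: v1
kind: Pod
metadata:
  name: fill-test-pod          # hypothetical name
  namespace: benchmark-operator
spec:
  restartPolicy: Never
  containers:
  - name: writer
    image: registry.access.redhat.com/ubi8/ubi
    # Write ~30 GiB of zeros into the mounted PVC, then idle.
    command: ["sh", "-c", "dd if=/dev/zero of=/mnt/data/fill bs=1M count=30000; sleep infinity"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: fill-test-pvc
EOF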

2. Check Ceph df:
sh-4.4$ ceph df      
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    1.5 TiB  230 GiB  1.3 TiB   1.3 TiB      85.00
TOTAL  1.5 TiB  230 GiB  1.3 TiB   1.3 TiB      85.00
 
--- POOLS ---
POOL                                                   ID  PGS   STORED  OBJECTS     USED   %USED  MAX AVAIL
ocs-storagecluster-cephblockpool                        1   32  433 GiB  111.64k  1.3 TiB  100.00        0 B
device_health_metrics                                   2    1  4.2 KiB        3   13 KiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.log              3    8   46 KiB      340  2.0 MiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.control          4    8      0 B        8      0 B       0        0 B
ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec   5    8      0 B        0      0 B       0        0 B
ocs-storagecluster-cephobjectstore.rgw.buckets.index    6    8  8.3 KiB       22   25 KiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.meta             7    8   15 KiB       16  201 KiB  100.00        0 B
.rgw.root                                               8    8  4.9 KiB       16  180 KiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.buckets.data     9   32    1 KiB        1   12 KiB  100.00        0 B
ocs-storagecluster-cephfilesystem-metadata             10   32  3.1 MiB       25  9.5 MiB  100.00        0 B
ocs-storagecluster-cephfilesystem-data0                11   32  708 MiB      202  2.1 GiB  100.00        0 B

3. Delete all FIO pods + PVCs [works as expected]:
http://pastebin.test.redhat.com/1052896
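
The exact cleanup is in the pastebin above; an illustrative equivalent, assuming everything lives in the benchmark-operator namespace (as the PV claims in step 9 suggest):

$ oc -n benchmark-operator delete pod --all
$ oc -n benchmark-operator delete pvc --all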

4. Wait ~1H

5. Check Ceph status and osd df:
sh-4.4$ ceph df      
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    1.5 TiB  230 GiB  1.3 TiB   1.3 TiB      85.00
TOTAL  1.5 TiB  230 GiB  1.3 TiB   1.3 TiB      85.00
 
--- POOLS ---
POOL                                                   ID  PGS   STORED  OBJECTS     USED   %USED  MAX AVAIL
ocs-storagecluster-cephblockpool                        1   32  433 GiB  111.64k  1.3 TiB  100.00        0 B
device_health_metrics                                   2    1  4.2 KiB        3   13 KiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.log              3    8   46 KiB      340  2.0 MiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.control          4    8      0 B        8      0 B       0        0 B
ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec   5    8      0 B        0      0 B       0        0 B
ocs-storagecluster-cephobjectstore.rgw.buckets.index    6    8  8.3 KiB       22   25 KiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.meta             7    8   15 KiB       16  201 KiB  100.00        0 B
.rgw.root                                               8    8  4.9 KiB       16  180 KiB  100.00        0 B
ocs-storagecluster-cephobjectstore.rgw.buckets.data     9   32    1 KiB        1   12 KiB  100.00        0 B
ocs-storagecluster-cephfilesystem-metadata             10   32  3.1 MiB       25  9.5 MiB  100.00        0 B
ocs-storagecluster-cephfilesystem-data0                11   32  708 MiB      202  2.1 GiB  100.00        0 B

***********************************************************

sh-4.4$ ceph status
  cluster:
    id:     a28570c8-d885-4974-ba0a-f89964c007d6
    health: HEALTH_ERR
            1 backfillfull osd(s)
            2 full osd(s)
            11 pool(s) full
 
  services:
    mon: 3 daemons, quorum a,b,c (age 2d)
    mgr: a(active, since 2d)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 2d), 3 in (since 2d)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   11 pools, 177 pgs
    objects: 112.27k objects, 434 GiB
    usage:   1.3 TiB used, 230 GiB / 1.5 TiB avail
    pgs:     177 active+clean
 
  io:
    client:   853 B/s rd, 1 op/s rd, 0 op/s wr
 
*****************************************************************
sh-4.4$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 2    hdd  0.50000   1.00000  512 GiB  435 GiB  434 GiB   98 KiB  1.7 GiB   77 GiB  85.00  1.00  177      up
 1    hdd  0.50000   1.00000  512 GiB  435 GiB  434 GiB   98 KiB  1.6 GiB   77 GiB  85.00  1.00  177      up
 0    hdd  0.50000   1.00000  512 GiB  435 GiB  434 GiB   98 KiB  1.7 GiB   77 GiB  85.00  1.00  177      up
                       TOTAL  1.5 TiB  1.3 TiB  1.3 TiB  297 KiB  5.0 GiB  230 GiB  85.00                  
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

6. Restart the rook-ceph-operator pod:
$ oc delete pod rook-ceph-operator-64cfc9c7df-47w97
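
Since the operator pod name hash differs per cluster, an equivalent restart by deployment (assuming the usual openshift-storage namespace) avoids looking it up:

$ oc -n openshift-storage rollout restart deploy/rook-ceph-operator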

7. Check Ceph status and osd df [same results as in step 5].

8. Check the ceph-rbd storage class; the RECLAIMPOLICY is "Delete":
$ oc get sc ocs-storagecluster-ceph-rbd
NAME                          PROVISIONER                          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ocs-storagecluster-ceph-rbd   openshift-storage.rbd.csi.ceph.com   Delete          Immediate           true                   3d

9. All old PVs were in the Released state: http://pastebin.test.redhat.com/1052965
$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                                            STORAGECLASS                  REASON   AGE
pvc-183a828d-6dbf-4554-bc10-4a732fa58bcf   40Gi       RWO            Delete           Released   benchmark-operator/claim-10-16e9a170                             ocs-storagecluster-ceph-rbd            2d20h

10. Deleted all PVs manually [the PVs were deleted].
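
For reference, a sketch of deleting only the Released PVs in bulk, keyed off the STATUS column shown in step 9 (illustration only; double-check the selection before deleting anything):

$ oc get pv --no-headers | awk '$5 == "Released" {print $1}' | xargs -r oc delete pv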

11. Wait ~1 hour: the used capacity is stuck at 85% and the pools at 100%.
Why are the PVs not deleted automatically if RECLAIMPOLICY=Delete?
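
A plausible explanation (not confirmed in this BZ): with the OSDs full, Ceph rejects the operations the RBD CSI driver needs to remove the backing images, so the external provisioner's delete calls fail and the PVs stay Released until the full ratio is raised. Assuming the default ODF deployment name, any such failures should be visible in the RBD provisioner logs:

$ oc -n openshift-storage logs deploy/csi-rbdplugin-provisioner -c csi-provisioner --tail=100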

12. Changed set-full-ratio [from 0.85 to 0.97]:
bash-4.4$ ceph osd set-full-ratio 0.97
osd set-full-ratio 0.97

13. Wait ~1 hour.

14. The relevant data was deleted:
sh-4.4$ ceph status
  cluster:
    id:     a28570c8-d885-4974-ba0a-f89964c007d6
    health: HEALTH_WARN
            1 backfillfull osd(s)
            Low space hindering backfill (add storage if this doesn't resolve itself): 32 pgs backfill_toofull
            Degraded data redundancy: 39294/119781 objects degraded (32.805%), 32 pgs degraded, 32 pgs undersized
            11 pool(s) backfillfull
 
  services:
    mon: 3 daemons, quorum a,b,c (age 28h)
    mgr: a(active, since 8d)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 25h), 3 in (since 8d); 32 remapped pgs
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   11 pools, 177 pgs
    objects: 39.93k objects, 153 GiB
    usage:   736 GiB used, 800 GiB / 1.5 TiB avail
    pgs:     39294/119781 objects degraded (32.805%)
             145 active+clean
             32  active+undersized+degraded+remapped+backfill_toofull
 
  io:
    client:   1.2 KiB/s rd, 15 KiB/s wr, 2 op/s rd, 1 op/s wr
 
  progress:
    Global Recovery Event (1h)
      [======================......] (remaining: 5h)
      
      
sh-4.4$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 2    hdd  0.50000   1.00000  512 GiB  429 GiB  428 GiB  113 KiB  918 MiB   83 GiB  83.70  1.75  145      up
 1    hdd  0.50000   1.00000  512 GiB  154 GiB  152 GiB  141 KiB  1.6 GiB  358 GiB  30.02  0.63  177      up
 0    hdd  0.50000   1.00000  512 GiB  154 GiB  152 GiB  141 KiB  1.6 GiB  358 GiB  30.02  0.63  177      up
                       TOTAL  1.5 TiB  736 GiB  732 GiB  397 KiB  4.1 GiB  800 GiB  47.91                  
MIN/MAX VAR: 0.63/1.75  STDDEV: 25.31

15. The data is still not rebalanced after more than 48 hours:
on disk2 the used capacity is 83%, while on disk1 and disk0 it is 30%.
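
The ceph status in step 14 already reports 32 PGs in backfill_toofull and 11 pools backfillfull, which suggests the backfillfull threshold, not the balancer, is what is blocking recovery. A few read-only toolbox commands to confirm what is holding up the rebalance (a sketch; nothing here changes cluster state):

sh-4.4$ ceph osd dump | grep -i ratio      # nearfull / backfillfull / full ratios in effect
sh-4.4$ ceph pg ls backfill_toofull        # PGs whose backfill is blocked
sh-4.4$ ceph balancer status               # whether the balancer is enabled and active
sh-4.4$ ceph health detail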


Actual results:
The data is not rebalanced between the disks.

Expected results:
The data is rebalanced evenly across the disks.

Additional info:
ODF must-gather:
http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2090338/

Comment 2 Mudit Agarwal 2022-06-13 09:15:57 UTC
Not a 4.11 blocker, created a ceph tracker for better attention.

Comment 3 Vikhyat Umrao 2022-06-13 19:36:21 UTC
(In reply to Mudit Agarwal from comment #2)
> Not a 4.11 blocker, created a ceph tracker for better attention.

This is not a bug; the correct config option was not used. I provided feedback in the Ceph bug [1]; please test again and it should help.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2096194#c2
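
The authoritative guidance is in the linked Ceph bug; as an illustration only: backfill of remapped PGs is gated by the backfillfull ratio rather than the full ratio, and the status in step 14 already flags the 83.7%-used OSD as backfillfull, so raising only set-full-ratio can leave those 32 PGs stuck. A hypothetical adjustment (example values only, to be reverted once recovery completes):

sh-4.4$ ceph osd set-backfillfull-ratio 0.90
sh-4.4$ ceph osd dump | grep -i ratio   # confirm the new threshold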

Comment 4 Vikhyat Umrao 2022-07-11 18:05:16 UTC
(In reply to Vikhyat Umrao from comment #3)
> (In reply to Mudit Agarwal from comment #2)
> > Not a 4.11 blocker, created a ceph tracker for better attention.
> 
> This is not a bug; the correct config option was not used. I provided
> feedback in the Ceph bug [1]; please test again and it should help.
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=2096194#c2

https://bugzilla.redhat.com/show_bug.cgi?id=2096194#c3 - closing this one as NOTABUG, as it is working fine now with the correct config option.

