Bug 1481306

Summary: [RFE] Pools: more verbose output in "ceph health detail" when pool is full - reach maximum value of quota
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tomas Petr <tpetr>
Component: RADOS
Assignee: Greg Farnum <gfarnum>
Status: CLOSED DEFERRED
QA Contact: Manohar Murthy <mmurthy>
Severity: medium
Priority: medium
Version: 2.3
CC: ceph-eng-bugs, dzafman, gfarnum, kchai, nojha, vumrao
Target Milestone: rc
Keywords: FutureFeature
Target Release: 4.*
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Type: Bug
Last Closed: 2020-05-08 23:25:25 UTC

Description Tomas Petr 2017-08-14 14:28:44 UTC
Description of problem:
[RFE] Pools: more verbose output in "ceph health detail" when pool is full - reach maximum value of quota

When a pool has a quota set and reaches that value, the cluster health goes to HEALTH_WARN and "pool '<pool_name>' is full" is shown in the output of "ceph -s" and "ceph health detail", but the output does not specify which quota was reached or its value.

The output of "ceph health detail" could show which pool quota has reached its maximum and the configured value, in the same way as it is already shown in ceph.log and the monitor log.
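
For illustration only (hypothetical wording, modeled on the existing ceph.log message quoted in comment 2), the enhanced output could look like:

# ceph health detail
HEALTH_WARN pool 'testquotas' is full
pool 'testquotas' is full (reached quota's max_objects: 100)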


Version-Release number of selected component (if applicable):
ceph-mon-10.2.7-28.el7cp.x86_64

Comment 2 Tomas Petr 2017-08-14 14:30:41 UTC
 "pool is full" behavior reproducer

- create a pool and set quotas of 100 objects and 1 GB of space:
# ceph osd pool create testquotas 64 64 replicated 
# ceph osd pool set-quota testquotas max_objects 100
# ceph osd pool set-quota testquotas max_bytes 1073741824
# ceph osd dump | grep pool

# ceph osd dump | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 23 flags hashpspool max_bytes 1073741824 max_objects 100 stripe_width 0
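
As a supplementary check (not part of the original reproducer), the configured limits can also be read back with "ceph osd pool get-quota"; the exact output wording varies by release, so the example below is approximate. Note that 1073741824 bytes = 1024^3 bytes = 1 GiB:

# ceph osd pool get-quota testquotas
quotas for pool 'testquotas':
  max objects: 100 objects
  max bytes  : 1GB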


- create test files
# for i in `seq 1 1024`; do dd if=/dev/zero of=file$i.img count=1 bs=1M; done

- test reaching the max_objects quota by putting exactly 100 objects
# for i in `seq 1 100`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done

# ceph -s
    cluster e5ae5b4b-da3c-49ee-8ec6-ae3cc57d150f
     health HEALTH_WARN
            pool 'testquotas' is full
     monmap e1: 3 mons at {mons-0=10.74.156.122:6789/0,mons-1=10.74.156.47:6789/0,mons-2=10.74.156.56:6789/0}
            election epoch 8, quorum 0,1,2 mons-1,mons-2,mons-0
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v10177: 176 pgs, 8 pools, 100 MB data, 271 objects
            421 MB used, 284 GB / 284 GB avail
                 176 active+clean


No message in ceph.log or the monitor log other than the warning that the full state was reached:
ceph.log
2017-08-14 08:49:00.624109 mon.0 10.74.156.47:6789/0 298 : cluster [WRN] pool 'testquotas' is full (reached quota's max_objects: 100)

ceph-mon.mons-1.log
2017-08-14 08:49:00.624107 7f194fc14700  0 log_channel(cluster) log [WRN] : pool 'testquotas' is full (reached quota's max_objects: 100)


# ceph osd dump | grep testquotas; ceph df | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 26 flags hashpspool,full max_bytes 1073741824 max_objects 100 stripe_width 0

    NAME                      ID     USED     %USED     MAX AVAIL     OBJECTS
    testquotas                7      102400k      0.10        97081M         100 

- the command uploading further objects hangs (the client pauses the writes):
[root@mons-1 tmp]# for i in `seq 100 110`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done
2017-08-14 08:49:43.328420 7fbb5c69f8c0  0 client.4303.objecter  FULL, paused modify 0x561a1314e710 tid 0

- increase pool quota for max_objects
# ceph osd pool set-quota testquotas max_objects 10000
set-quota max_objects = 10000 for pool testquotas

ceph.log / ceph-mon.mons-1.log
2017-08-14 08:52:55.638024 mon.0 10.74.156.47:6789/0 363 : cluster [INF] pool 'testquotas' no longer full; removing FULL flag

- the previously paused command gets unpaused and finishes successfully without needing a restart:
[root@mons-1 tmp]# for i in `seq 100 110`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done
2017-08-14 08:49:43.328420 7fbb5c69f8c0  0 client.4303.objecter  FULL, paused modify 0x561a1314e710 tid 0
[root@mons-1 tmp]# ceph osd dump | grep testquotas; ceph df | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 28 flags hashpspool max_bytes 1073741824 max_objects 10000 stripe_width 0
    testquotas                7      110M      0.11        97071M         110 


- test reaching the max_bytes quota
[root@mons-1 tmp]# for i in `seq 110 1024`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done

- again, no warning other than that the max_bytes quota (1073741824 bytes = 1024 MB) was reached:
2017-08-14 08:55:15.651778 mon.0 10.74.156.47:6789/0 424 : cluster [WRN] pool 'testquotas' is full (reached quota's max_bytes: 1024M)

# ceph osd dump | grep testquotas; ceph df | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 29 flags hashpspool,full max_bytes 1073741824 max_objects 10000 stripe_width 0
    testquotas                7      1024M      1.05        96046M        1024 

# ceph -s
    cluster e5ae5b4b-da3c-49ee-8ec6-ae3cc57d150f
     health HEALTH_WARN
            pool 'testquotas' is full
     monmap e1: 3 mons at {mons-0=10.74.156.122:6789/0,mons-1=10.74.156.47:6789/0,mons-2=10.74.156.56:6789/0}
            election epoch 8, quorum 0,1,2 mons-1,mons-2,mons-0
     osdmap e29: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v10287: 176 pgs, 8 pools, 1024 MB data, 1195 objects
            3205 MB used, 281 GB / 284 GB avail
                 176 active+clean


# ceph health detail 
HEALTH_WARN pool 'testquotas' is full
pool 'testquotas' is full


- uploading further objects again causes the command to get paused:
# for i in `seq 1 24`; do rados -p testquotas put objecta$i /var/log/ceph/file$i.img; done
2017-08-14 08:56:31.671106 7f089d4498c0  0 client.4777.objecter  FULL, paused modify 0x558fccfc5500 tid 0


2017-08-14 08:55:45.197823 mon.0 10.74.156.47:6789/0 436 : cluster [INF] HEALTH_WARN; pool 'testquotas' is full
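
Until "ceph health detail" itself carries this detail, a workaround sketch (not part of any fix for this bug) is to compare pool usage against the configured quotas directly. The JSON field names below are assumptions based on Jewel-era output and should be verified against the running version; jq is assumed to be installed:

# cat check_pool_quota.sh
#!/bin/bash
# Report which quota (if any) a pool has reached, since "ceph health detail"
# only reports that the pool is full. JSON field names are assumptions;
# verify them against "ceph osd pool get-quota <pool> -f json" and "ceph df -f json" locally.
POOL=${1:-testquotas}
QUOTA=$(ceph osd pool get-quota "$POOL" --format json)
DF=$(ceph df --format json)
MAX_OBJECTS=$(echo "$QUOTA" | jq '.quota_max_objects')
MAX_BYTES=$(echo "$QUOTA" | jq '.quota_max_bytes')
OBJECTS=$(echo "$DF" | jq ".pools[] | select(.name==\"$POOL\") | .stats.objects")
BYTES=$(echo "$DF" | jq ".pools[] | select(.name==\"$POOL\") | .stats.bytes_used")
if [ "$MAX_OBJECTS" -gt 0 ] && [ "$OBJECTS" -ge "$MAX_OBJECTS" ]; then
    echo "pool '$POOL' is full (reached quota's max_objects: $MAX_OBJECTS)"
fi
if [ "$MAX_BYTES" -gt 0 ] && [ "$BYTES" -ge "$MAX_BYTES" ]; then
    echo "pool '$POOL' is full (reached quota's max_bytes: $MAX_BYTES)"
fi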

Comment 4 Giridhar Ramaraju 2019-08-05 13:09:20 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting these to the appropriate QE Associate.

Regards,
Giri

Comment 5 Giridhar Ramaraju 2019-08-05 13:10:37 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting these to the appropriate QE Associate.

Regards,
Giri