Description of problem:
[RFE] Pools: more verbose output in "ceph health detail" when a pool is full (has reached its quota)

When a pool has a quota set and usage reaches that value, health goes to HEALTH_WARN and "pool '<pool_name>' is full" is shown in the output of "ceph -s" and "ceph health detail", but neither specifies which quota was reached or its value. The output of "ceph health detail" could show which pool quota has reached its maximum and the value, the same way it is already shown in ceph.log and the monitor log.

Version-Release number of selected component (if applicable):
ceph-mon-10.2.7-28.el7cp.x86_64
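The monitor already logs exactly which quota was hit (see the ceph.log lines in the reproducer below). A minimal sketch of how "ceph health detail" could render that same detail, assuming a hypothetical helper that is given the pool's quota settings and current usage counters (function and parameter names are illustrative, not actual Ceph code):

```python
def quota_full_detail(pool_name, max_objects, max_bytes, num_objects, num_bytes):
    """Build a health-detail message naming the quota that was reached,
    mirroring the format the monitor already writes to ceph.log.
    Hypothetical helper: a quota value of 0 means 'no quota set'."""
    reasons = []
    if max_objects and num_objects >= max_objects:
        reasons.append("max_objects: %d" % max_objects)
    if max_bytes and num_bytes >= max_bytes:
        # ceph.log reports the byte quota in MiB, e.g. "max_bytes: 1024M"
        reasons.append("max_bytes: %dM" % (max_bytes // (1024 * 1024)))
    if not reasons:
        return None  # pool is not at any quota limit
    return "pool '%s' is full (reached quota's %s)" % (pool_name, ", ".join(reasons))
```

For the reproducer below, quota_full_detail('testquotas', 100, 1073741824, 100, 100 * 1024 * 1024) yields the same text the monitor logs: "pool 'testquotas' is full (reached quota's max_objects: 100)".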
"pool is full" behavior reproducer

- create a pool with a quota of 100 objects and 1 GB of space

# ceph osd pool create testquotas 64 64 replicated
# ceph osd pool set-quota testquotas max_objects 100
# ceph osd pool set-quota testquotas max_bytes 1073741824
# ceph osd dump | grep pool
# ceph osd dump | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 23 flags hashpspool max_bytes 1073741824 max_objects 100 stripe_width 0

- create test files

# for i in `seq 1 1024`; do dd if=/dev/zero of=file$i.img count=1 bs=1M; done

- test reaching the max_objects quota with exactly 100 objects

# for i in `seq 1 100`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done
# ceph -s
    cluster e5ae5b4b-da3c-49ee-8ec6-ae3cc57d150f
     health HEALTH_WARN
            pool 'testquotas' is full
     monmap e1: 3 mons at {mons-0=10.74.156.122:6789/0,mons-1=10.74.156.47:6789/0,mons-2=10.74.156.56:6789/0}
            election epoch 8, quorum 0,1,2 mons-1,mons-2,mons-0
     osdmap e26: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v10177: 176 pgs, 8 pools, 100 MB data, 271 objects
            421 MB used, 284 GB / 284 GB avail
                 176 active+clean

No warning message in ceph.log or the monitor log other than that the full state was reached:

ceph.log:
2017-08-14 08:49:00.624109 mon.0 10.74.156.47:6789/0 298 : cluster [WRN] pool 'testquotas' is full (reached quota's max_objects: 100)

ceph-mon.mons-1.log:
2017-08-14 08:49:00.624107 7f194fc14700  0 log_channel(cluster) log [WRN] : pool 'testquotas' is full (reached quota's max_objects: 100)

# ceph osd dump | grep testquotas; ceph df | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 26 flags hashpspool,full max_bytes 1073741824 max_objects 100 stripe_width 0
    NAME           ID     USED        %USED     MAX AVAIL     OBJECTS
    testquotas     7      102400k     0.10      97081M        100

- the command uploading further objects hangs, paused

[root@mons-1 tmp]# for i in `seq 100 110`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done
2017-08-14 08:49:43.328420 7fbb5c69f8c0  0 client.4303.objecter  FULL, paused modify  0x561a1314e710  tid 0

- increase the pool quota for max_objects

# ceph osd pool set-quota testquotas max_objects 10000
set-quota max_objects = 10000 for pool testquotas

ceph.log/ceph-mon.mons-1.log:
2017-08-14 08:52:55.638024 mon.0 10.74.156.47:6789/0 363 : cluster [INF] pool 'testquotas' no longer full; removing FULL flag

- the previous command gets unpaused and finishes successfully, without needing a restart

[root@mons-1 tmp]# for i in `seq 100 110`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done
2017-08-14 08:49:43.328420 7fbb5c69f8c0  0 client.4303.objecter  FULL, paused modify  0x561a1314e710  tid 0
[root@mons-1 tmp]# ceph osd dump | grep testquotas; ceph df | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 28 flags hashpspool max_bytes 1073741824 max_objects 10000 stripe_width 0
    testquotas     7      110M        0.11      97071M        110

- test reaching the max_bytes quota

[root@mons-1 tmp]# for i in `seq 110 1024`; do rados -p testquotas put object$i /var/log/ceph/file$i.img; done

- again, no other warning than that the full state was reached

2017-08-14 08:55:15.651778 mon.0 10.74.156.47:6789/0 424 : cluster [WRN] pool 'testquotas' is full (reached quota's max_bytes: 1024M)

# ceph osd dump | grep testquotas; ceph df | grep testquotas
pool 7 'testquotas' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 29 flags hashpspool,full max_bytes 1073741824 max_objects 10000 stripe_width 0
    testquotas     7      1024M       1.05      96046M        1024

# ceph -s
    cluster e5ae5b4b-da3c-49ee-8ec6-ae3cc57d150f
     health HEALTH_WARN
            pool 'testquotas' is full
     monmap e1: 3 mons at {mons-0=10.74.156.122:6789/0,mons-1=10.74.156.47:6789/0,mons-2=10.74.156.56:6789/0}
            election epoch 8, quorum 0,1,2 mons-1,mons-2,mons-0
     osdmap e29: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v10287: 176 pgs, 8 pools, 1024 MB data, 1195 objects
            3205 MB used, 281 GB / 284 GB avail
                 176 active+clean

# ceph health detail
HEALTH_WARN pool 'testquotas' is full
pool 'testquotas' is full

- uploading more objects pauses the command again

# for i in `seq 1 24`; do rados -p testquotas put objecta$i /var/log/ceph/file$i.img; done
2017-08-14 08:56:31.671106 7f089d4498c0  0 client.4777.objecter  FULL, paused modify  0x558fccfc5500  tid 0

2017-08-14 08:55:45.197823 mon.0 10.74.156.47:6789/0 436 : cluster [INF] HEALTH_WARN; pool 'testquotas' is full
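The reproducer shows the FULL flag being set the moment usage reaches either limit (exactly 100 objects against a 100-object quota, exactly 1024M against max_bytes) and being removed as soon as the quota is raised above current usage. That threshold behavior can be sketched as below; this is an illustrative model of the observed behavior, not the actual monitor code:

```python
def pool_is_full(max_objects, max_bytes, num_objects, num_bytes):
    """Return True when a pool should carry the 'full' flag.
    Models the behavior observed in the reproducer: full as soon as
    usage reaches either quota; a quota of 0 means unlimited."""
    if max_objects and num_objects >= max_objects:
        return True
    if max_bytes and num_bytes >= max_bytes:
        return True
    return False
```

With the reproducer's numbers: 100 objects against max_objects=100 is full; after raising max_objects to 10000 the same usage is no longer full (matching the "removing FULL flag" log line); 1024M of data against max_bytes=1073741824 is full again.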
Updating the QA Contact to Hemant. Hemant will reroute this to the appropriate QE Associate.

Regards,
Giri