Description of problem:
The cluster is displaying a health warning stating that PGs are in the backfill_toofull state, even though no OSD is more than about 55% full and the configured backfillfull_ratio is 0.9.

# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
3 hdd 0.40619 1.00000 416 GiB 212 GiB 196 GiB 178 MiB 1.3 GiB 204 GiB 50.90 1.22 55 up
6 hdd 0.40619 1.00000 416 GiB 136 GiB 120 GiB 50 MiB 923 MiB 280 GiB 32.65 0.78 27 up
10 hdd 0.40619 1.00000 416 GiB 157 GiB 141 GiB 41 MiB 1.0 GiB 259 GiB 37.86 0.91 33 up
15 hdd 0.40619 1.00000 416 GiB 212 GiB 196 GiB 18 MiB 1.3 GiB 204 GiB 50.87 1.22 51 up
19 hdd 0.40619 1.00000 416 GiB 135 GiB 119 GiB 232 MiB 1.0 GiB 281 GiB 32.52 0.78 35 up
23 hdd 0.40619 1.00000 416 GiB 125 GiB 109 GiB 33 KiB 683 MiB 291 GiB 30.09 0.72 34 up
27 hdd 0.40619 1.00000 416 GiB 191 GiB 175 GiB 2.9 MiB 1.2 GiB 225 GiB 45.87 1.10 49 up
31 hdd 0.40619 1.00000 416 GiB 158 GiB 142 GiB 222 MiB 1.2 GiB 258 GiB 37.88 0.91 45 up
35 hdd 0.40619 1.00000 416 GiB 227 GiB 211 GiB 9.6 MiB 1.9 GiB 189 GiB 54.56 1.30 52 up
0 hdd 0.40619 1.00000 416 GiB 146 GiB 130 GiB 274 MiB 1.1 GiB 270 GiB 35.16 0.84 39 up
4 hdd 0.40619 1.00000 416 GiB 179 GiB 163 GiB 117 MiB 1.2 GiB 237 GiB 43.01 1.03 49 up
8 hdd 0.40619 1.00000 416 GiB 136 GiB 120 GiB 153 MiB 998 MiB 280 GiB 32.61 0.78 39 up
12 hdd 0.40619 1.00000 416 GiB 168 GiB 152 GiB 1 KiB 951 MiB 248 GiB 40.45 0.97 37 up
16 hdd 0.40619 1.00000 416 GiB 169 GiB 153 GiB 31 MiB 1.1 GiB 247 GiB 40.69 0.97 43 up
20 hdd 0.40619 1.00000 416 GiB 219 GiB 203 GiB 29 MiB 1.7 GiB 197 GiB 52.53 1.26 54 up
24 hdd 0.40619 1.00000 416 GiB 190 GiB 174 GiB 101 MiB 1.1 GiB 226 GiB 45.60 1.09 45 up
28 hdd 0.40619 1.00000 416 GiB 125 GiB 109 GiB 90 MiB 776 MiB 291 GiB 30.14 0.72 25 up
32 hdd 0.40619 1.00000 416 GiB 212 GiB 196 GiB 186 MiB 1.4 GiB 204 GiB 50.91 1.22 52 up
2 hdd 0.40619 1.00000 416 GiB 201 GiB 185 GiB 81 MiB 1.3 GiB 215 GiB 48.32 1.16 52 up
7 hdd 0.40619 1.00000 416 GiB 136 GiB 120 GiB 4.7 MiB 851 MiB 280 GiB 32.69 0.78 38 up
11 hdd 0.40619 1.00000 416 GiB 230 GiB 214 GiB 1 KiB 2.1 GiB 186 GiB 55.26 1.32 51 up
14 hdd 0.40619 1.00000 416 GiB 125 GiB 109 GiB 5.1 MiB 799 MiB 291 GiB 29.94 0.72 25 up
18 hdd 0.40619 1.00000 416 GiB 190 GiB 174 GiB 117 MiB 1.3 GiB 226 GiB 45.56 1.09 52 up
22 hdd 0.40619 1.00000 416 GiB 125 GiB 109 GiB 77 MiB 822 MiB 291 GiB 30.00 0.72 34 up
26 hdd 0.40619 1.00000 416 GiB 201 GiB 185 GiB 21 MiB 1.1 GiB 215 GiB 48.21 1.15 44 up
30 hdd 0.40619 1.00000 416 GiB 169 GiB 153 GiB 190 MiB 1.2 GiB 247 GiB 40.61 0.97 45 up
34 hdd 0.40619 1.00000 416 GiB 179 GiB 163 GiB 53 MiB 1.2 GiB 237 GiB 43.10 1.03 48 up
1 hdd 0.40619 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down
5 hdd 0.40619 1.00000 416 GiB 205 GiB 189 GiB 112 MiB 1.4 GiB 211 GiB 49.33 1.18 56 up
9 hdd 0.40619 1.00000 416 GiB 127 GiB 111 GiB 64 MiB 1.1 GiB 289 GiB 30.43 0.73 44 up
13 hdd 0.40619 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down
17 hdd 0.40619 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down
21 hdd 0.40619 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down
25 hdd 0.40619 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down
29 hdd 0.40619 1.00000 416 GiB 194 GiB 178 GiB 266 MiB 1.5 GiB 222 GiB 46.58 1.11 56 up
33 hdd 0.40619 1.00000 416 GiB 216 GiB 200 GiB 56 MiB 1.6 GiB 200 GiB 51.84 1.24 58 up
TOTAL 13 TiB 5.3 TiB 4.8 TiB 2.7 GiB 37 GiB 7.3 TiB 41.81

[root@argo012 ~]# ceph pg dump | grep backfill_toofull
6.fa 12033 0 12033 0 0 11661737984 0 0 5126 5126 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.629659+0000 15495'32562 15495:112019 [2,33,28,10] 2 [2,NONE,28,10] 2 15282'30407 2023-08-09T22:37:38.489306+0000 13086'26911 2023-08-08T21:09:30.273125+0000 0 99 queued for scrub 22043 0
6.f2 12172 0 12172 0 0 11571953664 0 0 5174 5174 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.368370+0000 15495'33236 15495:114972 [5,22,3,8] 5 [NONE,22,3,8] 22 15271'29802 2023-08-09T15:59:35.735286+0000 15271'29802 2023-08-09T15:59:35.735286+0000 0 698 queued for scrub 21064 0
6.e8 12182 0 12182 0 0 11750047744 0 0 5095 5095 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.811080+0000 15495'33273 15495:111295 [30,27,16,5] 30 [30,27,16,NONE] 30 15273'29812 2023-08-09T16:14:33.353209+0000 11370'27174 2023-08-08T12:02:03.840314+0000 0 106 queued for scrub 21064 0
6.d1 12113 0 12113 0 0 11579490304 0 0 4690 4690 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.973076+0000 15495'33014 15495:77594 [22,10,5,32] 22 [22,10,NONE,32] 22 15281'30514 2023-08-09T21:04:39.171251+0000 11843'26759 2023-08-08T13:36:55.184941+0000 0 96 queued for scrub 21891 0
6.c1 12105 0 12105 0 0 11783143424 0 0 4456 4456 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.175027+0000 15495'32053 15495:86571 [5,11,15,20] 5 [NONE,11,15,20] 11 15234'28481 2023-08-09T14:40:23.005390+0000 2928'25276 2023-08-05T15:47:01.820495+0000 0 111 queued for scrub 20770 0
6.a7 12164 0 12164 0 0 11856150528 0 0 4176 4176 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.441079+0000 15495'31787 15495:78543 [11,23,24,33] 11 [11,23,24,NONE] 11 14672'27334 2023-08-09T07:59:38.108345+0000 14672'27334 2023-08-09T07:59:38.108345+0000 0 327 queued for scrub 9874 0
6.4 12260 0 12260 0 0 11732123648 0 0 3794 3794 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.972959+0000 15495'31479 15495:105034 [5,22,12,3] 5 [NONE,22,12,3] 22 14835'27336 2023-08-09T09:48:34.527402+0000 7419'25161 2023-08-06T00:49:04.630245+0000 0 54 queued for scrub 10031 0
6.25 12100 0 12100 0 0 11746050048 0 0 4028 4028 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.683270+0000 15495'31974 15495:117496 [14,35,24,33] 14 [14,35,24,NONE] 14 15285'30354 2023-08-10T01:32:02.219802+0000 15285'30354 2023-08-10T01:32:02.219802+0000 0 597 queued for scrub 22697 0
6.27 12162 0 12162 0 0 11862376448 0 0 4170 4170 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.128745+0000 15495'31781 15495:78538 [11,23,24,33] 11 [11,23,24,NONE] 11 14672'27334 2023-08-09T07:59:38.108345+0000 14672'27334 2023-08-09T07:59:38.108345+0000 0 327 queued for scrub 9874 0
6.2f 12055 0 12055 0 0 11575820288 0 0 4190 4190 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.230112+0000 15495'32299 15495:83449 [7,5,10,28] 7 [7,NONE,10,28] 7 15288'30877 2023-08-10T02:48:00.901927+0000 15288'30877 2023-08-10T02:48:00.901927+0000 0 532 queued for scrub 22821 0
6.41 12113 0 12113 0 0 11782094848 0 0 4469 4469 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.208852+0000 15495'32066 15495:86585 [5,11,15,20] 5 [NONE,11,15,20] 11 15234'28481 2023-08-09T14:40:23.005390+0000 2928'25276 2023-08-05T15:47:01.820495+0000 0 111 queued for scrub 20770 0
6.51 12118 0 12118 0 0 11602362368 0 0 4698 4698 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.671645+0000 15495'33022 15495:77603 [22,10,5,32] 22 [22,10,NONE,32] 22 15281'30514 2023-08-09T21:04:39.171251+0000 11843'26759 2023-08-08T13:36:55.184941+0000 0 96 queued for scrub 21891 0
6.68 12183 0 12183 0 0 11748999168 0 0 5097 5097 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.830281+0000 15495'33275 15495:111298 [30,27,16,5] 30 [30,27,16,NONE] 30 15273'29812 2023-08-09T16:14:33.353209+0000 11370'27174 2023-08-08T12:02:03.840314+0000 0 106 queued for scrub 21064 0
6.72 12178 0 12178 0 0 11592007680 0 0 5179 5179 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:17.973003+0000 15495'33241 15495:114978 [5,22,3,8] 5 [NONE,22,3,8] 22 15271'29802 2023-08-09T15:59:35.735286+0000 15271'29802 2023-08-09T15:59:35.735286+0000 0 698 queued for scrub 21064 0
6.7a 12027 0 12027 0 0 11654266880 0 0 5120 5120 active+undersized+degraded+remapped+backfill_toofull 2023-08-10T09:20:18.451102+0000 15495'32556 15495:112014 [2,33,28,10] 2 [2,NONE,28,10] 2 15282'30407 2023-08-09T22:37:38.489306+0000 13086'26911 2023-08-08T21:09:30.273125+0000 0 99 queued for scrub 22043 0

[root@argo012 ~]# ceph -s
  cluster:
    id:     66070a80-2f84-11ee-bc2c-0cc47af3ea56
    health: HEALTH_WARN
            Low space hindering backfill (add storage if this doesn't resolve itself): 15 pgs backfill_toofull
            Degraded data redundancy: 1603314/12392491 objects degraded (12.938%), 140 pgs degraded, 62 pgs undersized

  services:
    mon: 3 daemons, quorum argo012,argo013,argo014 (age 4h)
    mgr: argo013.akdhka(active, since 4h), standbys: argo014.xfhnzv, argo012.odttqx
    osd: 36 osds: 31 up (since 10m), 31 in (since 54s); 148 remapped pgs
    rgw: 4 daemons active (4 hosts, 1 zones)

  data:
    pools:   7 pools, 417 pgs
    objects: 3.10M objects, 2.7 TiB
    usage:   5.3 TiB used, 7.3 TiB / 13 TiB avail
    pgs:     1603314/12392491 objects degraded (12.938%)
             187344/12392491 objects misplaced (1.512%)
             269 active+clean
             64  active+undersized+degraded+remapped+backfilling
             61  active+undersized+degraded+remapped+backfill_wait
             15  active+undersized+degraded+remapped+backfill_toofull
             7   active+remapped+backfilling
             1   active+remapped+backfill_wait

  io:
    client:   18 KiB/s rd, 3.8 MiB/s wr, 25 op/s rd, 43 op/s wr
    recovery: 204 MiB/s, 224 objects/s

  progress:
    Global Recovery Event (10m)
      [==================..........]
(remaining: 5m)

[root@argo012 ~]# ceph df
--- RAW STORAGE ---
CLASS  SIZE    AVAIL    USED     RAW USED  %RAW USED
hdd    13 TiB  7.3 TiB  5.3 TiB  5.3 TiB   41.81
TOTAL  13 TiB  7.3 TiB  5.3 TiB  5.3 TiB   41.81

--- POOLS ---
POOL                       ID  PGS  STORED    OBJECTS  USED     %USED  MAX AVAIL
.mgr                        1    1  11 MiB          3  32 MiB       0    1.9 TiB
.rgw.root                   2   32  1.3 KiB         4  48 KiB       0    1.9 TiB
default.rgw.log             3   32  4.3 KiB       209  464 KiB      0    2.0 TiB
default.rgw.control         4   32  0 B             9  0 B          0    1.9 TiB
default.rgw.meta            5   32  9.8 KiB        18  233 KiB      0    1.9 TiB
ec22-pool                   6  256  3.2 TiB     3.10M  5.5 TiB  48.58    3.3 TiB
default.rgw.buckets.index   7   32  1020 MiB       90  2.9 GiB   0.05    2.0 TiB

Version-Release number of selected component (if applicable):

[root@argo012 ~]# ceph health detail
HEALTH_WARN Low space hindering backfill (add storage if this doesn't resolve itself): 15 pgs backfill_toofull; Degraded data redundancy: 1600982/12392683 objects degraded (12.919%), 140 pgs degraded, 140 pgs undersized
[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 15 pgs backfill_toofull
    pg 6.4 is active+undersized+degraded+remapped+backfill_toofull, acting [NONE,22,12,3]
    pg 6.25 is active+undersized+degraded+remapped+backfill_toofull, acting [14,35,24,NONE]
    pg 6.27 is active+undersized+degraded+remapped+backfill_toofull, acting [11,23,24,NONE]
    pg 6.2f is active+undersized+degraded+remapped+backfill_toofull, acting [7,NONE,10,28]
    pg 6.41 is active+undersized+degraded+remapped+backfill_toofull, acting [NONE,11,15,20]
    pg 6.51 is active+undersized+degraded+remapped+backfill_toofull, acting [22,10,NONE,32]
    pg 6.68 is active+undersized+degraded+remapped+backfill_toofull, acting [30,27,16,NONE]
    pg 6.72 is active+undersized+degraded+remapped+backfill_toofull, acting [NONE,22,3,8]
    pg 6.7a is active+undersized+degraded+remapped+backfill_toofull, acting [2,NONE,28,10]
    pg 6.a7 is active+undersized+degraded+remapped+backfill_toofull, acting [11,23,24,NONE]
    pg 6.c1 is active+undersized+degraded+remapped+backfill_toofull, acting [NONE,11,15,20]
    pg 6.d1 is active+undersized+degraded+remapped+backfill_toofull, acting [22,10,NONE,32]
    pg 6.e8 is active+undersized+degraded+remapped+backfill_toofull, acting [30,27,16,NONE]
    pg 6.f2 is active+undersized+degraded+remapped+backfill_toofull, acting [NONE,22,3,8]
    pg 6.fa is active+undersized+degraded+remapped+backfill_toofull, acting [2,NONE,28,10]

# ceph osd dump
epoch 15495
fsid 66070a80-2f84-11ee-bc2c-0cc47af3ea56
created 2023-07-31T09:27:20.333211+0000
modified 2023-08-10T09:08:48.391487+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 169
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client luminous
require_osd_release quincy
stretch_mode_enabled false
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 61 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 113 lfor 0/0/109 flags hashpspool stripe_width 0 application rgw
pool 3 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 198 lfor 0/0/109 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 113 lfor 0/0/111 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 113 lfor 0/0/111 flags hashpspool stripe_width 0 pg_autoscale_bias 4 application rgw
pool 6 'ec22-pool' erasure profile ec22 size 4 min_size 3 crush_rule 1 object_hash rjenkins pg_num 256 pgp_num 128 pg_num_target 512 pgp_num_target 512 autoscale_mode on last_change 15465 lfor 0/15233/15465 flags hashpspool stripe_width 8192 application rgw
pool 7 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 418 lfor 0/0/416 flags hashpspool stripe_width 0 pg_autoscale_bias 4 application rgw
max_osd 36
osd.0 up in weight 1 up_from 15372 up_thru 15492 down_at 15364 last_clean_interval [329,15363) [v2:10.8.128.213:6800/1960352398,v1:10.8.128.213:6801/1960352398] [v2:10.8.128.213:6802/1960352398,v1:10.8.128.213:6803/1960352398] exists,up 6bfdc645-9cf9-4c5f-8227-621468bcd219
osd.1 down out weight 0 up_from 15421 up_thru 15421 down_at 15454 last_clean_interval [12539,15415) [v2:10.8.128.217:6800/2045021352,v1:10.8.128.217:6801/2045021352] [v2:10.8.128.217:6802/2045021352,v1:10.8.128.217:6803/2045021352] autoout,exists b631d67e-5687-472a-81f2-8060f8ac7f11
osd.2 up in weight 1 up_from 15400 up_thru 15492 down_at 15391 last_clean_interval [356,15390) [v2:10.8.128.214:6824/1267294927,v1:10.8.128.214:6825/1267294927] [v2:10.8.128.214:6826/1267294927,v1:10.8.128.214:6827/1267294927] exists,up 2e8bfd8a-c71f-4fea-9d89-de69f96d87a5
osd.3 up in weight 1 up_from 15305 up_thru 15491 down_at 15298 last_clean_interval [265,15297) [v2:10.8.128.212:6808/1674113188,v1:10.8.128.212:6809/1674113188] [v2:10.8.128.212:6842/1674113188,v1:10.8.128.212:6843/1674113188] exists,up ae682a2e-7c22-40c1-b0ca-755a30af6e8b
osd.4 up in weight 1 up_from 15364 up_thru 15492 down_at 15356 last_clean_interval [320,15355) [v2:10.8.128.213:6856/2877006602,v1:10.8.128.213:6857/2877006602] [v2:10.8.128.213:6858/2877006602,v1:10.8.128.213:6859/2877006602] exists,up 19ec550a-4b5a-40d3-a9c3-0683600bb820
osd.5 up in weight 1 up_from 15424 up_thru 15491 down_at 15418 last_clean_interval [12572,15417) [v2:10.8.128.217:6856/3439258045,v1:10.8.128.217:6857/3439258045] [v2:10.8.128.217:6858/3439258045,v1:10.8.128.217:6859/3439258045] exists,up 014ff108-8c85-46c6-9f6e-0c720f84eeac
osd.6 up in weight 1 up_from 15334 up_thru 15487 down_at 15328 last_clean_interval [296,15327) [v2:10.8.128.212:6864/2352429355,v1:10.8.128.212:6865/2352429355] [v2:10.8.128.212:6866/2352429355,v1:10.8.128.212:6867/2352429355] exists,up ece9252a-ef39-4af6-8b47-1d2088971c65
osd.7 up in weight 1 up_from 15389 up_thru 15494 down_at 15381 last_clean_interval [347,15380) [v2:10.8.128.214:6864/2436227132,v1:10.8.128.214:6865/2436227132] [v2:10.8.128.214:6866/2436227132,v1:10.8.128.214:6867/2436227132] exists,up 6ba5e8d2-72ed-4767-9b82-846ca0c8e3b3
osd.8 up in weight 1 up_from 15372 up_thru 15487 down_at 15369 last_clean_interval [333,15368) [v2:10.8.128.213:6864/2077144181,v1:10.8.128.213:6865/2077144181] [v2:10.8.128.213:6866/2077144181,v1:10.8.128.213:6867/2077144181] exists,up 7c5978b0-7d50-42d0-8c6d-8ea0d05c106d
osd.9 up in weight 1 up_from 15442 up_thru 15491 down_at 15435 last_clean_interval [12576,15434) [v2:10.8.128.217:6864/1859588829,v1:10.8.128.217:6865/1859588829] [v2:10.8.128.217:6866/1859588829,v1:10.8.128.217:6867/1859588829] exists,up 1d9aabdb-8286-44c3-8fc4-e1614e1bafdc
osd.10 up in weight 1 up_from 15301 up_thru 15487 down_at 15296 last_clean_interval [260,15295) [v2:10.8.128.212:6800/152068372,v1:10.8.128.212:6801/152068372] [v2:10.8.128.212:6802/152068372,v1:10.8.128.212:6803/152068372] exists,up fad0e532-cf28-4380-b7d7-badecc7a5507
osd.11 up in weight 1 up_from 15411 up_thru 15492 down_at 15402 last_clean_interval [369,15401) [v2:10.8.128.214:6800/1821456158,v1:10.8.128.214:6801/1821456158] [v2:10.8.128.214:6802/1821456158,v1:10.8.128.214:6803/1821456158] exists,up b51fc3b0-bcb5-459d-add5-fbd3ef38e996
osd.12 up in weight 1 up_from 15339 up_thru 15492 down_at 15336 last_clean_interval [308,15335) [v2:10.8.128.213:6808/3030502462,v1:10.8.128.213:6809/3030502462] [v2:10.8.128.213:6810/3030502462,v1:10.8.128.213:6811/3030502462] exists,up e4fbca6f-2cfd-4adf-aa76-934f1ee9b4de
osd.13 down out weight 0 up_from 15433 up_thru 15433 down_at 15456 last_clean_interval [12543,15423) [v2:10.8.128.217:6808/4066800377,v1:10.8.128.217:6809/4066800377] [v2:10.8.128.217:6810/4066800377,v1:10.8.128.217:6811/4066800377] autoout,exists 127b3545-1c25-4843-b8c0-58129b975744
osd.14 up in weight 1 up_from 15384 up_thru 15492 down_at 15376 last_clean_interval [342,15375) [v2:10.8.128.214:6808/1415681408,v1:10.8.128.214:6809/1415681408] [v2:10.8.128.214:6810/1415681408,v1:10.8.128.214:6811/1415681408] exists,up 59a6f548-842e-4ebb-9436-5a115b7a51a2
osd.15 up in weight 1 up_from 15322 up_thru 15492 down_at 15316 last_clean_interval [283,15315) [v2:10.8.128.212:6810/2593498333,v1:10.8.128.212:6811/2593498333] [v2:10.8.128.212:6812/2593498333,v1:10.8.128.212:6813/2593498333] exists,up 7c1d296d-f101-4419-a5dd-8c82d43497cf
osd.16 up in weight 1 up_from 15361 up_thru 15492 down_at 15348 last_clean_interval [315,15347) [v2:10.8.128.213:6816/2615629254,v1:10.8.128.213:6817/2615629254] [v2:10.8.128.213:6818/2615629254,v1:10.8.128.213:6819/2615629254] exists,up fa0b85ef-5abd-47d0-bb8f-d24f1d293bb7
osd.17 down out weight 0 up_from 15445 up_thru 15445 down_at 15458 last_clean_interval [12547,15438) [v2:10.8.128.217:6816/1188944202,v1:10.8.128.217:6817/1188944202] [v2:10.8.128.217:6818/1188944202,v1:10.8.128.217:6819/1188944202] autoout,exists b80610ca-0ab1-4fa8-87fc-8901fbe0d51d
osd.18 up in weight 1 up_from 15414 up_thru 15479 down_at 15407 last_clean_interval [380,15406) [v2:10.8.128.214:6816/2467598827,v1:10.8.128.214:6817/2467598827] [v2:10.8.128.214:6818/2467598827,v1:10.8.128.214:6819/2467598827] exists,up f109dde3-809a-4341-a4c3-ae76bc94f7a6
osd.19 up in weight 1 up_from 15309 up_thru 15493 down_at 15302 last_clean_interval [270,15301) [v2:10.8.128.212:6818/3366057300,v1:10.8.128.212:6819/3366057300] [v2:10.8.128.212:6820/3366057300,v1:10.8.128.212:6821/3366057300] exists,up 0289665b-cc42-48f7-a55f-bb673799c9ca
osd.20 up in weight 1 up_from 15351 up_thru 15490 down_at 15341 last_clean_interval [307,15340) [v2:10.8.128.213:6824/876450911,v1:10.8.128.213:6825/876450911] [v2:10.8.128.213:6826/876450911,v1:10.8.128.213:6827/876450911] exists,up bfa91559-6140-444c-849e-2a65a404d3d1
osd.21 down out weight 0 up_from 15429 up_thru 15429 down_at 15460 last_clean_interval [12552,15420) [v2:10.8.128.217:6824/1099962364,v1:10.8.128.217:6825/1099962364] [v2:10.8.128.217:6826/1099962364,v1:10.8.128.217:6827/1099962364] autoout,exists 346f3adc-7505-44c4-9c09-704a6e5d7090
osd.22 up in weight 1 up_from 15408 up_thru 15492 down_at 15397 last_clean_interval [365,15396) [v2:10.8.128.214:6832/1443432180,v1:10.8.128.214:6833/1443432180] [v2:10.8.128.214:6834/1443432180,v1:10.8.128.214:6835/1443432180] exists,up 60b77f38-edd4-41d1-80d6-5c0ba4b20246
osd.23 up in weight 1 up_from 15314 up_thru 15491 down_at 15306 last_clean_interval [274,15305) [v2:10.8.128.212:6826/3174798235,v1:10.8.128.212:6827/3174798235] [v2:10.8.128.212:6828/3174798235,v1:10.8.128.212:6829/3174798235] exists,up a2e1020f-497b-42a0-9c69-424907dfc60f
osd.24 up in weight 1 up_from 15367 up_thru 15492 down_at 15358 last_clean_interval [325,15357) [v2:10.8.128.213:6832/872631187,v1:10.8.128.213:6833/872631187] [v2:10.8.128.213:6834/872631187,v1:10.8.128.213:6835/872631187] exists,up 7dccce66-a574-4622-8866-fa6a68648c45
osd.25 down out weight 0 up_from 15439 up_thru 15439 down_at 15462 last_clean_interval [12557,15430) [v2:10.8.128.217:6832/1872376923,v1:10.8.128.217:6833/1872376923] [v2:10.8.128.217:6834/1872376923,v1:10.8.128.217:6835/1872376923] autoout,exists d3a9bd68-adf9-4094-b621-aa864475b2d0
osd.26 up in weight 1 up_from 15379 up_thru 15470 down_at 15374 last_clean_interval [337,15373) [v2:10.8.128.214:6840/4160146302,v1:10.8.128.214:6841/4160146302] [v2:10.8.128.214:6842/4160146302,v1:10.8.128.214:6843/4160146302] exists,up 5149280c-7369-41d8-ae6a-7bd5c816dacb
osd.27 up in weight 1 up_from 15327 up_thru 15491 down_at 15319 last_clean_interval [288,15318) [v2:10.8.128.212:6834/89520191,v1:10.8.128.212:6835/89520191] [v2:10.8.128.212:6836/89520191,v1:10.8.128.212:6837/89520191] exists,up 5b4ce98c-03d1-4d59-9945-801f82b80ce0
osd.28 up in weight 1 up_from 15354 up_thru 15492 down_at 15343 last_clean_interval [311,15342) [v2:10.8.128.213:6840/2973682997,v1:10.8.128.213:6841/2973682997] [v2:10.8.128.213:6842/2973682997,v1:10.8.128.213:6843/2973682997] exists,up 8bc5f3c5-4767-4944-8084-b3f26c2d5ef6
osd.29 up in weight 1 up_from 15436 up_thru 15491 down_at 15426 last_clean_interval [12561,15425) [v2:10.8.128.217:6840/3417836056,v1:10.8.128.217:6841/3417836056] [v2:10.8.128.217:6842/3417836056,v1:10.8.128.217:6843/3417836056] exists,up 5f589ff9-1f75-4f61-a626-5bfced1bcb99
osd.30 up in weight 1 up_from 15405 up_thru 15492 down_at 15393 last_clean_interval [360,15392) [v2:10.8.128.214:6848/2405030462,v1:10.8.128.214:6849/2405030462] [v2:10.8.128.214:6850/2405030462,v1:10.8.128.214:6851/2405030462] exists,up d9a2b3fa-adce-44a7-8157-39dcb9ada8fb
osd.31 up in weight 1 up_from 15331 up_thru 15493 down_at 15324 last_clean_interval [292,15323) [v2:10.8.128.212:6850/2562535729,v1:10.8.128.212:6851/2562535729] [v2:10.8.128.212:6852/2562535729,v1:10.8.128.212:6853/2562535729] exists,up ca0f750a-ba77-4107-ba10-ba6c389d5fc0
osd.32 up in weight 1 up_from 15346 up_thru 15492 down_at 15338 last_clean_interval [304,15337) [v2:10.8.128.213:6848/1342419880,v1:10.8.128.213:6849/1342419880] [v2:10.8.128.213:6850/1342419880,v1:10.8.128.213:6851/1342419880] exists,up 21287e7b-9429-472a-bdcb-9dc5e7aa3f47
osd.33 up in weight 1 up_from 15448 up_thru 15491 down_at 15441 last_clean_interval [12567,15440) [v2:10.8.128.217:6848/468425815,v1:10.8.128.217:6849/468425815] [v2:10.8.128.217:6850/468425815,v1:10.8.128.217:6851/468425815] exists,up d640f7a8-5ece-4942-b436-41d3446b4583
osd.34 up in weight 1 up_from 15395 up_thru 15480 down_at 15386 last_clean_interval [352,15385) [v2:10.8.128.214:6856/13339455,v1:10.8.128.214:6857/13339455] [v2:10.8.128.214:6858/13339455,v1:10.8.128.214:6859/13339455] exists,up
5a6db3ff-d874-4685-9563-d71746061363
osd.35 up in weight 1 up_from 15319 up_thru 15491 down_at 15311 last_clean_interval [562,15310) [v2:10.8.128.212:6848/2471831620,v1:10.8.128.212:6849/2471831620] [v2:10.8.128.212:6858/2471831620,v1:10.8.128.212:6859/2471831620] exists,up 5c883920-b9bb-44e7-bec5-8ca258d29722

How reproducible:
1/1

Steps to Reproduce:
1. Deploy an RHCS cluster with 4 hosts and 9 OSDs per host.
2. Fill the cluster to around 45% and keep I/O running.
3. Bring down a few OSDs (5) on one host. After 10 minutes, the down OSDs are marked "out".
4. Observe that once they are marked out, the PGs remap onto other OSDs on the host.
5. Once the remap has started, PGs are stuck in backfill_toofull, although no OSD has crossed the backfillfull ratio.

Actual results:
PGs are wrongly marked backfill_toofull.

Expected results:
No PGs are stuck in the backfill_toofull state until OSDs actually cross the backfillfull ratio.

Additional info:
The issue looks similar to tracker https://tracker.ceph.com/issues/62248
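
For anyone triaging this, the mismatch between the warning and the actual utilization can be checked mechanically. The sketch below is not part of the original report: `parse_osd_df` and `osds_over_ratio` are hypothetical helper names, and the parser assumes the plain-text `ceph osd df` layout shown in this report (20 whitespace-separated tokens per OSD row, with %USE fourth from the end). The three sample rows and the 0.9 threshold are copied from the `ceph osd df` and `ceph osd dump` output above.

```python
# Threshold reported by `ceph osd dump` in this bug report.
BACKFILLFULL_RATIO = 0.9

# A few rows copied from the `ceph osd df` output in this report
# (the two fullest OSDs and one down OSD).
SAMPLE_ROWS = """\
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
35 hdd 0.40619 1.00000 416 GiB 227 GiB 211 GiB 9.6 MiB 1.9 GiB 189 GiB 54.56 1.30 52 up
11 hdd 0.40619 1.00000 416 GiB 230 GiB 214 GiB 1 KiB 2.1 GiB 186 GiB 55.26 1.32 51 up
1 hdd 0.40619 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down
"""


def parse_osd_df(text):
    """Parse plain-text `ceph osd df` rows into dicts (hypothetical helper).

    Assumes each OSD row splits into exactly 20 tokens; the header and
    TOTAL lines do not start with a digit and are skipped.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) != 20 or not tokens[0].isdigit():
            continue
        rows.append({
            "id": int(tokens[0]),
            "pct_use": float(tokens[-4]),  # %USE column
            "status": tokens[-1],          # up / down
        })
    return rows


def osds_over_ratio(rows, ratio):
    """Return IDs of up OSDs whose utilization is at or above `ratio`."""
    return [r["id"] for r in rows
            if r["status"] == "up" and r["pct_use"] / 100.0 >= ratio]


rows = parse_osd_df(SAMPLE_ROWS)
fullest = max(rows, key=lambda r: r["pct_use"])
over = osds_over_ratio(rows, BACKFILLFULL_RATIO)
print(f"fullest OSD: osd.{fullest['id']} at {fullest['pct_use']}%")
print(f"OSDs at or above backfillfull_ratio {BACKFILLFULL_RATIO}: {over}")
```

Run against the full table instead of the sample, this should list no OSD at or above the threshold, which is the inconsistency this report describes. It only compares current %USE against the ratio; it does not model any additional space accounting the OSDs may apply when reserving backfill.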