Description of problem:
The scenario is very similar to Red Hat BZ#1560802. If one or more OSDs are FULL, I/Os are expected to fail for all pools that have any of these OSDs in their acting set. However, any pool that does not contain any of the FULL OSDs in its acting set should still be able to serve I/Os and should NOT be marked as 'full (no space)' in the health warning.

Version-Release number of selected component (if applicable):
ceph version 17.2.6-70.el9cp quincy (stable)
ceph version 16.2.10-175.el8cp (8c714c8184d123e241e34f9c0f6abcc1d1858e1c) pacific (stable)

How reproducible:
5/5

Steps to Reproduce:
1. Create a replicated pool with a single PG (single PG only for convenience)
2. Fetch the acting set of the created pool (say these OSDs are x, y, z)
3. Re-weight the OSDs (x, y, z) in that acting set to 0 so they are not included when subsequent pools are created
4. Create another single-PG replicated pool and a single-PG EC pool (a single PG is not necessary, just convenient)
5. Re-weight the 0-weight OSDs (x, y, z) back to 1
6. Fetch the acting sets of the new replicated and EC pools and ensure they do not contain any of the OSDs from the first replicated pool's acting set
7. Either decrease the nearfull, backfillfull, and full ratios to 0.6, 0.6, and 0.7 respectively, or continue with the standard values of 0.85, 0.9, and 0.95
8. Write data to replicated pool 1 using any tool until full capacity is reached
9. Once OSDs (x, y, z) are full, the cluster health warning reports these OSDs as full and also reports every pool as full, including the pools that do not have these OSDs in their acting set

Actual results:
All pools in the cluster are marked full when one or more OSDs are full.

Expected results:
Pools with full OSDs in their acting set should not serve/accept I/Os and should be marked as full, but pools that do not contain any full OSDs should not be marked full and should continue to serve I/O.

Additional info:
cephadm -v shell -- ceph osd pool create pool_full_osds 1 1
cephadm -v shell -- sudo ceph osd pool application enable pool_full_osds rados
cephadm -v shell -- ceph osd pool set pool_full_osds pg_autoscale_mode off
cephadm -v shell -- ceph pg map 21.0 -f json
cephadm -v shell -- ceph osd reweight osd.9 0
cephadm -v shell -- ceph osd reweight osd.2 0
cephadm -v shell -- ceph osd reweight osd.4 0
cephadm -v shell -- ceph osd pool create re_pool_test 1 1
cephadm -v shell -- sudo ceph osd pool application enable re_pool_test rados
cephadm -v shell -- ceph osd pool set re_pool_test pg_autoscale_mode off
cephadm -v shell -- ceph osd dump -f json
cephadm -v shell -- ceph pg map 22.0 -f json
cephadm -v shell -- ceph osd erasure-code-profile set ecprofile_test_ec_pool crush-failure-domain=osd k=4 m=2 plugin=jerasure
cephadm -v shell -- ceph osd pool create test_ec_pool 1 1 erasure ecprofile_test_ec_pool
cephadm -v shell -- sudo ceph osd pool application enable test_ec_pool rados
cephadm -v shell -- ceph osd pool set test_ec_pool pg_autoscale_mode off
cephadm -v shell -- ceph pg map 23.0 -f json

Acting set of Pool 1: [9, 2, 4] | Acting set of Pool 2: [3, 7, 5] | Acting set of Pool 3: [10, 8, 14, 6, 12, 1]

cephadm -v shell -- ceph osd set-nearfull-ratio 0.65
cephadm -v shell -- ceph osd set-backfillfull-ratio 0.69
cephadm -v shell -- ceph osd set-full-ratio 0.70
cephadm -v shell -- ceph osd reweight osd.9 1
cephadm -v shell -- ceph osd reweight osd.2 1
cephadm -v shell -- ceph osd reweight osd.4 1
rados bench -p pool_full_osds 200 write -b 16384KB --no-cleanup --max-objects 1120 on 10.0.208.231 timeout 600
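For reference, the acting sets listed above can be cross-checked with a short loop over the three PG IDs mapped earlier (a minimal sketch, assuming jq is available on the host running cephadm):

for pg in 21.0 22.0 23.0; do
    # Print the acting set reported by 'ceph pg map' for each pool's only PG
    echo -n "pg ${pg} acting: "
    cephadm shell -- ceph pg map ${pg} -f json | jq -c '.acting'
done

The sets should be pairwise disjoint before proceeding, i.e. none of osd.9, osd.2, osd.4 may appear for 22.0 or 23.0.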
cephadm -v shell -- ceph health detail
HEALTH_ERR 3 full osd(s); 10 pool(s) full
[ERR] OSD_FULL: 3 full osd(s)
    osd.2 is full
    osd.4 is full
    osd.9 is full
[WRN] POOL_FULL: 10 pool(s) full
    pool 'device_health_metrics' is full (no space)
    pool 'cephfs.cephfs.meta' is full (no space)
    pool 'cephfs.cephfs.data' is full (no space)
    pool '.rgw.root' is full (no space)
    pool 'default.rgw.log' is full (no space)
    pool 'default.rgw.control' is full (no space)
    pool 'default.rgw.meta' is full (no space)
    pool 'pool_full_osds' is full (no space)
    pool 're_pool_test' is full (no space)
    pool 'test_ec_pool' is full (no space)
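The client-side impact can be confirmed by attempting a direct write to the two pools whose acting sets contain none of the full OSDs (a minimal sketch; the object name test_obj_1 and the 4 MiB payload file are arbitrary choices, not taken from the original run):

# Write one small object into each of the pools that have no full OSDs in their acting set
cephadm -v shell -- bash -c 'dd if=/dev/zero of=/tmp/obj_4m bs=4M count=1 && rados -p re_pool_test put test_obj_1 /tmp/obj_4m && rados -p test_ec_pool put test_obj_1 /tmp/obj_4m'

With the behavior reported above, both puts are expected to fail (or block) because the pools carry the full flag, even though osd.2, osd.4, and osd.9 are not in the acting sets of re_pool_test or test_ec_pool.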
@Radosla