Bug 2348970

Summary: [Read Balancer] pg_upmap_primary items are retained in OSD map for a pool which is already deleted
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Laura Flores <lflores>
Component: RADOS
Assignee: Laura Flores <lflores>
Status: CLOSED ERRATA
QA Contact: skanta
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.1
CC: bhkaur, bhubbard, ceph-eng-bugs, cephqe-warriors, nojha, skanta, tserlin, vumrao, yhatuka
Target Milestone: ---   
Target Release: 7.1z4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-18.2.1-300.el9cp
Doc Type: Bug Fix
Doc Text:
.Phantom pg_upmap_primary mappings no longer persist after pool removal
Previously, when a pool was deleted, pg_upmap_primary mappings linked to that pool remained in the OSDMap. These mappings could not be manually removed because the pool and its PG IDs no longer existed. With this fix, deleting a pool now automatically removes its associated pg_upmap_primary mappings from the OSDMap, keeping the cluster metadata clean.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2025-05-07 12:48:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Laura Flores 2025-02-28 02:02:55 UTC
Description copied from https://bugzilla.redhat.com/show_bug.cgi?id=2293847, the 8.x BZ. This is a BZ to ensure the same fix gets into 7.x.

Description of problem:
We are observing that the pg_upmap_primary items for a pool are retained in the OSDMap even after the pool is deleted from the cluster.

These entries should be deleted once the pool they refer to is deleted.

Note that when we try to revert a pg_upmap_primary change for a non-existent pool, the command correctly reports that the PG does not exist, yet the entry is still retained in the OSDMap:

[root@ceph-pdhiran-1-wmhks0-node7 tmp]# ceph osd rm-pg-upmap-primary 12.4a
Error ENOENT: pgid '12.4a' does not exist

Note that "pg_upmap_items" are deleted for a deleted pool, but not "pg_upmap_primary" items.


In the output below, pg_upmap_primary entries are retained for the non-existent pools 11 and 12, neither of which appears in the pool list or in the "ceph df" output further down.
# ceph osd dump
epoch 279
fsid 033375e6-317d-11ef-beea-fa163ed4d062
created 2024-06-23T16:24:28.065925+0000
modified 2024-06-23T17:55:42.041372+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 14
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client reef
min_compat_client luminous
require_osd_release reef
stretch_mode_enabled false
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 28 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 15.00
pool 2 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 94 lfor 0/0/44 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.87
pool 3 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode on last_change 71 lfor 0/0/69 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.03
pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 48 lfor 0/0/44 flags hashpspool stripe_width 0 application rgw read_balance_score 1.87
pool 5 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 125 lfor 0/0/46 flags hashpspool stripe_width 0 application rgw read_balance_score 1.87
pool 6 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 63 lfor 0/0/53 flags hashpspool stripe_width 0 application rgw read_balance_score 1.41
pool 7 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 63 lfor 0/0/55 flags hashpspool stripe_width 0 pg_autoscale_bias 4 application rgw read_balance_score 1.87
max_osd 15
osd.0 up   in  weight 1 up_from 23 up_thru 271 down_at 0 last_clean_interval [0,0) [v2:10.0.209.227:6800/1399781676,v1:10.0.209.227:6801/1399781676] [v2:10.0.209.227:6802/1399781676,v1:10.0.209.227:6803/1399781676] exists,up 12ce6153-e72d-40b0-822a-8d9f0411c4be
osd.1 up   in  weight 1 up_from 23 up_thru 272 down_at 0 last_clean_interval [0,0) [v2:10.0.210.119:6800/1946035796,v1:10.0.210.119:6801/1946035796] [v2:10.0.210.119:6802/1946035796,v1:10.0.210.119:6803/1946035796] exists,up 4f8fb565-d36a-4dd0-9522-601ee9b31af7
osd.2 up   in  weight 1 up_from 26 up_thru 265 down_at 0 last_clean_interval [0,0) [v2:10.0.208.110:6808/4002311913,v1:10.0.208.110:6809/4002311913] [v2:10.0.208.110:6810/4002311913,v1:10.0.208.110:6811/4002311913] exists,up e8a730ea-07a8-4034-a156-bf3e08995830
osd.3 up   in  weight 1 up_from 32 up_thru 276 down_at 0 last_clean_interval [0,0) [v2:10.0.209.227:6824/2193644713,v1:10.0.209.227:6825/2193644713] [v2:10.0.209.227:6826/2193644713,v1:10.0.209.227:6827/2193644713] exists,up 6b178aee-32d0-4993-9d62-06bcc81c91fe
osd.4 up   in  weight 1 up_from 31 up_thru 275 down_at 0 last_clean_interval [0,0) [v2:10.0.210.119:6824/1405105770,v1:10.0.210.119:6825/1405105770] [v2:10.0.210.119:6826/1405105770,v1:10.0.210.119:6827/1405105770] exists,up 076861b2-ecd3-4341-be89-3ee887881956
osd.5 up   in  weight 1 up_from 28 up_thru 274 down_at 0 last_clean_interval [0,0) [v2:10.0.208.110:6816/776345304,v1:10.0.208.110:6817/776345304] [v2:10.0.208.110:6818/776345304,v1:10.0.208.110:6819/776345304] exists,up 0fbb0230-8f5f-46c5-a25f-5a844b185224
osd.6 up   in  weight 1 up_from 33 up_thru 267 down_at 0 last_clean_interval [0,0) [v2:10.0.209.227:6832/3523560135,v1:10.0.209.227:6833/3523560135] [v2:10.0.209.227:6834/3523560135,v1:10.0.209.227:6835/3523560135] exists,up 07e7cf54-562c-4777-a7d4-6edd2d9e9696
osd.7 up   in  weight 1 up_from 31 up_thru 221 down_at 0 last_clean_interval [0,0) [v2:10.0.208.110:6824/2201719947,v1:10.0.208.110:6825/2201719947] [v2:10.0.208.110:6826/2201719947,v1:10.0.208.110:6827/2201719947] exists,up 58b7d504-9f91-4b0c-b92e-56bb0dd1dd59
osd.8 up   in  weight 1 up_from 33 up_thru 258 down_at 0 last_clean_interval [0,0) [v2:10.0.210.119:6832/888279906,v1:10.0.210.119:6833/888279906] [v2:10.0.210.119:6834/888279906,v1:10.0.210.119:6835/888279906] exists,up 717a87f0-6c8d-447d-92c6-2cd7c7261411
osd.9 up   in  weight 1 up_from 33 up_thru 261 down_at 0 last_clean_interval [0,0) [v2:10.0.208.110:6832/4153365315,v1:10.0.208.110:6833/4153365315] [v2:10.0.208.110:6834/4153365315,v1:10.0.208.110:6835/4153365315] exists,up 3792d52d-8e4d-4c34-b89b-e0deec320b3e
osd.10 up   in  weight 1 up_from 26 up_thru 234 down_at 0 last_clean_interval [0,0) [v2:10.0.209.227:6808/2865974060,v1:10.0.209.227:6809/2865974060] [v2:10.0.209.227:6810/2865974060,v1:10.0.209.227:6811/2865974060] exists,up 776b1173-7ea3-447f-9b71-4634a311d55f
osd.11 up   in  weight 1 up_from 26 up_thru 273 down_at 0 last_clean_interval [0,0) [v2:10.0.210.119:6808/28677353,v1:10.0.210.119:6809/28677353] [v2:10.0.210.119:6810/28677353,v1:10.0.210.119:6811/28677353] exists,up 2b06828e-b448-41c8-8ad9-7720d844fbdd
osd.12 up   in  weight 1 up_from 24 up_thru 239 down_at 0 last_clean_interval [0,0) [v2:10.0.208.110:6800/678421860,v1:10.0.208.110:6801/678421860] [v2:10.0.208.110:6802/678421860,v1:10.0.208.110:6803/678421860] exists,up bdeaf932-1556-48b8-91de-507293a9403a
osd.13 up   in  weight 1 up_from 29 up_thru 268 down_at 0 last_clean_interval [0,0) [v2:10.0.209.227:6816/1113828473,v1:10.0.209.227:6817/1113828473] [v2:10.0.209.227:6818/1113828473,v1:10.0.209.227:6819/1113828473] exists,up 8721aa33-ed78-40ff-bb89-9c2c585d14c3
osd.14 up   in  weight 1 up_from 29 up_thru 257 down_at 0 last_clean_interval [0,0) [v2:10.0.210.119:6816/3349782877,v1:10.0.210.119:6817/3349782877] [v2:10.0.210.119:6818/3349782877,v1:10.0.210.119:6819/3349782877] exists,up 0a6b5ca6-c6df-4b68-9d94-a711ccf55e10
pg_upmap_items 3.a [6,10]
pg_upmap_items 3.c [1,14]
pg_upmap_items 3.10 [1,14]
pg_upmap_items 3.12 [9,2]
pg_upmap_items 3.1e [5,2]
pg_upmap_items 3.21 [12,2]
pg_upmap_items 3.37 [11,4]
pg_upmap_items 3.56 [1,14]
pg_upmap_items 3.5a [9,2]
pg_upmap_items 3.61 [12,2]
pg_upmap_items 3.66 [11,4]
pg_upmap_items 3.6f [12,2]
pg_upmap_items 3.79 [11,4]
pg_upmap_items 3.87 [1,4]
pg_upmap_items 3.96 [11,8]
pg_upmap_items 3.ae [11,4]
pg_upmap_items 3.c8 [12,2]
pg_upmap_items 3.de [1,14]
pg_upmap_items 3.e3 [5,2]
pg_upmap_items 3.e6 [12,2]
pg_upmap_items 3.e9 [1,4]
pg_upmap_items 3.105 [1,14]
pg_upmap_items 3.119 [1,4]
pg_upmap_items 3.12f [11,4]
pg_upmap_items 3.166 [11,14]
pg_upmap_items 3.173 [11,4]
pg_upmap_items 3.175 [1,14]
pg_upmap_items 3.17a [11,4]
pg_upmap_items 3.17d [1,8]
pg_upmap_items 3.196 [1,4]
pg_upmap_items 3.1e3 [6,10]
pg_upmap_items 3.1f5 [12,2]
pg_upmap_items 3.1f9 [11,14]
pg_upmap_items 3.1fa [11,8]
pg_upmap_primary 2.4 13
pg_upmap_primary 2.5 0
pg_upmap_primary 2.8 7
pg_upmap_primary 2.c 0
pg_upmap_primary 3.2 8
pg_upmap_primary 3.3 4
pg_upmap_primary 3.5 5
pg_upmap_primary 3.9 9
pg_upmap_primary 3.a 8
pg_upmap_primary 3.d 1
pg_upmap_primary 3.f 2
pg_upmap_primary 3.13 1
pg_upmap_primary 3.14 2
pg_upmap_primary 3.15 9
pg_upmap_primary 3.1c 5
pg_upmap_primary 3.20 5
pg_upmap_primary 3.21 2
pg_upmap_primary 3.23 1
pg_upmap_primary 3.25 12
pg_upmap_primary 3.2a 4
pg_upmap_primary 3.2d 6
pg_upmap_primary 3.32 8
pg_upmap_primary 3.3a 4
pg_upmap_primary 3.3b 6
pg_upmap_primary 3.64 13
pg_upmap_primary 3.6b 12
pg_upmap_primary 4.5 9
pg_upmap_primary 4.6 9
pg_upmap_primary 4.8 3
pg_upmap_primary 4.c 1
pg_upmap_primary 4.12 14
pg_upmap_primary 4.13 9
pg_upmap_primary 4.17 3
pg_upmap_primary 5.0 10
pg_upmap_primary 5.1 7
pg_upmap_primary 5.2 11
pg_upmap_primary 5.a 10
pg_upmap_primary 5.c 4
pg_upmap_primary 5.e 12
pg_upmap_primary 5.f 12
pg_upmap_primary 5.18 11
pg_upmap_primary 6.3 14
pg_upmap_primary 6.9 0
pg_upmap_primary 6.b 1
pg_upmap_primary 6.c 8
pg_upmap_primary 6.10 8
pg_upmap_primary 6.12 10
pg_upmap_primary 7.0 1
pg_upmap_primary 7.2 13
pg_upmap_primary 7.4 3
pg_upmap_primary 7.7 12
pg_upmap_primary 7.b 11
pg_upmap_primary 7.c 11
pg_upmap_primary 7.10 13
pg_upmap_primary 7.11 0
pg_upmap_primary 7.12 8
pg_upmap_primary 7.14 9
pg_upmap_primary 7.18 0
pg_upmap_primary 7.19 14
pg_upmap_primary 11.0 0
pg_upmap_primary 11.1 11
pg_upmap_primary 11.5 8
pg_upmap_primary 11.6 1
pg_upmap_primary 11.9 0
pg_upmap_primary 11.10 8
pg_upmap_primary 11.12 9
pg_upmap_primary 11.14 14
pg_upmap_primary 11.1a 14
pg_upmap_primary 11.1b 8
pg_upmap_primary 11.21 2
pg_upmap_primary 11.25 9
pg_upmap_primary 11.26 9
pg_upmap_primary 11.27 13
pg_upmap_primary 11.32 0
pg_upmap_primary 12.3 2
pg_upmap_primary 12.6 3
pg_upmap_primary 12.8 6
pg_upmap_primary 12.c 13
pg_upmap_primary 12.11 5
pg_upmap_primary 12.19 0
pg_upmap_primary 12.20 0
pg_upmap_primary 12.2a 1
pg_upmap_primary 12.34 11
pg_upmap_primary 12.37 5
pg_upmap_primary 12.41 4
pg_upmap_primary 12.4a 3
blocklist 10.0.210.119:6841/4197514227 expires 2024-06-24T16:36:02.231887+0000
blocklist 10.0.210.119:6840/4197514227 expires 2024-06-24T16:36:02.231887+0000
blocklist 10.0.208.48:0/642986218 expires 2024-06-24T16:26:18.070807+0000
blocklist 10.0.208.48:6801/2453512167 expires 2024-06-24T16:26:18.070807+0000
blocklist 10.0.208.48:0/2187556112 expires 2024-06-24T16:26:18.070807+0000
blocklist 10.0.208.48:6801/2480930094 expires 2024-06-24T16:25:13.936268+0000
blocklist 10.0.208.48:6800/2480930094 expires 2024-06-24T16:25:13.936268+0000
blocklist 10.0.208.48:0/962076823 expires 2024-06-24T16:26:18.070807+0000
blocklist 10.0.208.48:0/1556879927 expires 2024-06-24T16:25:13.936268+0000
blocklist 10.0.208.48:6800/2856420942 expires 2024-06-24T16:24:50.400801+0000
blocklist 10.0.208.48:6801/2856420942 expires 2024-06-24T16:24:50.400801+0000
blocklist 10.0.208.48:0/1285066124 expires 2024-06-24T16:24:50.400801+0000
blocklist 10.0.208.48:0/2645603862 expires 2024-06-24T16:24:50.400801+0000
blocklist 10.0.208.48:0/2122667141 expires 2024-06-24T16:25:13.936268+0000
blocklist 10.0.208.48:6800/2453512167 expires 2024-06-24T16:26:18.070807+0000
blocklist 10.0.208.48:0/174473368 expires 2024-06-24T16:24:50.400801+0000
blocklist 10.0.208.48:0/933564633 expires 2024-06-24T16:25:13.936268+0000
[root@ceph-pdhiran-1-wmhks0-node7 tmp]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    375 GiB  367 GiB  8.2 GiB   8.2 GiB       2.19
TOTAL  375 GiB  367 GiB  8.2 GiB   8.2 GiB       2.19

--- POOLS ---
POOL                 ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                  1    1  598 KiB        2  1.8 MiB      0    116 GiB
cephfs.cephfs.meta    2   16  2.3 KiB       22   96 KiB      0    116 GiB
cephfs.cephfs.data    3  512      0 B        0      0 B      0    116 GiB
.rgw.root             4   32  2.6 KiB        6   72 KiB      0    116 GiB
default.rgw.log       5   32  3.6 KiB      177  408 KiB      0    116 GiB
default.rgw.control   6   32      0 B        8      0 B      0    116 GiB
default.rgw.meta      7   32    382 B        2   24 KiB      0    116 GiB


Version-Release number of selected component (if applicable):
# ceph version
ceph version 18.2.1-195.el9cp (70ed561fd4cdf71e96b35c5289c40a14e1bbc8a0) reef (stable)

How reproducible:
Always

Steps to Reproduce:
1. Deploy an RHCS cluster and create a few pools.
2. Run the offline osdmaptool to perform read balancing on the pools. "pg_upmap_primary" entries are created for the changes; this can be verified via "ceph osd dump". (A sketch of this step follows the list.)
3. Delete the pools and observe that the "pg_upmap_primary" entries are not removed.
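
For reference, step 2 typically looks like the following minimal sketch. The flags are per the upstream Reef read balancer documentation, and the pool name is only an example from this cluster, so adjust as needed:

# ceph osd getmap -o om
# osdmaptool om --read out.txt --read-pool cephfs.cephfs.data
# source out.txt

osdmaptool writes the "ceph osd pg-upmap-primary" commands it computes into out.txt; sourcing that file applies them to the cluster, after which they appear in "ceph osd dump" as shown above.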

Actual results:
pg_upmap_primary entries are present for a non-existent pool.

Expected results:
pg_upmap_primary entries should only be present for pools that currently exist on the cluster.
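
One hedged way to check this expectation, assuming the "ceph osd dump" format shown above (the pool ID is the part of each PG ID before the dot):

# ceph osd dump | awk '/^pg_upmap_primary/ {print $2}' | cut -d. -f1 | sort -u > /tmp/upmap_pools
# ceph osd dump | awk '/^pool / {print $2}' | sort -u > /tmp/live_pools
# comm -23 /tmp/upmap_pools /tmp/live_pools

With the fix in place, comm should print nothing; on the affected build it prints the deleted pool IDs (11 and 12 above).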

Additional info:

Comment 7 errata-xmlrpc 2025-05-07 12:48:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 7.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:4664