Bug 2247211

Summary: OSD crush weight is not restored after `ceph orch osd rm` followed by `ceph orch osd rm stop`
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Pawan <pdhiran>
Component: Cephadm
Assignee: Adam King <adking>
Status: ASSIGNED
QA Contact: Pawan <pdhiran>
Severity: high
Docs Contact:
Priority: high
Version: 7.0
CC: adking, bhubbard, ceph-eng-bugs, cephqe-warriors, jolmomar, ngangadh, nojha, pdhange, saraut, tserlin, vereddy, vumrao
Target Milestone: ---
Flags: adking: needinfo? (akraj)
Target Release: 7.2
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: ceph-18.2.1-61.el9cp
Doc Type: Known Issue
Doc Text:
.Cephadm does not maintain the previous OSD weight when draining an OSD
Cephadm does not maintain the previous OSD weight when draining an OSD. Due to this, if the `ceph orch osd rm <osd-id>` command is run and the removal is later stopped, Cephadm does not set the crush weight of the OSD back to its original value; it remains at 0. As a workaround, manually adjust the crush weight of the OSD back to its original value, or complete the removal and deploy a new OSD. Be careful when cancelling a `ceph orch osd rm` operation, as the crush weight of the OSD is not returned to the value it had before the removal process began.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2267614, 2298578, 2298579

Description Pawan 2023-10-31 10:12:41 UTC
Description of problem:
When an OSD removal is stopped while PGs are draining from the OSD, its crush weight is not restored. As a result, no PGs are placed on that OSD even though it is up and in the cluster.

This should be changed so that when an OSD removal is stopped, the OSD's crush weight is restored and PGs once again move onto that OSD.
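The requested behavior can be sketched as follows. This is an illustrative model only, not the actual cephadm removal-queue code: the idea is simply to record the crush weight when draining starts and put it back if the removal is cancelled.

```python
# Illustrative sketch only -- not the actual cephadm implementation.
# Models a removal queue that records an OSD's crush weight before
# draining it and restores that weight if the removal is stopped.
from dataclasses import dataclass, field


@dataclass
class Osd:
    osd_id: int
    crush_weight: float


@dataclass
class RemovalQueue:
    # osd_id -> crush weight recorded when draining started
    original_weights: dict = field(default_factory=dict)

    def start_removal(self, osd: Osd) -> None:
        # Save the current weight, then drain by reweighting to 0.
        self.original_weights[osd.osd_id] = osd.crush_weight
        osd.crush_weight = 0.0

    def stop_removal(self, osd: Osd) -> None:
        # The requested fix: restore the saved weight instead of leaving 0.
        if osd.osd_id in self.original_weights:
            osd.crush_weight = self.original_weights.pop(osd.osd_id)


osd2 = Osd(osd_id=2, crush_weight=0.09769)
queue = RemovalQueue()
queue.start_removal(osd2)  # crush weight drops to 0, PGs drain away
queue.stop_removal(osd2)   # crush weight restored, PGs move back
```

The bug reported here corresponds to `stop_removal` leaving `crush_weight` at 0 instead of restoring the saved value.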


Version-Release number of selected component (if applicable):
ceph version 17.2.6-148.el9cp (badc1d27cb07762bea48f6554ad4f92b9d3fbb6b) quincy (stable)

How reproducible:
Always

Steps to Reproduce:
1. Deploy RHCS cluster with stretch mode enabled.
2. Start removal of an OSD.
OSD tree before removal:
# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-1         1.17218  root default
-3         0.58609      datacenter zone-b
-2         0.19536          host osd-0
 5    hdd  0.09769              osd.5          up   1.00000  1.00000
10    hdd  0.09769              osd.10         up   1.00000  1.00000
-4         0.19536          host osd-1
 3    hdd  0.09769              osd.3          up   1.00000  1.00000
 9    hdd  0.09769              osd.9          up   1.00000  1.00000
-5         0.19537          host osd-2
 1    hdd  0.09769              osd.1          up   1.00000  1.00000
 7    hdd  0.09769              osd.7          up   1.00000  1.00000
-7         0.58609      datacenter zone-c
-6         0.19536          host osd-3
 2    hdd  0.09769              osd.2          up   1.00000  1.00000
11    hdd  0.09769              osd.11         up   1.00000  1.00000
-8         0.19536          host osd-4
 0    hdd  0.09769              osd.0          up   1.00000  1.00000
 6    hdd  0.09769              osd.6          up   1.00000  1.00000
-9         0.19537          host osd-5
 4    hdd  0.09769              osd.4          up   1.00000  1.00000
 8    hdd  0.09769              osd.8          up   1.00000  1.00000

osd.2 up   in  weight 1 up_from 1366 up_thru 1453 down_at 1348 last_clean_interval [30,1347) [v2:10.1.160.63:6800/1178508167,v1:10.1.160.63:6801/1178508167] [v2:10.1.160.63:6802/1178508167,v1:10.1.160.63:6803/1178508167] exists,up 43ecbe79-41dd-4ac3-a272-02cdc2ff88dc

Remove the OSD with the command below:

[root@osd-3 ~]# ceph orch osd rm 2 --force
Scheduled OSD(s) for removal.
VG/LV for the OSDs won't be zapped (--zap wasn't passed).
Run the `ceph-volume lvm zap` command with `--destroy` against the VG/LV if you want them to be destroyed.

# ceph orch osd rm status
OSD  HOST   STATE     PGS  REPLACE  FORCE  ZAP    DRAIN STARTED AT
2    osd-3  draining   70  False    True   False  2023-10-31 09:42:40.987291

3. While the OSD is draining, stop the removal.

[root@osd-3 ~]# ceph orch osd rm stop 2
Stopped OSD(s) removal

4. Observe that even though the removal is stopped and the OSD is up and in the cluster, its crush weight is still 0, so no PGs are placed on that OSD.

{
            "id": 2,
            "device_class": "hdd",
            "name": "osd.2",
            "type": "osd",
            "type_id": 0,
            "crush_weight": 0,
            "depth": 3,
            "pool_weights": {},
            "exists": 1,
            "status": "up",
            "reweight": 1,
            "primary_affinity": 1
        },
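The node entry above is from the JSON form of the OSD tree (`ceph osd tree -f json`). For a manual restore, the original weight can be read from the same output captured before the drain started. A small sketch of that lookup (the helper name is mine):

```python
import json


def crush_weight(osd_tree_json: str, osd_id: int) -> float:
    """Return the crush_weight of an OSD from `ceph osd tree -f json` output."""
    tree = json.loads(osd_tree_json)
    for node in tree["nodes"]:
        if node.get("type") == "osd" and node["id"] == osd_id:
            return node["crush_weight"]
    raise KeyError(f"osd.{osd_id} not found in tree")


# Sample node copied from the output above (crush_weight already zeroed):
sample = json.dumps({"nodes": [{
    "id": 2, "device_class": "hdd", "name": "osd.2", "type": "osd",
    "type_id": 0, "crush_weight": 0, "depth": 3, "pool_weights": {},
    "exists": 1, "status": "up", "reweight": 1, "primary_affinity": 1,
}]})
print(crush_weight(sample, 2))  # 0 after the cancelled drain
```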

OSD_STAT  USED     AVAIL    USED_RAW  TOTAL    HB_PEERS                   PG_SUM  PRIMARY_PG_SUM
7          43 GiB   57 GiB    43 GiB  100 GiB    [0,2,3,4,5,6,8,9,10,11]     156              44
8          43 GiB   57 GiB    43 GiB  100 GiB    [0,1,2,3,5,6,7,9,10,11]     146              39
1          36 GiB   64 GiB    36 GiB  100 GiB    [0,2,3,4,5,6,8,9,10,11]     127              31
4          41 GiB   59 GiB    41 GiB  100 GiB    [0,1,3,5,6,7,8,9,10,11]     158              40
9          35 GiB   65 GiB    35 GiB  100 GiB  [0,1,2,3,4,5,6,7,8,10,11]     117              35
5          43 GiB   57 GiB    43 GiB  100 GiB  [0,1,2,3,4,6,7,8,9,10,11]     135              35
2         709 MiB   99 GiB   709 MiB  100 GiB    [0,1,3,4,6,7,8,9,10,11]       0               0
6          44 GiB   56 GiB    44 GiB  100 GiB    [0,1,3,4,5,7,8,9,10,11]     146              33
3          38 GiB   62 GiB    38 GiB  100 GiB  [0,1,2,4,5,6,7,8,9,10,11]     129              40
10         38 GiB   62 GiB    38 GiB  100 GiB   [0,1,2,3,4,5,6,7,8,9,11]     138              31
0          46 GiB   54 GiB    46 GiB  100 GiB    [1,2,3,4,5,7,8,9,10,11]     157              33
11         58 GiB   42 GiB    58 GiB  100 GiB     [0,1,3,4,5,6,7,8,9,10]     195              40

# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-1         1.07451  root default
-3         0.58609      datacenter zone-b
-2         0.19536          host osd-0
 5    hdd  0.09769              osd.5          up   1.00000  1.00000
10    hdd  0.09769              osd.10         up   1.00000  1.00000
-4         0.19536          host osd-1
 3    hdd  0.09769              osd.3          up   1.00000  1.00000
 9    hdd  0.09769              osd.9          up   1.00000  1.00000
-5         0.19537          host osd-2
 1    hdd  0.09769              osd.1          up   1.00000  1.00000
 7    hdd  0.09769              osd.7          up   1.00000  1.00000
-7         0.48842      datacenter zone-c
-6         0.09769          host osd-3
 2    hdd        0              osd.2          up   1.00000  1.00000
11    hdd  0.09769              osd.11         up   1.00000  1.00000
-8         0.19536          host osd-4
 0    hdd  0.09769              osd.0          up   1.00000  1.00000
 6    hdd  0.09769              osd.6          up   1.00000  1.00000
-9         0.19537          host osd-5
 4    hdd  0.09769              osd.4          up   1.00000  1.00000
 8    hdd  0.09769              osd.8          up   1.00000  1.00000


Actual results:
After `ceph orch osd rm stop`, the crush weight is not restored.

Expected results:
If the OSD removal is stopped, the crush weight that was set to 0 should be restored to its original value.

Additional info:
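Context for the manual workaround: Ceph's default crush weight for an OSD is its capacity in TiB, so if the original value was not recorded it can be recomputed from the device size and set back with `ceph osd crush reweight osd.<id> <weight>`. A sketch of that arithmetic:

```python
def default_crush_weight(size_bytes: int) -> float:
    """Ceph's default crush weight is the device capacity in TiB."""
    return size_bytes / 2**40


# A 100 GiB OSD, as in the cluster above:
w = default_crush_weight(100 * 2**30)
print(round(w, 5))  # 0.09766
```

This lines up with the tree output above, where the 100 GiB OSDs carry a weight of about 0.098.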