Bug 2305677

Summary: [7.1 backport] [CEE]Ceph mgr crashed after a mgr failover with the message mgr operator() Failed to run module in active mode ('cephadm')
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Varuni Sawant <vasawant>
Component: Cephadm Assignee: Adam King <adking>
Status: CLOSED ERRATA QA Contact: skanta
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.1 CC: aakobi, adking, allee, bkunal, cephqe-warriors, dwalveka, gabrioux, idavid, ksachdev, ngangadh, pdhange, saraut, skanta, tasano, tserlin
Target Milestone: --- Flags: gabrioux: needinfo? (allee)
adking: needinfo? (vasawant)
adking: needinfo? (allee)
Target Release: 7.1z2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ceph-18.2.1-235.el9cp Doc Type: Bug Fix
Doc Text:
Previously, the cephadm OSD removal queue did not have a parameter for original_weight. As a result, the cephadm module would crash during OSD removal. With this fix, the original_weight field is added as an attribute of the OSD removal queue, and cephadm no longer crashes during OSD removal.
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-11-07 14:39:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2305678    
Bug Blocks:    

Description Varuni Sawant 2024-08-19 09:54:42 UTC
Description of problem:

After a host was removed, stray host and stray daemon warnings were reported for the removed host. At the same time, OSDs from another host were being drained. To clear the stray host warning, a mgr failover was performed; however, the mgr crashed with the error message:

mgr operator() Failed to run module in active mode ('cephadm')

Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 7.1 - 7.1 (18.2.1-194.el9cp)
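
The Doc Text above attributes the crash to the OSD removal queue lacking an original_weight parameter. The following is a minimal sketch of how that kind of mismatch can crash a module when it reloads persisted state after a failover. It is not the cephadm code: the class names, the load_queue helper, the JSON layout, and the values are all hypothetical, and whether the real crash followed this exact path is an assumption.

```python
import json


class QueuedOSDBefore:
    """Hypothetical queue entry whose constructor lacks original_weight."""

    def __init__(self, osd_id, draining=False):
        self.osd_id = osd_id
        self.draining = draining


class QueuedOSDAfter:
    """Hypothetical queue entry that accepts original_weight."""

    def __init__(self, osd_id, draining=False, original_weight=None):
        self.osd_id = osd_id
        self.draining = draining
        self.original_weight = original_weight


def load_queue(cls, raw):
    """Rebuild queue entries by passing each persisted dict as keyword arguments."""
    return [cls(**entry) for entry in json.loads(raw)]


# State persisted while an OSD was being drained (illustrative values only).
persisted = json.dumps(
    [{"osd_id": 3, "draining": True, "original_weight": 1.0}]
)

# Without the parameter, the unexpected key raises TypeError at load time.
try:
    load_queue(QueuedOSDBefore, persisted)
    outcome_before = "loaded"
except TypeError as exc:
    outcome_before = f"crash: {exc}"

# With the parameter, the same persisted state loads cleanly.
queue_after = load_queue(QueuedOSDAfter, persisted)

print(outcome_before)
print(queue_after[0].original_weight)
```

In this sketch the failure only appears when the persisted queue is non-empty at load time, which would be consistent with the crash being seen on a failover performed while OSDs were draining.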

Comment 49 errata-xmlrpc 2024-11-07 14:39:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9010