Description of problem:
From upstream tracker:
StandbyPyModule::get_config is using state.with_config without dropping the GIL around taking the lock.
The standby mgr process hangs and stops responding. It is removed from the mgrmap and does not take over the active role when the active mgr stops.
Without an MGR daemon, Ceph reports 0 space. This affects OSP when it spawns new instances, because the available space is checked.
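The lock-ordering problem named in the upstream note can be illustrated with a small simulation. This is only a sketch, not ceph-mgr code: two ordinary `threading.Lock` objects stand in for the Python GIL and for the lock taken inside `with_config`, and the names `simulate_get_config`, `config_holder`, and `drop_gil` are hypothetical. A timeout stands in for what would be an indefinite hang in the real process.

```python
import threading


def simulate_get_config(drop_gil: bool, timeout: float = 0.2) -> bool:
    """Return True if the simulated get_config finishes, False if it would hang."""
    gil = threading.Lock()          # stand-in for the Python GIL (assumption)
    config_lock = threading.Lock()  # stand-in for the lock behind with_config
    lock_held = threading.Event()
    caller_ready = threading.Event()
    result = []

    def config_holder():
        # Holds the config lock, then needs the GIL before it can release it.
        with config_lock:
            lock_held.set()
            caller_ready.wait()
            # Wait longer than the caller so the caller's timeout fires first.
            if gil.acquire(timeout=timeout * 3):
                gil.release()

    def get_config():
        gil.acquire()               # Python code enters the call holding the GIL
        lock_held.wait()
        if drop_gil:
            gil.release()           # the fix: drop the GIL before blocking
        caller_ready.set()
        got_lock = config_lock.acquire(timeout=timeout)
        if got_lock:
            config_lock.release()
        if drop_gil:
            gil.acquire()           # retake the GIL before returning to Python
        gil.release()
        result.append(got_lock)

    threads = [threading.Thread(target=config_holder),
               threading.Thread(target=get_config)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]


if __name__ == "__main__":
    print("GIL kept while locking:   ", simulate_get_config(drop_gil=False))
    print("GIL dropped while locking:", simulate_get_config(drop_gil=True))
```

With the GIL kept, the caller never obtains the config lock because the thread holding it is itself waiting for the GIL. With the GIL dropped, both threads finish. In C or C++ code that embeds CPython, the usual way to drop the GIL around a blocking call is the `Py_BEGIN_ALLOW_THREADS` / `Py_END_ALLOW_THREADS` macro pair. Whether the upstream fix uses exactly that mechanism is not stated here.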
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Actual results:
The standby mgr process stops responding, but the process is still running. No messages are logged.
Expected results:
The standby mgr process keeps responding. If the active mgr stops, one of the standby mgrs becomes active.
See analysis in https://tracker.ceph.com/issues/35985
We are still waiting on thread dumps to confirm this issue is the same as https://tracker.ceph.com/issues/35985.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.