Bug 2247531

Summary: [rbd_support] fix hangs and mgr crash when rbd_support module tries to recover from repeated blocklisting
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Ram Raja <rraja>
Component: RBD-Mirror Assignee: Ram Raja <rraja>
Status: CLOSED ERRATA QA Contact: Sunil Angadi <sangadi>
Severity: high Docs Contact: Akash Raj <akraj>
Priority: unspecified    
Version: 7.0 CC: akraj, ceph-eng-bugs, cephqe-warriors, idryomov, sangadi, tserlin
Target Milestone: ---   
Target Release: 7.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-18.2.1-2.el9cp Doc Type: Bug Fix
Doc Text:
.`rbd_support` module no longer fails to recover from repeated blocklisting of its client
Previously, the `rbd_support` module failed to recover from repeated blocklisting of its client because of three issues: a recursive deadlock in the `rbd_support` module, a race condition in the module's librbd client, and a bug in the librbd Cython bindings that sometimes crashed `ceph-mgr`. With this release, all three issues are fixed, and the `rbd_support` module no longer fails to recover from repeated blocklisting of its client.
Story Points: ---
Clone Of:
: 2247543 Environment:
Last Closed: 2024-06-13 14:22:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2247543, 2267614, 2298578, 2298579    

Description Ram Raja 2023-11-01 20:35:44 UTC
Tested the recovery of the rbd_support module on repeated blocklisting of its client (https://bugzilla.redhat.com/show_bug.cgi?id=2111364) using the integration test in https://tracker.ceph.com/issues/62891 . Across tens of runs of the integration test (in the upstream Ceph teuthology infra), ~40% of the runs failed, with hangs and ceph-mgr crashes observed when the rbd_support module tried to recover. Note that one hang during recovery was fixed earlier in https://bugzilla.redhat.com/show_bug.cgi?id=2211290 . The remaining hangs and the ceph-mgr crash were root-caused in the following tracker issues:
https://tracker.ceph.com/issues/62994
https://tracker.ceph.com/issues/63009
https://tracker.ceph.com/issues/63028
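
For readers unfamiliar with the term, the general shape of a recursive deadlock (one of the failure classes named in the Doc Text) can be sketched in Python as follows. This is an illustration only, not the rbd_support code and not the fix that shipped: the `Handler` class, its method names, and `try_recover` are hypothetical, and the `RLock` variant is shown only to contrast the behavior, not as a claim about how the upstream issues were resolved. The timeout exists solely so the demonstration returns instead of hanging.

```python
import threading


class Handler:
    """Hypothetical stand-in for a module handler; not the rbd_support code."""

    def __init__(self, lock):
        self.lock = lock
        self.recovered = False

    def handle_error(self):
        # The handler takes the lock, then calls recovery while still holding it.
        with self.lock:
            return self.recover()

    def recover(self):
        # Recovery tries to take the same lock again on the same thread.
        # The timeout is only here so the demo returns instead of hanging.
        if not self.lock.acquire(timeout=0.2):
            return False  # without the timeout this call would block forever
        try:
            self.recovered = True
            return True
        finally:
            self.lock.release()


def try_recover(lock):
    """Return True if recovery completed, False if it would have deadlocked."""
    return Handler(lock).handle_error()


if __name__ == "__main__":
    # A plain Lock is not reentrant, so the nested acquire cannot succeed.
    print("threading.Lock :", try_recover(threading.Lock()))
    # An RLock may be re-acquired by the thread that already owns it.
    print("threading.RLock:", try_recover(threading.RLock()))
```

With a plain `threading.Lock`, the nested acquire never succeeds on the owning thread, which is the kind of hang described above; a reentrant lock, or restructuring the code so recovery runs without the lock held, avoids it.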

Comment 1 RHEL Program Management 2023-11-01 20:35:55 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 9 errata-xmlrpc 2024-06-13 14:22:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925