Bug 2247531 - [rbd_support] fix hangs and mgr crash when rbd_support module tries to recover from repeated blocklisting
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD-Mirror
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 7.1
Assignee: Ram Raja
QA Contact: Sunil Angadi
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2247543 2267614 2298578 2298579
Reported: 2023-11-01 20:35 UTC by Ram Raja
Modified: 2024-07-18 07:59 UTC (History)
CC List: 6 users

Fixed In Version: ceph-18.2.1-2.el9cp
Doc Type: Bug Fix
Doc Text:
.`rbd_support` module no longer fails to recover from repeated blocklisting of its client
Previously, the `rbd_support` module failed to recover from repeated blocklisting of its client. This was caused by a recursive deadlock in the `rbd_support` module, a race condition in the module's librbd client, and a bug in the librbd Cython bindings that sometimes crashed `ceph-mgr`. With this release, all three issues are fixed and the `rbd_support` module no longer fails to recover from repeated blocklisting of its client.
Clone Of:
Clones: 2247543
Environment:
Last Closed: 2024-06-13 14:22:47 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 62891 | 0 | None | None | None | 2023-11-01 20:35:43 UTC
Ceph Project Bug Tracker | 62994 | 0 | None | None | None | 2023-11-01 20:35:43 UTC
Ceph Project Bug Tracker | 63009 | 0 | None | None | None | 2023-11-01 20:35:43 UTC
Ceph Project Bug Tracker | 63028 | 0 | None | None | None | 2023-11-01 20:35:43 UTC
Red Hat Issue Tracker | RHCEPH-7841 | 0 | None | None | None | 2023-11-01 20:36:03 UTC
Red Hat Product Errata | RHSA-2024:3925 | 0 | None | None | None | 2024-06-13 14:22:57 UTC

Description Ram Raja 2023-11-01 20:35:44 UTC
Tested the recovery of the rbd_support module from repeated blocklisting of its client (https://bugzilla.redhat.com/show_bug.cgi?id=2111364) using the integration test in https://tracker.ceph.com/issues/62891 . Over tens of runs of the integration test (in the upstream Ceph teuthology infra), ~40% of the runs failed, with hangs and ceph-mgr crashes observed when the rbd_support module tried to recover. Note that one hang during recovery was fixed earlier in https://bugzilla.redhat.com/show_bug.cgi?id=2211290 . The remaining hangs and the ceph-mgr crash were root-caused in:
https://tracker.ceph.com/issues/62994
https://tracker.ceph.com/issues/63009
https://tracker.ceph.com/issues/63028
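
For readers unfamiliar with the term, the following is a generic sketch of how a recursive deadlock can arise in Python. It is not the rbd_support code and not the actual fix, which are described in the tracker issues above. The `Handler` class and its `recover()` and `shutdown()` methods are hypothetical names chosen for illustration, and the acquire timeout exists only so that the example terminates instead of hanging.

```python
import threading


class Handler:
    """Hypothetical component whose state is guarded by a single lock."""

    def __init__(self, lock):
        self.lock = lock

    def shutdown(self):
        # Tries to take the lock that recover() already holds in this thread.
        # Without the timeout, a plain Lock would block here forever.
        if not self.lock.acquire(timeout=0.2):
            return False
        try:
            return True
        finally:
            self.lock.release()

    def recover(self):
        # Holds the lock while calling another method that needs it too.
        with self.lock:
            return self.shutdown()


# A non-reentrant lock cannot be re-acquired by its owner: the nested call stalls.
plain = Handler(threading.Lock()).recover()
# A reentrant lock allows the same thread to acquire it again.
reentrant = Handler(threading.RLock()).recover()
print(plain, reentrant)
```

Under these assumptions, the nested acquire on the plain lock times out, while the reentrant lock lets the same thread proceed. Switching to a reentrant lock is only one possible remedy; restructuring the code so that the lock is not held across the nested call is another.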

Comment 1 RHEL Program Management 2023-11-01 20:35:55 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 9 errata-xmlrpc 2024-06-13 14:22:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925

