Bug 2237303
| Summary: | [rbd-mirror] : snapshot schedules stopped : possibly due to hang in MirrorSnapshotScheduleHandler.shutdown which could be in wait_for_pending() [7.0] | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Ilya Dryomov <idryomov> |
| Component: | RBD-Mirror | Assignee: | Ram Raja <rraja> |
| Status: | CLOSED ERRATA | QA Contact: | Sunil Angadi <sangadi> |
| Severity: | urgent | Docs Contact: | Rivka Pollack <rpollack> |
| Priority: | unspecified | ||
| Version: | 6.1 | CC: | akraj, amagrawa, ceph-eng-bugs, cephqe-warriors, ekuric, idryomov, kseeger, muagarwa, rraja, sangadi, tserlin, vashastr, vereddy |
| Target Milestone: | --- | ||
| Target Release: | 7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | ceph-18.2.0-9.el9cp | Doc Type: | Bug Fix |
| Doc Text: |
.The `librbd` client correctly propagates the block-listing error to the caller
Previously, when the `rbd_support` module's RADOS client was block-listed, the module's `mirror_snapshot_schedule` handler would not always shut down correctly. The handler's `librbd` client would not propagate the block-list error, thereby stalling the handler's shutdown. This prevented the `mirror_snapshot_schedule` handler and the `rbd_support` module from automatically recovering from repeated client block-listing. As a result, the `rbd_support` module stopped scheduling mirror snapshots after its client was repeatedly block-listed.
With this fix, the race in the `librbd` client between its exclusive lock acquisition and its handling of block-listing is resolved. This allows the `librbd` client to propagate the block-listing error correctly to the caller, for example, the `mirror_snapshot_schedule` handler, while waiting to acquire an exclusive lock. The `mirror_snapshot_schedule` handler and the `rbd_support` module now automatically recover from repeated client block-listing.
|
Story Points: | --- |
| Clone Of: | 2211290 | Environment: | |
| Last Closed: | 2023-12-13 15:22:41 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2237662 | ||
|
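The stall described in the Doc Text above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual Ceph code: `FakeImageClient`, `ToyHandler`, `BlocklistedError`, and `shutdown_completes` are hypothetical names, and the only identifier taken from this report is `wait_for_pending`. The model shows a shutdown that waits for pending requests: it finishes only if the client fails the pending lock request when it is block-listed.

```python
import threading


class BlocklistedError(Exception):
    """Hypothetical stand-in for the error a block-listed client reports."""


class FakeImageClient:
    """Toy client whose lock request never succeeds once it is block-listed."""

    def __init__(self, propagate_blocklist):
        self.propagate_blocklist = propagate_blocklist
        self._done = threading.Event()
        self._error = None

    def notify_blocklisted(self):
        # Assumed fixed behaviour: fail the pending lock request.
        # Assumed buggy behaviour: drop the notification, so the waiter sleeps on.
        if self.propagate_blocklist:
            self._error = BlocklistedError("client is block-listed")
            self._done.set()

    def wait_for_lock(self):
        # Blocks until the request completes, then surfaces any error.
        self._done.wait()
        if self._error is not None:
            raise self._error


class ToyHandler:
    """Toy handler that counts in-flight requests, as a scheduler might."""

    def __init__(self, client):
        self.client = client
        self.pending = 0
        self.errors = []
        self.cond = threading.Condition()

    def start_request(self):
        with self.cond:
            self.pending += 1
        # Daemon thread so a stuck request cannot keep the process alive.
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        try:
            self.client.wait_for_lock()
        except BlocklistedError as exc:
            self.errors.append(exc)
        finally:
            with self.cond:
                self.pending -= 1
                self.cond.notify_all()

    def wait_for_pending(self, timeout):
        # Returns True once no requests are pending, False on timeout.
        with self.cond:
            return self.cond.wait_for(lambda: self.pending == 0, timeout)


def shutdown_completes(propagate_blocklist, timeout=0.5):
    client = FakeImageClient(propagate_blocklist)
    handler = ToyHandler(client)
    handler.start_request()
    client.notify_blocklisted()
    return handler.wait_for_pending(timeout)


if __name__ == "__main__":
    print("error propagated:", shutdown_completes(True))
    print("error dropped:   ", shutdown_completes(False))
```

In this model the real shutdown path has no timeout, so a dropped error corresponds to the indefinite hang named in the Summary; the timeout here exists only so the sketch terminates.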
Description
Ilya Dryomov
2023-09-04 16:19:48 UTC
Hi Ram. Could you please confirm if this BZ needs to be added in the 7.0 RN? If so, please provide the doc type and text. Thanks.

Hi Akash, Yes, you can add the BZ to the release notes. The doc text is the same as the one I provided for the BZ it was cloned from, https://bugzilla.redhat.com/show_bug.cgi?id=2211290 . I've copy pasted it here.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:7780