Bug 2111364
| Summary: | [rbd_support] recover from RADOS instance blocklisting | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Ilya Dryomov <idryomov> |
| Component: | RBD-Mirror | Assignee: | Ram Raja <rraja> |
| Status: | CLOSED ERRATA | QA Contact: | Vasishta <vashastr> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 5.0 | CC: | amagrawa, bniver, ceph-eng-bugs, cephqe-warriors, ekuric, idryomov, jdurgin, jespy, kramdoss, kseeger, mmuench, mmurthy, muagarwa, ocs-bugs, owasserm, prsurve, sagrawal, sostapov, srangana, tserlin, vashastr, vereddy |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | 6.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | ceph-17.2.6-47.el9cp | Doc Type: | Bug Fix |
| Doc Text: | In certain scenarios, the OSDs were slow to process RBD requests. This prevented the rbd_support module's RBD client from gracefully handing over an RBD exclusive lock to another RBD client. When this condition persisted, the other RBD client forcefully acquired the lock by blocklisting the module's RADOS client. Consequently, the rbd_support module stopped working, for example, it stopped scheduling mirror snapshots. Recovering the rbd_support module required a manual restart of ceph-mgr, which disrupted the other mgr modules that were reloaded along with it.<br><br>With this fix, when the module's RADOS client is blocklisted, the rbd_support module recovers automatically instead of requiring a ceph-mgr restart. The recovery process shuts down the module's handlers, creates a new RADOS client for the module, and restarts the handlers. After recovery, the rbd_support module serves requests as before. | Story Points: | --- |
| Clone Of: | 2067095 | Environment: | |
| Last Closed: | 2023-06-15 09:15:33 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | |||
| Bug Blocks: | 2067095, 2192813 | ||
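The recovery flow described in the Doc Text (detect blocklisting, shut down handlers, create a fresh RADOS client, restart handlers) can be sketched as follows. This is a simplified, hypothetical Python model: the class and method names (`RadosClient`, `Handler`, `RbdSupportModule`, `recover`) are illustrative only and do not reflect the actual ceph-mgr module source.

```python
# Hypothetical sketch of the automatic recovery loop; names are
# illustrative, not taken from the real rbd_support module code.

class BlocklistedError(Exception):
    """Raised when the module's RADOS client has been blocklisted."""


class RadosClient:
    """Stand-in for a RADOS client instance with a unique instance id."""
    _next_id = 0

    def __init__(self):
        RadosClient._next_id += 1
        self.instance_id = RadosClient._next_id
        self.blocklisted = False


class Handler:
    """Stand-in for a module handler, e.g. the mirror snapshot scheduler."""

    def __init__(self, client):
        self.client = client
        self.running = True

    def shutdown(self):
        self.running = False

    def serve(self, request):
        if self.client.blocklisted:
            raise BlocklistedError
        return "handled " + request


class RbdSupportModule:
    def __init__(self):
        self.client = RadosClient()
        self.handlers = [Handler(self.client)]

    def recover(self):
        # 1. Shut down the module's handlers.
        for h in self.handlers:
            h.shutdown()
        # 2. Create a new RADOS client for the module.
        self.client = RadosClient()
        # 3. Restart the handlers on the new client.
        self.handlers = [Handler(self.client)]

    def serve(self, request):
        try:
            return self.handlers[0].serve(request)
        except BlocklistedError:
            # Automatic recovery: no ceph-mgr restart, so other
            # mgr modules are not disrupted.
            self.recover()
            return self.handlers[0].serve(request)
```

In this model, a blocklisting event surfaces as an exception from a handler; the module tears down and rebuilds only its own client and handlers, which is why other mgr modules keep running, unlike the old workaround of restarting the whole ceph-mgr daemon.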
Comment 34
Scott Ostapovicz
2023-02-06 16:54:44 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage 6.1 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:3623