Bug 1674549
Summary: | [cee/sd][ceph-mgr] luminous: deadlock in standby ceph-mgr daemons | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Tomas Petr <tpetr> |
Component: | RADOS | Assignee: | Brad Hubbard <bhubbard> |
Status: | CLOSED ERRATA | QA Contact: | Manohar Murthy <mmurthy> |
Severity: | medium | Docs Contact: | Aron Gunn <agunn> |
Priority: | medium | ||
Version: | 3.2 | CC: | agunn, anharris, assingh, bhubbard, branto, ceph-eng-bugs, dzafman, gsitlani, jbrier, kchai, mhackett, nojha, tchandra, tserlin, vumrao |
Target Milestone: | z2 | ||
Target Release: | 3.2 | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHEL: ceph-12.2.8-113.el7cp Ubuntu: ceph_12.2.8-96redhat1xenial | Doc Type: | Bug Fix |
Doc Text: |
.A race condition caused threads to deadlock in the standby `ceph-mgr` daemon
Previously, threads in the standby `ceph-mgr` daemon could race when acquiring a local lock and the Python global interpreter lock (GIL). Each thread held one of the two locks while waiting for the other, so neither could proceed and the threads deadlocked. With this release, the lock acquisition was moved and the appropriate locks are now released, which closes the window in which the race could occur. As a result, the threads no longer deadlock and can make progress.
|
Last Closed: | 2019-04-30 15:56:46 UTC | Type: | Bug |
Bug Blocks: | 1629656 |
Description
Tomas Petr
2019-02-11 15:40:22 UTC
See analysis in https://tracker.ceph.com/issues/35985
We are still waiting on thread dumps to confirm this issue is the same as https://tracker.ceph.com/issues/35985.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2019:0911
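For readers unfamiliar with this class of bug, the lock-ordering deadlock described in the Doc Text above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ceph-mgr code or the actual patch: `gil` is an ordinary `threading.Lock` standing in for the interpreter lock, `local_lock` stands in for a daemon-local mutex, and the names `run_pair`, `python_thread`, and `native_thread` are hypothetical. A barrier forces the racy interleaving on every run, and timeouts are used so the sketch reports the deadlock instead of hanging.

```python
import threading


def run_pair(release_gil_first: bool, timeout: float = 0.5) -> bool:
    """Return True if both threads finish; False if either gave up waiting."""
    gil = threading.Lock()         # stand-in for the interpreter lock, NOT the real GIL
    local_lock = threading.Lock()  # stand-in for a daemon-local mutex
    both_hold_one = threading.Barrier(2)  # forces the racy interleaving every run
    finished = []                  # one True/False entry per thread

    def python_thread():
        # Starts out holding the "GIL", then needs the local lock.
        gil.acquire()
        both_hold_one.wait()
        if release_gil_first:
            gil.release()          # fixed ordering: drop the "GIL" before blocking
        got = local_lock.acquire(timeout=timeout)
        if got:
            local_lock.release()
        if not release_gil_first:
            gil.release()          # old ordering: "GIL" was held the whole time
        finished.append(got)

    def native_thread():
        # Starts out holding the local lock, then needs the "GIL".
        local_lock.acquire()
        both_hold_one.wait()
        got = gil.acquire(timeout=timeout)
        if got:
            gil.release()
        local_lock.release()
        finished.append(got)

    threads = [threading.Thread(target=python_thread),
               threading.Thread(target=native_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(finished)


if __name__ == "__main__":
    print("old ordering completes:", run_pair(release_gil_first=False))
    print("new ordering completes:", run_pair(release_gil_first=True))
```

With the old ordering, each thread holds one lock while waiting for the other, so at least one of them times out and `run_pair` returns `False`. With the changed ordering, the first thread no longer holds the stand-in GIL while it blocks, so both threads complete. In a real daemon there is no timeout, which is why the threads hang indefinitely.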