Bug 1398566
| Summary: | self-heal info command hangs after triggering self-heal | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Krutika Dhananjay <kdhananj> | |
| Component: | replicate | Assignee: | Krutika Dhananjay <kdhananj> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | mainline | CC: | amukherj, bugs, rhs-bugs, sasundar, storage-qa-internal | |
| Target Milestone: | --- | Keywords: | Triaged | |
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.10.0 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 1396166 | |||
| : | 1398888 (view as bug list) | Environment: |
RHV-RHGS HCI
|
|
| Last Closed: | 2017-03-06 17:36:17 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1396166 | |||
| Bug Blocks: | 1398888 | |||
|
Description
Krutika Dhananjay
2016-11-25 09:55:28 UTC
REVIEW: http://review.gluster.org/15929 (cluster/afr: Fix deadlock due to compound fops) posted (#1) for review on master by Krutika Dhananjay (kdhananj) COMMIT: http://review.gluster.org/15929 committed in master by Pranith Kumar Karampuri (pkarampu) ------ commit 2fe8ba52108e94268bc816ba79074a96c4538271 Author: Krutika Dhananjay <kdhananj> Date: Fri Nov 25 15:54:30 2016 +0530 cluster/afr: Fix deadlock due to compound fops When an afr data transaction is eligible for using eager-lock, this information is represented in local->transaction.eager_lock_on. However, if non-blocking inodelk attempt (which is a full lock) fails, AFR falls back to blocking locks which are range locks. At this point, local->transaction.eager_lock[] per brick is reset but local->transaction.eager_lock_on is still true. When AFR decides to compound post-op and unlock, it is after confirming that the transaction did not use eager lock (well, except for a small bug where local->transaction.locks_acquired[] is not considered). But within afr_post_op_unlock_do(), afr again incorrectly sets the lock range to full-lock based on local->transaction.eager_lock_on value. This is a bug and can lead to deadlock since the locks acquired were range locks and a full unlock is being sent leading to unlock failure and thereby every other lock request (be it from SHD or other clients or glfsheal) getting blocked forever and the user perceives a hang. FIX: Unconditionally rely on the range locks in inodelk object for unlocking when using compounded post-op + unlock. Big thanks to Pranith for helping with the debugging. Change-Id: Idb4938f90397fb4bd90921f9ae6ea582042e5c67 BUG: 1398566 Signed-off-by: Krutika Dhananjay <kdhananj> Reviewed-on: http://review.gluster.org/15929 Smoke: Gluster Build System <jenkins.org> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu> This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report. glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html [2] https://www.gluster.org/pipermail/gluster-users/ |