Bug 1049932
Summary: | AFR : self-heal of few files not happening when a AWS EC2 Instance is back online after a restart | |||
---|---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Pranith Kumar K <pkarampu> | |
Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | mainline | CC: | bugs, gluster-bugs, grajaiya, pkarampu, ravishankar, shaines, spandura, vagarwal, vbellur | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | glusterfs-3.6.0beta1 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | 1044923 | |||
: | 1113894 (view as bug list) | Environment: | ||
Last Closed: | 2014-11-11 08:26:45 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1044923 | |||
Bug Blocks: | 1113894 |
Description
Pranith Kumar K
2014-01-08 14:07:18 UTC
REVIEW: http://review.gluster.org/6669 (protocol/client: conn-id should be unique when lk-heal is off) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu) COMMIT: http://review.gluster.org/6669 committed in master by Vijay Bellur (vbellur) ------ commit 4a14159e82d7b736dec686a170b06e961d7aff53 Author: Pranith Kumar K <pkarampu> Date: Wed Jan 8 17:01:44 2014 +0530 protocol/client: conn-id should be unique when lk-heal is off Problem: It was observed that in some cases client disconnects and re-connects before server xlator could detect that a disconnect happened. So it still uses previous fdtable and ltable. But it can so happen that in between disconnect and re-connect an 'unlock' fop may fail because the fds are marked 'bad' in client xlator upon disconnect. Due to this stale locks remain on the brick which lead to hangs/self-heals not happening etc. For the exact bug RCA please look at https://bugzilla.redhat.com/show_bug.cgi?id=1049932#c0 Fix: When lk-heal is not enabled make sure connection-id is different for every setvolume. This will make sure that a previous connection's resources are not re-used in server xlator. Change-Id: Id844aaa76dfcf2740db72533bca53c23b2fe5549 BUG: 1049932 Signed-off-by: Pranith Kumar K <pkarampu> Reviewed-on: http://review.gluster.org/6669 Tested-by: Gluster Build System <jenkins.com> Reviewed-by: Krishnan Parthasarathi <kparthas> Reviewed-by: Vijay Bellur <vbellur> A beta release for GlusterFS 3.6.0 has been released. Please verify if the release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution. [1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html [2] http://supercolony.gluster.org/pipermail/gluster-users/ This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report. glusterfs-3.6.1 has been announced [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html [2] http://supercolony.gluster.org/mailman/listinfo/gluster-users |