Bug 1141733

Summary: data loss when rebalance + renames are in progress and bricks from replica pairs go down and come back
Product: [Community] GlusterFS
Component: replicate
Version: 3.5.3
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: urgent
Reporter: Pranith Kumar K <pkarampu>
Assignee: Pranith Kumar K <pkarampu>
CC: bugs, nsathyan, shmohan, ssamanta, vbellur
Fixed In Version: glusterfs-3.5.3
Doc Type: Bug Fix
Type: Bug
Clone Of: 1141539
Last Closed: 2014-11-21 16:02:51 UTC
Bug Depends On: 1140643, 1141539, 1142020, 1144450, 1151308
Bug Blocks: 1125231

Comment 1 Anand Avati 2014-09-15 10:57:21 UTC
REVIEW: http://review.gluster.org/8739 (cluster/afr: Handle EAGAIN properly in inodelk) posted (#1) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)

Comment 2 Anand Avati 2014-09-20 16:09:29 UTC
REVIEW: http://review.gluster.org/8739 (cluster/afr: Handle EAGAIN properly in inodelk) posted (#2) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)

Comment 3 Anand Avati 2014-09-29 07:00:35 UTC
COMMIT: http://review.gluster.org/8739 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit f67921dec0ab5db6c7c0bc1b00459dbcfa1d568d
Author: Pranith Kumar K <pkarampu>
Date:   Mon Sep 15 14:22:44 2014 +0530

    cluster/afr: Handle EAGAIN properly in inodelk
    
    Problem:
    When one of the bricks in a replica pair is taken down and brought back up,
    locks on that brick will be allowed. AFR returns inodelk success even when
    one of the bricks already holds the lock.
    
    Fix:
    If any brick returns EAGAIN, return failure to the parent xlator.
    
    Note: This change only works for non-blocking inodelks. This patch addresses
    DHT synchronization, which uses non-blocking locks for rename. A blocking
    lock is issued by only one of the rebalance processes, so for now there is
    no possibility of deadlock.
    
    Change-Id: I07673f8873263da334e03f35c6cdb5db9410a616
    BUG: 1141733
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8739
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Niels de Vos <ndevos>
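
As an illustration of the fix described in the commit message above, here is a minimal, self-contained C sketch of the reply-aggregation rule: collect the per-brick inodelk results and fail the whole operation toward the parent xlator if any brick returned EAGAIN. This is not the actual AFR patch; the names lock_reply and aggregate_inodelk_replies, and the structures around them, are hypothetical simplifications.

/*
 * Illustrative sketch only -- NOT the actual AFR patch. All names
 * (lock_reply, aggregate_inodelk_replies) are hypothetical. It models
 * the rule from the commit message: if ANY brick in the replica set
 * answers a non-blocking inodelk with EAGAIN, the lock attempt as a
 * whole must fail toward the parent xlator, instead of being reported
 * as success because some other brick granted it.
 */
#include <errno.h>
#include <stdio.h>

/* One reply per brick in the replica set. */
struct lock_reply {
    int op_ret;   /* 0 on success, -1 on failure   */
    int op_errno; /* errno value when op_ret == -1 */
};

/* Returns 0 if the lock may be treated as acquired, -errno otherwise. */
static int
aggregate_inodelk_replies(const struct lock_reply *replies, int n)
{
    int granted = 0;

    for (int i = 0; i < n; i++) {
        if (replies[i].op_ret == 0) {
            granted++;
        } else if (replies[i].op_errno == EAGAIN) {
            /* The fix: a brick that was just brought back up already
             * holds a conflicting lock, so propagate failure rather
             * than claiming overall success. */
            return -EAGAIN;
        }
    }
    /* No brick reported a conflict; succeed if at least one granted. */
    return granted > 0 ? 0 : -ENOTCONN;
}

int
main(void)
{
    /* Brick 0 grants the lock; brick 1 (just brought back up) has a
     * pre-existing lock and answers EAGAIN. */
    struct lock_reply replies[2] = {
        { .op_ret = 0,  .op_errno = 0 },
        { .op_ret = -1, .op_errno = EAGAIN },
    };

    int ret = aggregate_inodelk_replies(replies, 2);
    printf("inodelk result: %d (%s)\n", ret,
           ret == 0 ? "granted" : "failed");
    return 0;
}

Under the pre-fix behaviour, the single successful reply from brick 0 would have been reported as success, letting two rename operations each believe they held the lock on different bricks; returning -EAGAIN instead makes the DHT rename back off, which is what prevents the data loss described in the summary.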

Comment 4 Niels de Vos 2014-10-05 13:00:14 UTC
The first (and last?) Beta for GlusterFS 3.5.3 has been released [1]. Please verify whether this release resolves the bug for you. If the glusterfs-3.5.3beta1 release does not resolve this issue, leave a comment on this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-October/018990.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 5 Niels de Vos 2014-11-05 09:24:52 UTC
The second Beta for GlusterFS 3.5.3 has been released [1]. Please verify whether this release resolves the bug for you. If the glusterfs-3.5.3beta2 release does not resolve this issue, leave a comment on this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions have been made available on [2] to make testing easier.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019359.html
[2] http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.5.3beta2/

Comment 6 Niels de Vos 2014-11-21 16:02:51 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.5.3, please reopen this bug report.

glusterfs-3.5.3 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/announce/2014-November/000042.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/