Bug 1112348 - [AFR] I/O fails when one of the replica nodes goes down
Summary: [AFR] I/O fails when one of the replica nodes goes down
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.5.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On: 1066389 1106408
Blocks: glusterfs-3.5.2
 
Reported: 2014-06-23 16:31 UTC by Pranith Kumar K
Modified: 2014-07-31 11:43 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.5.2beta1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1106408
Environment:
Last Closed: 2014-07-31 11:43:14 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Anand Avati 2014-06-23 16:33:15 UTC
REVIEW: http://review.gluster.org/8154 (cluster/afr: Fix resolution issues with afr) posted (#1) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)

Comment 2 Anand Avati 2014-06-24 16:26:22 UTC
COMMIT: http://review.gluster.org/8154 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit 5d2603c75bfd78c8b33903a20e844430276e7539
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jun 23 21:35:29 2014 +0530

    cluster/afr: Fix resolution issues with afr
    
    Problem with afr:
    Let's say there is a directory hierarchy a/b/c/d on the mount and the
    user is cd'ed into 'd'. Bring down one of the bricks of the replica,
    remove all directories/files on it to simulate a disk replacement, and
    bring the brick back up. Creates in the cd'ed directory now fail with
    ESTALE. Before sending a create of 'f' inside 'd', fuse sends a lookup
    to make sure the file is not already present. On one of the bricks 'd'
    is present and 'f' is not, so it responds with ENOENT. On the new
    brick 'd' itself is not present, so it responds with ESTALE. Afr
    considers ESTALE a special errno on witnessing which the lookup has to
    fail, and gives ESTALE higher priority than ENOENT. For these reasons
    the lookup fails with ESTALE rather than ENOENT. Since the lookup
    didn't fail with ENOENT, the 'create' can't be issued, so the command
    fails with ESTALE.
    
    Solution:
    Afr needs to treat ESTALE as a normal errno and give ENOENT the higher
    priority, so that operations like create can proceed even when only
    one of the bricks is up and running. Whenever the client xlator
    identifies that the gfid has changed, it sets that information in the
    lookup xdata. Afr uses this information to fail the lookup with ESTALE
    so that the top xlator can send a fresh lookup.
    
    Change-Id: Ie8e0e327542fd644409eb5dadf451679afa1c0e5
    BUG: 1112348
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8154
    Tested-by: Justin Clift <justin>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Niels de Vos <ndevos>
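
To make the errno-priority change described in the commit message concrete, here is a minimal, hypothetical C sketch of the before/after selection logic. The function names, signatures, and structure are invented purely for illustration; the real change lives in afr's lookup path inside GlusterFS, not in these helpers.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Buggy behavior: ESTALE from any brick overrides every other reply,
 * so one freshly wiped brick makes the whole lookup fail with ESTALE. */
static int pick_errno_old(const int *errnos, int nbricks) {
    int op_errno = 0;
    for (int i = 0; i < nbricks; i++) {
        if (errnos[i] == ESTALE)
            return ESTALE;          /* special-cased, highest priority */
        if (errnos[i] != 0)
            op_errno = errnos[i];
    }
    return op_errno;
}

/* Fixed behavior: ENOENT takes priority so create/mknod can proceed;
 * ESTALE is only forced when the client xlator flagged a gfid change
 * in the lookup xdata, prompting the top xlator to retry the lookup. */
static int pick_errno_new(const int *errnos, int nbricks, bool gfid_changed) {
    if (gfid_changed)
        return ESTALE;              /* force a fresh lookup from the top */
    int op_errno = 0;
    for (int i = 0; i < nbricks; i++) {
        if (errnos[i] == ENOENT)
            return ENOENT;          /* lets the pending create be issued */
        if (errnos[i] != 0)
            op_errno = errnos[i];
    }
    return op_errno;
}

int main(void) {
    /* Replica pair: healthy brick says ENOENT, wiped brick says ESTALE. */
    int replies[2] = { ENOENT, ESTALE };

    printf("old: %d (ESTALE=%d)\n", pick_errno_old(replies, 2), ESTALE);
    printf("new: %d (ENOENT=%d)\n", pick_errno_new(replies, 2, false), ENOENT);
    return 0;
}

With the replica replies in main(), the old logic reports ESTALE and the create is blocked; the new logic reports ENOENT, letting the create proceed, unless a gfid change was flagged in the lookup xdata.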

Comment 3 Niels de Vos 2014-07-21 15:41:43 UTC
The first (and last?) beta for GlusterFS 3.5.2 has been released [1]. Please verify whether this release resolves the issue reported here. If the glusterfs-3.5.2beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on your distribution's update infrastructure (possibly an "updates-testing" repository).

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-July/041636.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 4 Niels de Vos 2014-07-31 11:43:14 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report.

glusterfs-3.5.2 has been announced on the Gluster Users mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

