Bug 1493415 - self-heal daemon stuck
Summary: self-heal daemon stuck
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1492782 1499202
Reported: 2017-09-20 07:18 UTC by Ravishankar N
Modified: 2017-12-08 17:41 UTC (History)

Fixed In Version: glusterfs-3.13.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1499202
Environment:
Last Closed: 2017-12-08 17:41:21 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Ravishankar N 2017-09-20 07:18:27 UTC
Problem:
    If a brick crashes after an entry (file or dir) is created but before
    gfid is assigned, the good bricks will have pending entry heal xattrs
    but the heal won't complete because afr_selfheal_recreate_entry() tries
    to create the entry again and it fails with EEXIST.
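
    The EEXIST failure can be reproduced outside GlusterFS. The sketch below
    is not AFR code: it treats a temporary directory as a brick, and
    recreate_entry() is a hypothetical stand-in for a heal step that blindly
    re-creates the entry. It only shows that creating an entry which is
    already on disk fails with EEXIST, which is why a heal that retries the
    create would never complete.

```python
import errno
import os
import tempfile


def recreate_entry(brick_dir, name):
    """Hypothetical heal step: try to create the entry again.

    Returns 0 on success, or the errno of the failure.
    """
    try:
        os.mkdir(os.path.join(brick_dir, name))
        return 0
    except OSError as exc:
        return exc.errno


with tempfile.TemporaryDirectory() as brick:
    # The entry reached the brick, then the brick "crashed" before any
    # gfid was assigned. The directory itself is still on disk.
    os.mkdir(os.path.join(brick, "dir1"))

    # Re-creating the existing entry fails; a truly missing entry succeeds.
    existing = recreate_entry(brick, "dir1")
    missing = recreate_entry(brick, "dir2")
```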

Comment 1 Worker Ant 2017-09-20 07:19:23 UTC
REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#1) for review on master by Ravishankar N (ravishankar)

Comment 2 Worker Ant 2017-09-20 11:30:37 UTC
REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#2) for review on master by Ravishankar N (ravishankar)

Comment 3 Worker Ant 2017-09-24 11:21:26 UTC
REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#3) for review on master by Ravishankar N (ravishankar)

Comment 4 Worker Ant 2017-10-03 07:21:54 UTC
REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#4) for review on master by Ravishankar N (ravishankar)

Comment 5 Worker Ant 2017-10-05 10:19:22 UTC
REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#5) for review on master by Ravishankar N (ravishankar)

Comment 6 Worker Ant 2017-10-06 07:17:12 UTC
REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#6) for review on master by Ravishankar N (ravishankar)

Comment 7 Worker Ant 2017-10-09 06:23:12 UTC
COMMIT: https://review.gluster.org/18326 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265
Author: Ravishankar N <ravishankar>
Date:   Wed Sep 20 12:16:06 2017 +0530

    afr: heal gfid as a part of entry heal
    
    Problem:
    If a brick crashes after an entry (file or dir) is created but before
    gfid is assigned, the good bricks will have pending entry heal xattrs
    but the heal won't complete because afr_selfheal_recreate_entry() tries
    to create the entry again and it fails with EEXIST.
    
    Fix:
    We could have fixed posix_mknod/mkdir etc. to assign the gfid if the file
    already exists but the right thing to do seems to be to trigger a lookup
    on the bad brick and let it heal the gfid instead of winding an
    mknod/mkdir in the first place.
    
    Change-Id: I82f76665a7541f1893ef8d847b78af6466aff1ff
    BUG: 1493415
    Signed-off-by: Ravishankar N <ravishankar>
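
The approach described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual patch: Brick, lookup_and_heal_gfid() and heal_entry() are hypothetical names, bricks are modelled as in-memory dictionaries, and falling back to a create when the lookup finds nothing is an assumption about the behaviour rather than something the commit message states.

```python
import uuid


class Brick:
    """Toy brick: maps entry name -> gfid (None if never assigned)."""

    def __init__(self):
        self.entries = {}

    def mknod(self, name, gfid):
        # Like a real create, this refuses to overwrite an existing entry.
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = gfid

    def lookup_and_heal_gfid(self, name, gfid):
        """Return True if the entry exists, assigning the gfid if it is missing."""
        if name not in self.entries:
            return False
        if self.entries[name] is None:
            self.entries[name] = gfid
        return True


def heal_entry(good, bad, name):
    """Look up on the bad brick first; create only if the entry is absent (assumed)."""
    gfid = good.entries[name]
    if not bad.lookup_and_heal_gfid(name, gfid):
        bad.mknod(name, gfid)


good, bad = Brick(), Brick()
gfid = uuid.uuid4()
good.entries["file1"] = gfid
bad.entries["file1"] = None            # created, but crashed before gfid was set
good.entries["file2"] = uuid.uuid4()   # entry entirely missing on the bad brick

heal_entry(good, bad, "file1")
heal_entry(good, bad, "file2")
```

In this model the half-created entry gets its gfid through the lookup path, so the create that would have failed with EEXIST is never attempted for it.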

Comment 8 Shyamsundar 2017-12-08 17:41:21 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/

