Bug 1499202

Summary: self-heal daemon stuck
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: replicate
Assignee: bugs <bugs>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.12
CC: bugs
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-glusterfs-3.12.2
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1493415
Environment:
Last Closed: 2017-10-13 12:47:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1493415
Bug Blocks: 1491574, 1492782

Description Ravishankar N 2017-10-06 11:21:49 UTC
+++ This bug was initially created as a clone of Bug #1493415 +++

Problem:
    If a brick crashes after an entry (file or dir) is created but before
    gfid is assigned, the good bricks will have pending entry heal xattrs
    but the heal won't complete because afr_selfheal_recreate_entry() tries
    to create the entry again and it fails with EEXIST.
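
The EEXIST failure described above can be reproduced outside GlusterFS with plain POSIX calls. The sketch below is only an illustration: `recreate_entry` and `demo` are hypothetical names, an ordinary temporary directory stands in for a brick, and no gfid xattrs are involved.

```python
import errno
import os
import tempfile


def recreate_entry(brick_dir, name):
    """Create directory `name` under brick_dir; return 0 or the errno on failure."""
    try:
        os.mkdir(os.path.join(brick_dir, name))
    except OSError as exc:
        return exc.errno
    return 0


def demo():
    # A temporary directory plays the role of the brick (assumption for illustration).
    with tempfile.TemporaryDirectory() as brick:
        first = recreate_entry(brick, "dir1")   # entry does not exist yet
        second = recreate_entry(brick, "dir1")  # entry is already on disk
        return first, second
```

Calling `demo()` returns the result of both attempts; the second one reports `errno.EEXIST`, which is the same error a heal hits when it blindly re-creates an entry that the crashed brick had already written to disk.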

--- Additional comment from Worker Ant on 2017-09-20 03:19:23 EDT ---

REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#1) for review on master by Ravishankar N (ravishankar@redhat.com)

--- Additional comment from Worker Ant on 2017-09-20 07:30:37 EDT ---

REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#2) for review on master by Ravishankar N (ravishankar@redhat.com)

--- Additional comment from Worker Ant on 2017-09-24 07:21:26 EDT ---

REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#3) for review on master by Ravishankar N (ravishankar@redhat.com)

--- Additional comment from Worker Ant on 2017-10-03 03:21:54 EDT ---

REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#4) for review on master by Ravishankar N (ravishankar@redhat.com)

--- Additional comment from Worker Ant on 2017-10-05 06:19:22 EDT ---

REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#5) for review on master by Ravishankar N (ravishankar@redhat.com)

--- Additional comment from Worker Ant on 2017-10-06 03:17:12 EDT ---

REVIEW: https://review.gluster.org/18326 (afr: heal gfid as a part of entry heal) posted (#6) for review on master by Ravishankar N (ravishankar@redhat.com)

Comment 1 Worker Ant 2017-10-09 06:52:04 UTC
REVIEW: https://review.gluster.org/18449 (afr: heal gfid as a part of entry heal) posted (#1) for review on release-3.12 by Ravishankar N (ravishankar@redhat.com)

Comment 2 Worker Ant 2017-10-10 05:33:19 UTC
COMMIT: https://review.gluster.org/18449 committed in release-3.12 by jiffin tony Thottan (jthottan@redhat.com) 
------
commit f73814ad08d552d94d0139b2592175d206e7a166
Author: Ravishankar N <ravishankar@redhat.com>
Date:   Wed Sep 20 12:16:06 2017 +0530

    afr: heal gfid as a part of entry heal
    
    Problem:
    If a brick crashes after an entry (file or dir) is created but before
    gfid is assigned, the good bricks will have pending entry heal xattrs
    but the heal won't complete because afr_selfheal_recreate_entry() tries
    to create the entry again and it fails with EEXIST.
    
    Fix:
    We could have fixed posix_mknod/mkdir etc. to assign the gfid if the file
    already exists but the right thing to do seems to be to trigger a lookup
    on the bad brick and let it heal the gfid instead of winding an
    mknod/mkdir in the first place.
    
    (cherry picked from commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265)
    Change-Id: I82f76665a7541f1893ef8d847b78af6466aff1ff
    BUG: 1499202
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
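
The difference between the two heal strategies in the commit message can be sketched with a toy model. This is not the AFR code: `Brick`, `heal_by_recreate` and `heal_by_lookup` are hypothetical names, a brick is reduced to a dict of entry name to gfid, and a gfid of `None` stands for "entry created, but the brick crashed before the gfid was assigned". Conflicting gfids are a separate case and are not modeled.

```python
import errno
import uuid


class Brick:
    """Toy brick: maps entry name -> gfid (None means the gfid was never set)."""

    def __init__(self):
        self.entries = {}

    def mknod(self, name, gfid):
        # Like the real syscall, creation fails if the name already exists.
        if name in self.entries:
            raise OSError(errno.EEXIST, "entry exists", name)
        self.entries[name] = gfid

    def lookup(self, name):
        """Return (found, gfid) for the entry."""
        if name not in self.entries:
            return False, None
        return True, self.entries[name]

    def set_gfid(self, name, gfid):
        self.entries[name] = gfid


def heal_by_recreate(good, bad, name):
    """Old behaviour (sketch): always wind a create on the bad brick."""
    bad.mknod(name, good.entries[name])


def heal_by_lookup(good, bad, name):
    """New behaviour (sketch): look up first, heal the gfid if the entry exists."""
    gfid = good.entries[name]
    found, bad_gfid = bad.lookup(name)
    if not found:
        bad.mknod(name, gfid)      # entry truly missing: create it
    elif bad_gfid is None:
        bad.set_gfid(name, gfid)   # entry present without gfid: heal the gfid
    return bad.entries[name]
```

With an entry that exists on the bad brick without a gfid, `heal_by_recreate` raises `OSError` with `errno.EEXIST` on every attempt, so the pending heal never clears, while `heal_by_lookup` copies the gfid from the good brick and completes.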

Comment 3 Jiffin 2017-10-13 12:47:15 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-glusterfs-3.12.2, please open a new bug report.

glusterfs-glusterfs-3.12.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-October/032684.html
[2] https://www.gluster.org/pipermail/gluster-users/