Bug 1266836

Summary: AFR : fuse,nfs mount hangs when directories with same names are created and deleted continuously
Product: [Community] GlusterFS
Component: protocol
Version: 3.7.5
Reporter: Ravishankar N <ravishankar>
Assignee: Ravishankar N <ravishankar>
CC: bugs, gluster-bugs
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.7.6
Doc Type: Bug Fix
Type: Bug
Clone Of: 1266834
Last Closed: 2015-11-17 05:58:48 UTC
Bug Depends On: 1266834
Bug Blocks: 986916, 1261706, 1275914, 1286582, 1338634, 1338668, 1338669

Description Ravishankar N 2015-09-28 07:22:53 UTC
+++ This bug was initially created as a clone of Bug #1266834 +++

spandura 2013-07-22 08:24:21 EDT

Description of problem:
=========================
In a distribute-replicate volume, when directories with the same names are created and deleted continuously on fuse and nfs mount points, the mount points hang after a certain time.

Refer to bug: 922792

Version-Release number of selected component (if applicable):
==============================================================
root@rhs-client11 [Jul-22-2013-16:00:29] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta5-2.el6rhs.x86_64

root@rhs-client11 [Jul-22-2013-16:00:40] >gluster --version
glusterfs 3.4.0.12rhs.beta5 built on Jul 18 2013 07:00:39

How reproducible:

test_bug_922792.sh
===================
#!/bin/bash
# Reproducer: continuously create and delete the same directory tree.
# $1 is an optional suffix for the top-level directory name; with no
# argument every instance of the script works on the same "foo" tree.

dir=$(dirname "$(readlink -f "$0")")
echo "Script in $dir"

while :
do
        mkdir -p foo$1/bar/gee
        mkdir -p foo$1/bar/gne
        mkdir -p foo$1/lna/gme
        rm -rf foo$1
done
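
The invocation below is only an illustrative sketch; the mount paths and the script location are assumptions, not taken from the report. Running the reproducer with no argument leaves $1 empty, so every client creates and removes the very same "foo" tree, which is exactly the contention this bug needs.

# Assumed driver: start one copy of the reproducer per mount point.
# With no argument, $1 expands to nothing and all clients race on "foo".
for m in /mnt/fuse1 /mnt/fuse2 /mnt/nfs1 /mnt/nfs2
do
        (cd "$m" && /root/test_bug_922792.sh) &
done
wait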

Steps to Reproduce:
===================
1. Create a 6 x 2 distribute-replicate volume across 4 storage nodes, with 3 bricks on each storage node.

2. Start the volume.

3. Create 2 fuse and 2 nfs mounts on each of the RHEL 5.9 and RHEL 6.4 clients.

4. From all the mount points, execute "test_bug_922792.sh".
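
The commands below are only a hedged sketch of steps 1-3; the hostnames, brick paths and mount points are assumptions. The brick order pairs bricks from different nodes into replica sets, giving 6 x 2 with 3 bricks per node, and the volume name "dis-rep" matches the one seen in the logs further down.

# Step 1: 6 x 2 distribute-replicate volume, 4 nodes, 3 bricks per node (assumed layout).
gluster volume create dis-rep replica 2 \
        node1:/bricks/b1 node2:/bricks/b1 node3:/bricks/b1 node4:/bricks/b1 \
        node1:/bricks/b2 node2:/bricks/b2 node3:/bricks/b2 node4:/bricks/b2 \
        node1:/bricks/b3 node2:/bricks/b3 node3:/bricks/b3 node4:/bricks/b3

# Step 2: start the volume.
gluster volume start dis-rep

# Step 3 (on each client): two fuse mounts and two nfs (NFSv3 via gNFS) mounts.
mount -t glusterfs node1:/dis-rep /mnt/fuse1
mount -t glusterfs node1:/dis-rep /mnt/fuse2
mount -t nfs -o vers=3 node1:/dis-rep /mnt/nfs1
mount -t nfs -o vers=3 node1:/dis-rep /mnt/nfs2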

Actual results:
===============
After some time, the fuse and nfs mounts hang.

Expected results:
================
Fuse and nfs mounts should not hang.

--- Additional comment from Krutika Dhananjay on 2014-07-28 04:53:50 EDT ---

Couple of observations:

I tried this on a 2x2 volume with 2 nfs and 2 fuse mounts, and the bug is 100% reproducible.

The hang stems from one of the clients failing to release an inodelk it had acquired earlier; the hung client remains blocked on that lock forever. This can be deduced from the presence of the following log messages:

[root@calvin glusterfs]# grep -T 'unlock' nfs.log mnt-glusterfs.log
nfs.log:[2014-07-20 04:34:47.247058] I [afr-lk-common.c:676:afr_unlock_inodelk_cbk] 0-dis-rep-replicate-0: /foo/lna/gme: unlock failed on subvolume dis-rep-client-0 with lock owner d8a0227c237f0000. Reason : Stale file handle
nfs.log:[2014-07-20 04:34:47.247390] I [afr-lk-common.c:676:afr_unlock_inodelk_cbk] 0-dis-rep-replicate-0: /foo/lna/gme: unlock failed on subvolume dis-rep-client-1 with lock owner d8a0227c237f0000. Reason : Stale file handle


The ESTALE error on unlock originates from the server-side resolver, which suggests that the inode had been unlinked and is no longer part of the inode table.

Furthermore, I added a GF_ASSERT in afr_unlock_inodelk_cbk() to deliberately crash the client with SIGABRT on unlock failure.
The core shows that (local->loc).gfid and (local->loc).inode->gfid differ, when they should ideally be one and the same.

All of this suggests that the hang is possibly caused by existing races between DHT's lookup self-heal, rmdir and mkdir codepaths.
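
One hedged way to observe this kind of gfid divergence from the servers' side is to compare the trusted.gfid xattr of the racing directory across the bricks while the reproducer runs. The node names and brick paths below are assumptions, and this inspects the on-disk gfids rather than the exact in-memory mismatch the assert caught; it only illustrates the divergence those races can leave behind.

# Assumed node names and brick paths; prints the on-disk gfid of the
# directory on every brick so mismatches between replicas stand out.
for node in node1 node2 node3 node4
do
        for brick in /bricks/b1 /bricks/b2 /bricks/b3
        do
                ssh "$node" "getfattr -n trusted.gfid -e hex $brick/foo/lna/gme 2>/dev/null" \
                        | sed "s|^|$node:$brick |"
        done
done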

--- Additional comment from Ravishankar N on 2015-09-28 03:17:28 EDT ---

review.gluster.org/#/c/12233/

Comment 1 Ravishankar N 2015-09-28 07:32:07 UTC
http://review.gluster.org/#/c/12234/

Comment 2 Vijay Bellur 2015-10-13 11:26:39 UTC
COMMIT: http://review.gluster.org/12234 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) 
------
commit 1a1b00fcd0ec199d19652d8fceb9465cc4edf189
Author: Ravishankar N <ravishankar>
Date:   Wed Sep 16 16:35:19 2015 +0530

    protocol/client: give preference to loc->gfid over inode->gfid
    
    Backport of review.gluster.org/#/c/12233/
    
    There are xlators which perform fops even before the inode gets linked. Because of this, loc.gfid is preferred at the time of inodelk/entrylk, but by the time the unlock happens the inode could be linked with a different gfid than the one in loc.gfid (because of the way dht was giving preference). Due to this, the unlock goes to a different inode than the one the inodelk was sent on, which leads to a hang.
    
    Credits to Pranith for the fix.
    
    Change-Id: I7d162d44852ba876f35aa1bb83e4afdb184d85b9
    BUG: 1266836
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/12234
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

Comment 3 Raghavendra Talur 2015-11-17 05:58:48 UTC
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.6, please open a new bug report.

glusterfs-3.7.6 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-users/2015-November/024359.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user