Bug 1118574

Summary: mkdir on fuse mount failed with "Stale file handle" while adding bricks to volume.
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: protocol
Assignee: Ravishankar N <ravishankar>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Version: 3.5.1
CC: gluster-bugs, nsathyan, racpatel, rgowdapp, spandura
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.5.2
Doc Type: Bug Fix
Story Points: ---
Clone Of: 1116376
Last Closed: 2014-09-16 19:44:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Depends On: 1116376

Description Ravishankar N 2014-07-11 04:09:30 UTC
+++ This bug was initially created as a clone of Bug #1116376 +++

Description of problem:
-------------------------
mkdir on a few directories failed on a distribute-replicate volume while bricks were being added to the volume.

root@dj [Jul-04-2014-15:41:44] >mkdir -p A{101..200}/B{1..5}/C{1..10}
mkdir: cannot create directory `A105/B5/C6': Stale file handle
mkdir: cannot create directory `A105/B5/C9': Stale file handle

root@dj [Jul-04-2014-16:49:18] >ls -l A105/B5/C6
ls: cannot access A105/B5/C6: No such file or directory

root@dj [Jul-04-2014-16:49:21] >ls -l A105/B5/C9
ls: cannot access A105/B5/C9: No such file or directory
root@dj [Jul-04-2014-16:49:24] >

Version-Release number of selected component (if applicable):
=================================================================
glusterfs 3.6.0.22 built on Jun 23 2014 10:33:07

How reproducible:
=================
Seen for the first time; reporting it.

Steps to Reproduce:
====================
1. Create a distribute-replicate volume (2 x 2) and start it.

2. Create fuse/nfs mounts and start creating files/dirs on all of them.

3. From one fuse mount, execute: "mkdir -p A{101..200}/B{1..5}/C{1..10}". (No other mounts were creating any files/directories under these newly created dirs; they were doing I/O on other dirs/files.)

4. While the mkdir is in progress, add bricks to the volume (see the command sketch below).

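A minimal command sketch of the reproducer, assuming a hypothetical volume name "testvol", two servers (server1, server2) and brick paths under /bricks; none of these names come from the original setup:

# 1. Create and start a 2 x 2 distribute-replicate volume
gluster volume create testvol replica 2 \
    server1:/bricks/b1 server2:/bricks/b1 \
    server1:/bricks/b2 server2:/bricks/b2
gluster volume start testvol

# 2./3. Mount over FUSE and start the directory creation
mount -t glusterfs server1:/testvol /mnt/glusterfs
cd /mnt/glusterfs
mkdir -p A{101..200}/B{1..5}/C{1..10} &

# 4. While the mkdir loop is still running, expand the volume
gluster volume add-brick testvol server1:/bricks/b3 server2:/bricks/b3

On an affected client this is the point where the mkdir loop starts returning "Stale file handle".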

Actual results:
=================
mkdir failed with "Stale File Handle"

root@dj [Jul-04-2014-15:41:44] >mkdir -p A{101..200}/B{1..5}/C{1..10}
mkdir: cannot create directory `A105/B5/C6': Stale file handle
mkdir: cannot create directory `A105/B5/C9': Stale file handle


Expected results:
====================
mkdir shouldn't fail.


comment from Raghavendra G:
---------------------------
Until [1], the layouts of all directories created after an add-brick included the newly added brick. However, this behaviour also resulted in a bug [2]. Consider the following set of operations:

1. cd /mnt/glusterfs
2. mkdir -p 1/2/3/4/5/6
3. cd 1/2/3/4/5/6
4. add a new brick
5. mkdir -p 7/<a-brick-name-which-hashes-to-newly-added-brick>

Here mkdir of 7/<a-brick-name-which-hashes-to-newly-added-brick> fails with ENOENT. This is because there is no guarantee that /mnt/glusterfs/1/2/3/4/5/6/7 was created on the newly added brick - mkdir (7) on the newly added brick could have returned ENOENT because its parents were not present - and yet the layout of directory 7 used to include the newly added brick. Since mkdir (<a-brick-name-which-hashes-to-newly-added-brick>) fails on the hashed subvolume, the directory creation is aborted and ENOENT is returned to the application. This is the RCA of [2].

Now, with [1], the errno returned for the failure of mkdir (7) on the newly added brick is ESTALE, and dht does not consider the newly added brick part of the layout of directory 7; hence no failures are seen while creating children of 7, and [2] is fixed. However, [3] seems to have nullified the effect of [1] even on newer clients. If I revert [3], the issue seems fixed. Also, this bug is seen only on release-3.5 branches where [3] is merged, but not on master where [3] is not present.

[1] http://review.gluster.org/6318
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1006809
[3] http://review.gluster.org/#/c/8080/
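
To make the sequence above concrete, here is the same scenario as a command sketch, reusing the hypothetical "testvol"/server1/server2 names from the reproducer sketch earlier. Which directory name hashes to the newly added brick depends on the DHT hash of the name, so "newdir" below is only a stand-in that would have to be found by trial:

# On the client
cd /mnt/glusterfs
mkdir -p 1/2/3/4/5/6
cd 1/2/3/4/5/6

# Expand the volume with a new replica pair
gluster volume add-brick testvol server1:/bricks/b3 server2:/bricks/b3

# Create a child directory whose name hashes to the newly added brick
mkdir -p 7/newdir    # failed with ENOENT before [1] (bug [2])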

Comment 1 Anand Avati 2014-07-11 04:13:44 UTC
REVIEW: http://review.gluster.org/8294 (protocol/server: '/s/ESTALE/ENOENT' only in lookup path) posted (#1) for review on release-3.5 by Ravishankar N (ravishankar)

Comment 2 Anand Avati 2014-07-11 04:23:42 UTC
REVIEW: http://review.gluster.org/8294 (protocol/server: '/s/ESTALE/ENOENT' only in lookup path) posted (#2) for review on release-3.5 by Ravishankar N (ravishankar)

Comment 3 Anand Avati 2014-07-14 09:22:20 UTC
COMMIT: http://review.gluster.org/8294 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit 9d68bd17adf45b158ba8dd89f583805ae1a9e706
Author: Ravishankar N <ravishankar>
Date:   Wed Jul 9 23:19:06 2014 +0000

    protocol/server: '/s/ESTALE/ENOENT' only in lookup path
    
    Problem:
    [1] modified the server resolver code to send ENOENT instead of
    ESTALE to older clients for all FOPS. This caused dht_mkdir
    to fail under certain conditions (see bug description).
    
    Fix:
    Since [1] is needed by AFR only in its lookup path, reverted the changes
    introduced by [1] in resolve_entry_simple () and resolve_inode_simple ()
    and made the change instead in server_lookup_resume().
    
    [1] http://review.gluster.org/#/c/8080
    
    Change-Id: Idb2de25839fe712550486f2263a60c0531530d8f
    BUG: 1118574
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/8294
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Niels de Vos <ndevos>

Comment 4 Niels de Vos 2014-09-16 19:44:28 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report.

glusterfs-3.5.2 has been announced on the Gluster Users mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user