Bug 1114033 - DHT :- file creation failed with 'Stale file handle' on nfs mount (all sub-volumes were up, parent Directory was not created on all sub-volumes)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Satish Mohan
QA Contact:
URL:
Whiteboard: triaged h
Depends On: 1278399
Blocks:
Reported: 2014-06-27 13:52 UTC by Rachana Patel
Modified: 2017-01-02 15:19 UTC (History)

Fixed In Version: 3.7.9-10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-02 15:19:56 UTC
Embargoed:



Description Rachana Patel 2014-06-27 13:52:29 UTC
Description of problem:
=======================
DHT :- file creation failed with 'Stale file handle' on NFS mount (all sub-volumes were up, parent directory was not created on all sub-volumes)

--->  The trusted.glusterfs.dht xattr had not been created for the directory on any sub-volume when file creation inside that directory was initiated (found while trying to verify Bug 1030309).

Version-Release number of selected component (if applicable):
=============================================================
3.6.0.19-1.el6rhs.x86_64

How reproducible:
=================
intermittent

Steps to Reproduce:
===================
1. Create a distributed volume (3 bricks) and mount it on multiple clients (NFS & FUSE).
[root@OVM3 nfs]# gluster v status snap
Status of volume: snap
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.35.198:/brick2/b1				49157	Y	13611
Brick 10.70.35.198:/brick2/b2				49158	Y	13345
Brick 10.70.35.198:/brick2/b3				49159	Y	13356
NFS Server on localhost					2049	Y	13623
NFS Server on 10.70.35.240				2049	Y	19185
NFS Server on 10.70.35.172				2049	Y	13465
 
Task Status of Volume snap
------------------------------------------------------------------------------
There are no active volume tasks


[To reproduce the race, a breakpoint was set.]

2. From one client (FUSE mount), start creating a directory:
mkdir dir2

3. From another client (NFS mount), create a file inside that directory:

[root@OVM3 nfs]# touch dir2/f1
touch: cannot touch `dir2/f1': Stale file handle


Actual results:
===============
File creation failed with 'Stale file handle'.


Expected results:
=================
Lookup should heal the parent directory on all sub-volumes that are up. File creation should not fail with 'Stale file handle'; the file should be created.
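
The expected behaviour can be illustrated with a minimal Python sketch. This is not GlusterFS code and not the actual fix; the names pick_subvol and heal_layout, the sub-volume names, and the even split of the hash space are assumptions made for illustration only. The idea shown is that when the parent directory's layout has no range covering the file's hash, the lookup first assigns ranges across all sub-volumes that are up and then retries, instead of returning an error.

```python
from typing import Dict, List, Optional, Tuple

HASH_SPACE = 1 << 32  # DHT hash values are 32-bit

# Hypothetical representation: sub-volume name -> inclusive (start, stop) range.
Layout = Dict[str, Tuple[int, int]]


def heal_layout(up_subvols: List[str]) -> Layout:
    """Assign an even share of the hash space to every up sub-volume (assumed policy)."""
    if not up_subvols:
        raise ValueError("no sub-volume is up")
    chunk = HASH_SPACE // len(up_subvols)
    layout: Layout = {}
    start = 0
    for index, name in enumerate(up_subvols):
        last = index == len(up_subvols) - 1
        # The last sub-volume absorbs the remainder of the hash space.
        stop = HASH_SPACE - 1 if last else start + chunk - 1
        layout[name] = (start, stop)
        start = stop + 1
    return layout


def find_owner(layout: Layout, hash_value: int) -> Optional[str]:
    """Return the sub-volume whose range contains hash_value, if any."""
    for name, (start, stop) in layout.items():
        if start <= hash_value <= stop:
            return name
    return None


def pick_subvol(layout: Layout, up_subvols: List[str], hash_value: int) -> str:
    """Find the target sub-volume, healing the parent layout first if needed."""
    owner = find_owner(layout, hash_value)
    if owner is None:
        # Incomplete layout: heal it and retry rather than fail the create.
        layout.update(heal_layout(up_subvols))
        owner = find_owner(layout, hash_value)
    assert owner is not None
    return owner


subvols = ["subvol-a", "subvol-b", "subvol-c"]  # hypothetical names
# Directory exists but no layout has been written yet (empty layout).
print(pick_subvol({}, subvols, 3551819610))
```

With three sub-volumes and the hash value from the log below (3551819610), the sketch places the file on the third sub-volume after healing.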


Additional info:
===============
Log snippet :-

[2014-06-27 07:41:06.637203] I [dht-layout.c:663:dht_layout_normalize] 0-snap-dht: Found anomalies in <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8> (gfid = c0a48017-ec23-4c93-b6bd-311a8a814ae8). Holes=1 overlaps=0
[2014-06-27 07:41:06.637830] E [dht-helper.c:813:dht_migration_complete_check_task] 0-snap-dht: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8>: failed to lookup the file on snap-client-0
[2014-06-27 07:41:06.637868] W [nfs3.c:1532:nfs3svc_access_cbk] 0-nfs: 265db38b: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8> => -1 (Stale file handle)
[2014-06-27 07:41:06.637889] W [nfs3-helpers.c:3401:nfs3_log_common_res] 0-nfs-nfsv3: XID: 265db38b, ACCESS: NFS: 70(Invalid file handle), POSIX: 116(Stale file handle)
[2014-06-27 07:41:06.638577] W [client-rpc-fops.c:1354:client3_3_access_cbk] 0-snap-client-0: remote operation failed: Stale file handle
[2014-06-27 07:41:06.640590] I [dht-layout.c:663:dht_layout_normalize] 0-snap-dht: Found anomalies in <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8> (gfid = c0a48017-ec23-4c93-b6bd-311a8a814ae8). Holes=1 overlaps=0
[2014-06-27 07:41:06.642105] W [dht-layout.c:180:dht_layout_search] 0-snap-dht: no subvolume for hash (value) = 3551819610
[2014-06-27 07:41:06.642969] W [nfs3.c:1230:nfs3svc_lookup_cbk] 0-nfs: 285db38b: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8>/f1 => -1 (Stale file handle)
[2014-06-27 07:41:06.643011] W [nfs3-helpers.c:3470:nfs3_log_newfh_res] 0-nfs-nfsv3: XID: 285db38b, LOOKUP: NFS: 70(Invalid file handle), POSIX: 116(Stale file handle), FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000
[2014-06-27 07:41:06.644619] W [dht-layout.c:180:dht_layout_search] 0-snap-dht: no subvolume for hash (value) = 3551819610
[2014-06-27 07:41:06.645413] W [nfs3.c:1230:nfs3svc_lookup_cbk] 0-nfs: 2b5db38b: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8>/f1 => -1 (Stale file handle)
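
To make the "Holes=1 overlaps=0" and "no subvolume for hash" messages above easier to read, here is a small Python sketch of how a directory layout with a gap leaves some hash values without an owner. It is an illustration only, not the GlusterFS implementation: the sub-volume names, the ranges, and the function names count_anomalies and search are hypothetical, and the report does not say which sub-volume was actually missing its range.

```python
from typing import Dict, Optional, Tuple

HASH_MAX = 0xFFFFFFFF  # DHT hash values are 32-bit

# Hypothetical layout: sub-volume name -> inclusive (start, stop) hash range.
# A sub-volume with no layout written yet simply has no entry.
Layout = Dict[str, Tuple[int, int]]


def count_anomalies(layout: Layout) -> Tuple[int, int]:
    """Return (holes, overlaps) found while walking the full hash space."""
    holes = 0
    overlaps = 0
    expected = 0  # next hash value that still needs an owner
    for start, stop in sorted(layout.values()):
        if start > expected:
            holes += 1      # gap before this range
        elif start < expected:
            overlaps += 1   # range begins inside a previous one
        expected = max(expected, stop + 1)
    if expected <= HASH_MAX:
        holes += 1          # gap at the end of the hash space
    return holes, overlaps


def search(layout: Layout, hash_value: int) -> Optional[str]:
    """Return the sub-volume that owns hash_value, or None if it falls in a hole."""
    for name, (start, stop) in layout.items():
        if start <= hash_value <= stop:
            return name
    return None


# Assumed state during the race: only two of three sub-volumes have a range.
partial = {
    "subvol-a": (0x00000000, 0x55555554),
    "subvol-b": (0x55555555, 0xAAAAAAA9),
}

print(count_anomalies(partial))     # (1, 0)
print(search(partial, 3551819610))  # None
```

In this assumed layout the hash value from the log falls in the uncovered range, so the search returns no sub-volume, which corresponds to the lookup of dir2/f1 failing.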

Comment 4 Raghavendra G 2015-12-30 05:06:24 UTC
Similar to bz 1278399.

Fixed by:
https://code.engineering.redhat.com/gerrit/#/c/61036/

Fixed in 3.1.2

Comment 6 Prasad Desala 2016-08-08 05:53:44 UTC
This issue is not seen on gluster build: 3.7.9-10.el7rhgs.x86_64.

The following steps were used:

1. Created a 4x2 distributed-replicate volume and mounted it on multiple clients (NFS & FUSE).
2. To reproduce the race, set a breakpoint at dht_mkdir_hashed_cbk on the FUSE client and started creating a directory:
mkdir test_fuse
3. From the NFS mount, created a file inside the directory "test_fuse":
touch test_fuse/f1

The file was created from the NFS mount without any errors.

Hence, marking this bug as Verified.

