Description of problem:
=======================
DHT: file creation failed with 'Stale file handle' on an NFS mount (all sub-volumes were up, but the parent directory was not created on all sub-volumes).
---> The trusted.glusterfs.dht xattr had not been created for the directory on any sub-volume when file creation inside that directory was initiated.
(Found while trying to verify Bug 1030309.)

Version-Release number of selected component (if applicable):
=============================================================
3.6.0.19-1.el6rhs.x86_64

How reproducible:
=================
Intermittent

Steps to Reproduce:
===================
1. Create a distributed volume (3 bricks) and mount it on multiple clients (NFS and FUSE).

[root@OVM3 nfs]# gluster v status snap
Status of volume: snap
Gluster process                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.198:/brick2/b1           49157   Y       13611
Brick 10.70.35.198:/brick2/b2           49158   Y       13345
Brick 10.70.35.198:/brick2/b3           49159   Y       13356
NFS Server on localhost                 2049    Y       13623
NFS Server on 10.70.35.240              2049    Y       19185
NFS Server on 10.70.35.172              2049    Y       13465

Task Status of Volume snap
------------------------------------------------------------------------------
There are no active volume tasks

[To reproduce the race, we set a breakpoint.]

2. From one client (FUSE mount), start creating a directory:
mkdir dir2

3. From another mount point (NFS mount), create a file inside that directory:
[root@OVM3 nfs]# touch dir2/f1
touch: cannot touch `dir2/f1': Stale file handle

Actual results:
===============
File creation failed with 'Stale file handle'.

Expected results:
=================
Lookup should heal the parent directory on all sub-volumes that are up, and file creation should not fail with 'Stale file handle'. The file should be created.

Additional info:
================
Log snippet:

[2014-06-27 07:41:06.637203] I [dht-layout.c:663:dht_layout_normalize] 0-snap-dht: Found anomalies in <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8> (gfid = c0a48017-ec23-4c93-b6bd-311a8a814ae8). Holes=1 overlaps=0
[2014-06-27 07:41:06.637830] E [dht-helper.c:813:dht_migration_complete_check_task] 0-snap-dht: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8>: failed to lookup the file on snap-client-0
[2014-06-27 07:41:06.637868] W [nfs3.c:1532:nfs3svc_access_cbk] 0-nfs: 265db38b: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8> => -1 (Stale file handle)
[2014-06-27 07:41:06.637889] W [nfs3-helpers.c:3401:nfs3_log_common_res] 0-nfs-nfsv3: XID: 265db38b, ACCESS: NFS: 70(Invalid file handle), POSIX: 116(Stale file handle)
[2014-06-27 07:41:06.638577] W [client-rpc-fops.c:1354:client3_3_access_cbk] 0-snap-client-0: remote operation failed: Stale file handle
[2014-06-27 07:41:06.640590] I [dht-layout.c:663:dht_layout_normalize] 0-snap-dht: Found anomalies in <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8> (gfid = c0a48017-ec23-4c93-b6bd-311a8a814ae8). Holes=1 overlaps=0
[2014-06-27 07:41:06.642105] W [dht-layout.c:180:dht_layout_search] 0-snap-dht: no subvolume for hash (value) = 3551819610
[2014-06-27 07:41:06.642969] W [nfs3.c:1230:nfs3svc_lookup_cbk] 0-nfs: 285db38b: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8>/f1 => -1 (Stale file handle)
[2014-06-27 07:41:06.643011] W [nfs3-helpers.c:3470:nfs3_log_newfh_res] 0-nfs-nfsv3: XID: 285db38b, LOOKUP: NFS: 70(Invalid file handle), POSIX: 116(Stale file handle), FH: exportid 00000000-0000-0000-0000-000000000000, gfid 00000000-0000-0000-0000-000000000000
[2014-06-27 07:41:06.644619] W [dht-layout.c:180:dht_layout_search] 0-snap-dht: no subvolume for hash (value) = 3551819610
[2014-06-27 07:41:06.645413] W [nfs3.c:1230:nfs3svc_lookup_cbk] 0-nfs: 2b5db38b: <gfid:c0a48017-ec23-4c93-b6bd-311a8a814ae8>/f1 => -1 (Stale file handle)
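
For readers unfamiliar with the "Holes=1 overlaps=0" and "no subvolume for hash" messages, here is a minimal sketch of the idea behind them. It is a simplified model, not the GlusterFS implementation: it assumes a directory layout is a set of inclusive 32-bit hash ranges, one per sub-volume. The function names, the sub-volume name snap-client-1, and the range values are hypothetical; only the hash value 3551819610 and the name snap-client-0 come from the log.

```python
# Simplified model of a DHT directory layout (assumption, not GlusterFS code):
# each sub-volume owns one inclusive range of the 32-bit hash space.
HASH_SPACE_END = 0xFFFFFFFF


def find_anomalies(ranges):
    """Return (holes, overlaps) for a list of inclusive (start, stop) ranges."""
    holes = 0
    overlaps = 0
    expected_start = 0  # first hash value not yet covered by any range
    for start, stop in sorted(ranges):
        if start > expected_start:
            holes += 1      # gap before this range
        elif start < expected_start:
            overlaps += 1   # this range begins inside an earlier one
        expected_start = max(expected_start, stop + 1)
    if expected_start <= HASH_SPACE_END:
        holes += 1          # tail of the hash space is uncovered
    return holes, overlaps


def search(layout, hash_value):
    """Return the sub-volume whose range contains hash_value, or None."""
    for subvol, (start, stop) in layout.items():
        if start <= hash_value <= stop:
            return subvol
    return None


# Hypothetical layout in which the last third of the hash space is unassigned.
layout = {
    "snap-client-0": (0, 1431655764),
    "snap-client-1": (1431655765, 2863311529),
}

holes, overlaps = find_anomalies(layout.values())
print(f"Holes={holes} overlaps={overlaps}")
print("subvolume for hash 3551819610:", search(layout, 3551819610))
```

In this model, a layout with no ranges at all (the situation described above, where the xattr is missing on every sub-volume) is also reported as a single hole, and every hash lookup returns no sub-volume.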

Similar to bz 1278399.
Fixed by: https://code.engineering.redhat.com/gerrit/#/c/61036/
Fixed in 3.1.2

This issue is not seen on gluster build 3.7.9-10.el7rhgs.x86_64.

The following steps were performed:

1. Created a distributed-replicate (4x2) volume and mounted it on multiple clients (NFS and FUSE).

2. To reproduce the race, set a breakpoint at dht_mkdir_hashed_cbk on the FUSE client and started creating a directory:
mkdir test_fuse

3. From the NFS mount, created a file inside the directory "test_fuse":
touch test_fuse/f1

The file was created from the NFS mount without any errors. Hence, marking this bug as Verified.