Bug 1005153

Summary: E [nfs3.c:762:nfs3_getattr_resume] 0-nfs-nfsv3: No such file or directory: (10.70.34.94:784) replicate : 00000000-0000-0000-0000-000000000000
Product: Red Hat Gluster Storage
Component: gluster-nfs
Version: 2.1
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: medium
Priority: unspecified
Reporter: Rahul Hinduja <rhinduja>
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
QA Contact: storage-qa-internal <storage-qa-internal>
CC: ndevos, pkarampu, rhinduja, rhs-bugs, vbellur
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-11-27 09:22:03 UTC

Description Rahul Hinduja 2013-09-06 10:21:21 UTC
Description of problem:
=======================

The following error messages appeared after a file was created directly on the backend bricks and then accessed from an NFS mount. This was done to verify the functionality described in BZ 798874. The files are accessible from the FUSE mount, but access to a few of them fails from the NFS mount:

root@wingo [Sep-06-2013-15:37:46] >pwd
/mnt/n-rep
root@wingo [Sep-06-2013-15:37:47] >mount | grep n-rep
rhs-client11:replicate on /mnt/n-rep type nfs (rw,vers=3,addr=10.70.36.35)
root@wingo [Sep-06-2013-15:37:51] >ls test_file2
ls: cannot access test_file2: No such file or directory
root@wingo [Sep-06-2013-15:37:55] >

Changelogs and gfid are set on the backend bricks:
==================================================

[root@rhs-client12 r2]# getfattr -d -e hex -m . test_file2
# file: test_file2
trusted.afr.replicate-client-0=0x000000000000000000000000
trusted.afr.replicate-client-1=0x000000000000000000000000
trusted.gfid=0xea883d8cc14b4e25a00c8ea997e7860b


[root@rhs-client11 r1]# getfattr -d -e hex -m . test_file2
# file: test_file2
trusted.afr.replicate-client-0=0x000000000000000000000000
trusted.afr.replicate-client-1=0x000000000000000000000000
trusted.gfid=0xea883d8cc14b4e25a00c8ea997e7860b
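
For comparison with the NFS log entries, which print the gfid in UUID form, the hex values above can be decoded. The following is only a sketch: the helper names are hypothetical, and the reading of the 12-byte trusted.afr value as three big-endian 32-bit pending counters (data, metadata, entry) is an assumption about the AFR changelog layout, not something stated in this report.

```python
import struct
import uuid


def _strip_0x(hex_value):
    """Drop the '0x' prefix that 'getfattr -e hex' puts in front of values."""
    return hex_value[2:] if hex_value.lower().startswith("0x") else hex_value


def gfid_to_uuid(hex_value):
    """Convert a trusted.gfid hex value into canonical UUID text."""
    return str(uuid.UUID(hex=_strip_0x(hex_value)))


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into three pending counters.

    Assumption: the value is 12 bytes holding three big-endian 32-bit
    counters in the order data, metadata, entry.
    """
    raw = bytes.fromhex(_strip_0x(hex_value))
    if len(raw) != 12:
        raise ValueError("expected a 12-byte changelog, got %d bytes" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # gfid taken from the getfattr output above
    print(gfid_to_uuid("0xea883d8cc14b4e25a00c8ea997e7860b"))
    # an all-zero changelog, as reported on both bricks
    print(decode_afr_changelog("0x" + "00" * 12))
```

Under that assumption, an all-zero changelog on both bricks means neither replica records pending operations against the other, and the gfid is identical on both bricks.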


Errors in the NFS logs:
=======================

[2013-09-06 19:39:18.928826] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.0.31rhs/xlator/debug/io-stats.so(io_stats_lookup+0x157) [0x7f2c574ec337] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x7f2c6061f8ad] (-->/usr/lib64/glusterfs/3.4.0.31rhs/xlator/cluster/distribute.so(dht_lookup+0xa4b) [0x7f2c5792f93b]))) 0-replicate-dht: invalid argument: loc->parent
[2013-09-06 19:39:18.929062] W [client-rpc-fops.c:2604:client3_3_lookup_cbk] 0-replicate-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000> (00000000-0000-0000-0000-000000000000)
[2013-09-06 19:39:18.929296] W [client-rpc-fops.c:2604:client3_3_lookup_cbk] 0-replicate-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000> (00000000-0000-0000-0000-000000000000)
[2013-09-06 19:39:18.929326] E [nfs3.c:762:nfs3_getattr_resume] 0-nfs-nfsv3: No such file or directory: (10.70.34.94:924) replicate : 00000000-0000-0000-0000-000000000000
[2013-09-06 19:39:18.929339] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: a44ebd45, GETATTR: NFS: 2(No such file or directory), POSIX: 14(Bad address)
[2013-09-06 19:39:18.929818] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.0.31rhs/xlator/debug/io-stats.so(io_stats_lookup+0x157) [0x7f2c574ec337] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x7f2c6061f8ad] (-->/usr/lib64/glusterfs/3.4.0.31rhs/xlator/cluster/distribute.so(dht_lookup+0xa4b) [0x7f2c5792f93b]))) 0-replicate-dht: invalid argument: loc->parent
[2013-09-06 19:39:18.930059] W [client-rpc-fops.c:2604:client3_3_lookup_cbk] 0-replicate-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000> (00000000-0000-0000-0000-000000000000)
[2013-09-06 19:39:18.930146] W [client-rpc-fops.c:2604:client3_3_lookup_cbk] 0-replicate-client-1: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000> (00000000-0000-0000-0000-000000000000)
[2013-09-06 19:39:18.930166] E [nfs3.c:762:nfs3_getattr_resume] 0-nfs-nfsv3: No such file or directory: (10.70.34.94:924) replicate : 00000000-0000-0000-0000-000000000000
[2013-09-06 19:39:18.930177] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: a54ebd45, GETATTR: NFS: 2(No such file or directory), POSIX: 14(Bad address)
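
The same sequence of five messages repeats for every GETATTR. A small parser can help to group such entries when going through a larger nfs.log. This is a sketch based only on the line layout visible in the excerpt above ("[timestamp] LEVEL [file:line:function] message"); the function and field names are hypothetical, and other log lines may not follow this layout.

```python
import re
from collections import Counter

# "[timestamp] LEVEL [file:line:function] message", as seen in the excerpt
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\] (?P<message>.*)$"
)

# Result summary as printed by nfs3_log_common_res in the excerpt
NFS_RESULT = re.compile(
    r"XID: (?P<xid>[0-9a-f]+), (?P<op>\w+): "
    r"NFS: (?P<nfs_status>\d+)\((?P<nfs_text>[^)]*)\), "
    r"POSIX: (?P<posix_errno>\d+)\((?P<posix_text>[^)]*)\)"
)


def parse_line(line):
    """Return a dict for one log line, or None if the layout does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    result = NFS_RESULT.search(record["message"])
    if result is not None:
        summary = result.groupdict()
        summary["nfs_status"] = int(summary["nfs_status"])
        summary["posix_errno"] = int(summary["posix_errno"])
        record["nfs_result"] = summary
    return record


def count_by_function(lines):
    """Count parsed entries per (level, function) pair."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[(record["level"], record["function"])] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[2013-09-06 19:39:18.929339] W [nfs3-helpers.c:3391:nfs3_log_common_res] "
        "0-nfs-nfsv3: XID: a44ebd45, GETATTR: NFS: 2(No such file or directory), "
        "POSIX: 14(Bad address)"
    )
    print(parse_line(sample)["nfs_result"])
```
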



Version-Release number of selected component (if applicable):
=============================================================

glusterfs-server-3.4.0.31rhs-1.el6rhs.x86_64


Steps to Reproduce:
===================
1. Created and started a 1*2 replicate volume.
2. Mounted the volume on the clients over FUSE and NFS (both 3.3.0 and 3.4.0 clients).
3. Created a file (test_file1) directly on the back-end bricks, with a different block size on each brick:

On brick1 (r1): 
dd if=/dev/urandom of=test_file1 bs=1M count=1

On brick2 (r2):
dd if=/dev/urandom of=test_file1 bs=2M count=1

4. Accessed the file from the FUSE and NFS mounts. It is successful from both.
root@tia [Sep-06-2013-14:58:44] >ls test_file1
test_file1

5. Created another file (test_file2) directly on the back-end bricks, again with a different block size on each brick:

On brick1 (r1): 
dd if=/dev/urandom of=test_file2 bs=1M count=1

On brick2 (r2):
dd if=/dev/urandom of=test_file2 bs=2M count=1

6. Accessed test_file2 from the FUSE and NFS mounts. It is successful from FUSE but fails from NFS.

From FUSE:

root@wingo [Sep-06-2013-15:46:58] >pwd
/mnt/rep
root@wingo [Sep-06-2013-15:47:01] >mount | grep fuse
rhs-client11:replicate on /mnt/rep type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
root@wingo [Sep-06-2013-15:47:08] >ls test_file2
test_file2
root@wingo [Sep-06-2013-15:47:12] >

From NFS:

root@wingo [Sep-06-2013-15:48:01] >mount | grep n-rep
rhs-client11:replicate on /mnt/n-rep type nfs (rw,vers=3,addr=10.70.36.35)
root@wingo [Sep-06-2013-15:48:03] >pwd
/mnt/n-rep
root@wingo [Sep-06-2013-15:48:06] >ls test_file2
ls: cannot access test_file2: No such file or directory
root@wingo [Sep-06-2013-15:48:09] >



Note:
=====

On one client, dropping the cache and then accessing the file from the NFS mount was successful. Without dropping the cache, access from the NFS mount still fails even after waiting for an hour.

Comment 3 Niels de Vos 2015-03-18 08:07:10 UTC
Could you try reducing the complexity of the volume? That should help in identifying whether AFR is part of the problem. Please let us know if this problem also happens with a volume that consists of a single brick.

Thanks,
Niels

Comment 6 Niels de Vos 2015-11-27 09:22:03 UTC
Because the files were created directly on the bricks, they will not have the gfid extended attribute or the .glusterfs/... gfid-hardlink. Without a gfid, NFS access is not possible.

Files that were created directly on the bricks need at least one LOOKUP call before they can be accessed. The easiest way to trigger it is to execute the 'stat <filename>' command through a fuse-mount.
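
For more than a handful of files, the stat calls can be scripted. The sketch below is only an illustration: trigger_lookups is a hypothetical helper, and /mnt/rep and test_file2 are simply the fuse-mount and filename from this report. It relies on nothing more than stat() being issued for each name through the fuse-mount.

```python
import os


def trigger_lookups(mount_root, names):
    """stat() every name below mount_root and report which calls succeeded.

    mount_root is expected to be a fuse-mount of the volume, not a brick
    directory. Returns a dict mapping each name to True or False.
    """
    results = {}
    for name in names:
        path = os.path.join(mount_root, name)
        try:
            os.stat(path)
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Mount point and filename as used in the report
    for name, ok in trigger_lookups("/mnt/rep", ["test_file2"]).items():
        print(name, "ok" if ok else "stat failed")
```

Names for which stat failed are reported rather than raised, so one missing file does not stop the run.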

There have been some attempts to include a LOOKUP inside a READDIR, which would most likely fix this issue too (but on a lower level, DHT?). This is not something we are planning to fix in Gluster/NFS. Accessing files directly on the bricks is not a supported use-case.