+++ This bug was initially created as a clone of Bug #1413584 +++ Description of problem: ======================= On a distributed-replicate volume, Bonnie test suite failed with "Can't open file" error. ===========================TESTS RUNNING=========================== Changing to the specified mountpoint /mnt/nfs/run15370 executing bonnie Using uid:0, gid:0. Writing with putc()... done Can't open file ./Bonnie.15391.005 <---------------------------- real 5m22.818s user 4m22.679s sys 0m47.933s bonnie failed 0 Total 0 tests were successful Switching over to the previous working directory Removing /mnt/nfs//run15370/ rmdir: failed to remove ‘/mnt/nfs//run15370/’: Directory not empty rmdir failed:Directory not empty Version-Release number of selected component (if applicable): 3.8.4-11.el7rhgs.x86_64 How reproducible: ================ 1/1 Steps to Reproduce: =================== 1) Create a distributed-replicate volume and start it (Please see the vol info output for enabled volume settings). 2) Mount it on multiples clients via gNFS. 3) Start bonnie test suite from one client and from the remaining clients start a infinite loop of lookups. Bonnie test suite failed with "Can't open file" error and I can see the below errors in nfs logs. [2017-01-16 10:52:08.042767] W [MSGID: 101159] [inode.c:1214:__inode_unlink] 0-inode: 09388be3-eb94-4cb1-94b0-0518e00d3c90/Bonnie.15391.001: dentry not found in a32137f8-6941-4db4-b723-10019fd4f359 [2017-01-16 10:52:08.386420] W [MSGID: 101159] [inode.c:1214:__inode_unlink] 0-inode: 09388be3-eb94-4cb1-94b0-0518e00d3c90/Bonnie.15391.003: dentry not found in 3c05ba77-5bad-4da8-8d5c-135fa9366d5f [2017-01-16 10:52:08.956338] W [MSGID: 101159] [inode.c:1214:__inode_unlink] 0-inode: 09388be3-eb94-4cb1-94b0-0518e00d3c90/Bonnie.15391.004: dentry not found in 670abac1-f5d3-42b2-b0b2-4900c1915a9a [2017-01-16 10:52:09.067140] W [MSGID: 101159] [inode.c:1214:__inode_unlink] 0-inode: 09388be3-eb94-4cb1-94b0-0518e00d3c90/Bonnie.15391.005: dentry not found in 3f678a33-317b-4cde-a959-0e038cdc04ca [2017-01-16 10:52:09.125807] E [client-common.c:844:client_pre_inodelk] (-->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0xc740) [0x7f866b734740] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x28a70) [0x7f866b750a70] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x3aa66) [0x7f866b762a66] ) 0-: Assertion failed: 0 [2017-01-16 10:52:09.126007] E [client-common.c:844:client_pre_inodelk] (-->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0xc740) [0x7f866b734740] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x28a70) [0x7f866b750a70] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x3aa66) [0x7f866b762a66] ) 0-: Assertion failed: 0 [2017-01-16 10:52:09.126267] E [client-common.c:844:client_pre_inodelk] (-->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0xc740) [0x7f866b734740] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x28a70) [0x7f866b750a70] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x3aa66) [0x7f866b762a66] ) 0-: Assertion failed: 0 [2017-01-16 10:52:09.126465] E [client-common.c:844:client_pre_inodelk] (-->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0xc740) [0x7f866b734740] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x28a70) [0x7f866b750a70] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x3aa66) [0x7f866b762a66] ) 0-: Assertion failed: 0 [2017-01-16 10:52:09.126533] W [MSGID: 108019] [afr-lk-common.c:1090:is_blocking_locks_count_sufficient] 0-newdr-replicate-4: Unable to obtain blocking inode lock on even one child for gfid:00000000-0000-0000-0000-000000000000. [2017-01-16 10:52:09.126614] I [MSGID: 108019] [afr-transaction.c:1829:afr_post_blocking_inodelk_cbk] 0-newdr-replicate-4: Blocking inodelks failed. [2017-01-16 10:52:09.126758] W [MSGID: 112199] [nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3: /run15370/Bonnie.15391.005 => (XID: 8f57191d, CREATE: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)), FH: exportid 9e098339-67c4-4975-bdfa-7d5278a3aae8, gfid 61b4583d-b71a-4dfc-92ad-cb4dd7d0fe53, mountid 0e315778-0000-0000-0000-000000000000 [2017-01-16 10:52:11.821905] W [MSGID: 112032] [nfs3.c:3600:nfs3svc_rmdir_cbk] 0-nfs: a557191d: /run15370 => -1 (Directory not empty) [Directory not empty] Actual results: =============== Bonnie test suite failed with "Can't open file" error. Expected results: ================ Bonnie test suite should complete without any errors/issues. --- Additional comment from Jiffin on 2017-01-17 00:33:15 EST --- The below log message filled only because of gfid for that location(/run15370/Bonnie.15391.005) is NULL [2017-01-16 10:52:09.126007] E [client-common.c:844:client_pre_inodelk] (-->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0xc740) [0x7f866b734740] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x28a70) [0x7f866b750a70] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so(+0x3aa66) [0x7f866b762a66] ) 0-: Assertion failed: 0 Code snippet from client_pre_inodelk ---- if (!(loc && loc->inode)) goto out; if (!gf_uuid_is_null (loc->gfid)) memcpy (req->gfid, loc->gfid, 16); else memcpy (req->gfid, loc->inode->gfid, 16); GF_ASSERT_AND_GOTO_WITH_ERROR (this->name, !gf_uuid_is_null (*((uuid_t *)req->gfid)), out, op_errno, EINVAL); ---- The condition for returning EINVAL is only when gfid is NULL and this information passed to upper layer which resulted in this failure. [2017-01-16 10:52:09.126533] W [MSGID: 108019] [afr-lk-common.c:1090:is_blocking_locks_count_sufficient] 0-newdr-replicate-4: Unable to obtain blocking inode lock on even one child for gfid:00000000-0000-0000-0000-000000000000 [2017-01-16 10:52:09.126758] W [MSGID: 112199] [nfs3- elpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3: /run15370/Bonnie.15391.005 => (XID: 8f57191d, CREATE: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)), FH: exportid 9e098339-67c4-4975-bdfa-7d5278a3aae8, gfid 61b4583d-b71a-4dfc-92ad-cb4dd7d0fe53, mountid 0e315778-0000-0000-0000-000000000000 Actually at the backend the file is already created with following details getfattr -d -m "." -e hex Bonnie.15391.005 # file: Bonnie.15391.005 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.gfid=0x61b4583db71a4dfc92adcb4dd7d0fe53 trusted.pgfid.09388be3-eb94-4cb1-94b0-0518e00d3c90=0x00000001 Either this information is not updated correctly at nfs-server(gluserfs client) If you look the logs just before it it failed to remove same file [2017-01-16 10:52:09.067140] W [MSGID: 101159] [inode.c:1214:__inode_unlink] 0-inode: 09388be3-eb94-4cb1-94b0-0518e00d3c90/Bonnie.15391.005: dentry not found in 3f678a33-317b-4cde-a959-0e038cdc04ca The gfid mentioned in this operation is "3f678a33-317b-4cde-a959-0e038cdc04ca" different from the backend. So the following might have happened : Bonnie initial creates a filed named Bonnie.15391.005 and removed it. But this information is not properly updated in the client stack. So Bonnie creates the same file it ends up using invalid gfid. Thanks ravi for help in debugging. --- Additional comment from Jiffin on 2017-01-17 07:14:02 EST --- RCA : By code checking it seems issue resides in nfs3svc_create_cbk(), it calls nfs_setattr() with improper gfid which results in this failure. Thanks Pranith for finding the issue and proposing the fix
REVIEW: http://review.gluster.org/16421 (gNFS : pass proper gfid for nfs_setattr() in nfs3svc_create_cbk()) posted (#1) for review on master by jiffin tony Thottan (jthottan)
COMMIT: https://review.gluster.org/16421 committed in master by Niels de Vos (ndevos) ------ commit fbe70dcdd7c62ae97e04e09755dcf87d90fcad8b Author: Jiffin Tony Thottan <jthottan> Date: Tue Jan 17 18:26:55 2017 +0530 gNFS : pass proper gfid for nfs_setattr() in nfs3svc_create_cbk() The nfs3svc_create_cbk() takes "gfid" from inode variable which it received, not from inode which it performed linking to nfs_setattr(). There is possiblity that the inode passed into this function holds NULL gfid if there are some parallel operation happening on the same file. Similar issue is mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1413971 Thanks pranith for proposing the fix Change-Id: I1a0ff4f02b483416f19f4f064c306c2cad5d9d8b BUG: 1413971 Signed-off-by: Jiffin Tony Thottan <jthottan> Reviewed-on: https://review.gluster.org/16421 Reviewed-by: soumya k <skoduri> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.org> Smoke: Gluster Build System <jenkins.org> Reviewed-by: Niels de Vos <ndevos>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.0, please open a new bug report. glusterfs-3.11.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html [2] https://www.gluster.org/pipermail/gluster-users/