Bug 998791

Summary: quota: posix compliance test fails
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Saurabh <saujain>
Component: quota
Assignee: Vijaikumar Mallikarjuna <vmallika>
Status: CLOSED CURRENTRELEASE
QA Contact: Saurabh <saujain>
Severity: high
Docs Contact:
Priority: high
Version: 2.1
CC: asriram, gluster-bugs, grajaiya, mhideo, mselvaga, mzywusko, rabhat, rhs-bugs, rwheeler, sdharane, shmohan, smohan, storage-doc, storage-qa-internal, vagarwal, vbellur, vmallika
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
During a file rename operation, if the hashing logic moves the target file to a different brick, then the rename operation fails if it is initiated by a non-root user.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-26 08:30:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1020127
Attachments:
Description Flags
nfs logs none

Description Saurabh 2013-08-20 05:38:42 UTC
Description of problem:
The POSIX compliance test fails over an NFS mount.
Also, removing the residual directories after the POSIX test results in an I/O error.
Everything is removed on the second attempt.

Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-fuse-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-3.4.0.20rhsquota1-1.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a 6x2 volume and start it.
2. Enable quota.
3. Set some limit on the root of the volume.
4. Mount the volume over NFS.
5. Execute the POSIX compliance test.
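The steps above can be sketched as a dry run that only prints the commands. This is a rough sketch, not the reporter's actual setup: the volume name, host names, brick directory, quota limit and mount point are placeholders, and the gluster subcommands are written as I recall them from the GlusterFS CLI, so check them against the installed version before running anything.

```python
import shlex

REPLICA_COUNT = 2
DISTRIBUTE_COUNT = 6  # "6x2" = 6 replica pairs, 12 bricks in total


def repro_commands(volume, hosts, brick_dir, limit, mount_point, tests_dir):
    """Build the commands for the reproduction steps as argument lists."""
    brick_total = DISTRIBUTE_COUNT * REPLICA_COUNT
    # Consecutive bricks form a replica pair, so cycling through the hosts
    # keeps the two halves of each pair on different machines (needs >= 2 hosts).
    bricks = [
        f"{hosts[i % len(hosts)]}:{brick_dir}/brick{i}" for i in range(brick_total)
    ]
    return [
        ["gluster", "volume", "create", volume, "replica", str(REPLICA_COUNT), *bricks],
        ["gluster", "volume", "start", volume],
        ["gluster", "volume", "quota", volume, "enable"],
        ["gluster", "volume", "quota", volume, "limit-usage", "/", limit],
        ["mount", "-t", "nfs", "-o", "vers=3", f"{hosts[0]}:/{volume}", mount_point],
        # prove is meant to be started from a directory inside the mount point.
        ["prove", "-r", tests_dir],
    ]


# Every value below except tests_dir is a placeholder, not taken from the report.
commands = repro_commands(
    volume="testvol",
    hosts=["server1", "server2", "server3", "server4"],
    brick_dir="/bricks",
    limit="10GB",
    mount_point="/mnt/nfs-test",
    tests_dir="/opt/qa/tools/pjd-fstest-20080816/tests/",
)
for argv in commands:
    print(shlex.join(argv))
```

Nothing is executed here; the printed lines are meant to be reviewed and adapted by hand.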

Actual results:
Test Summary Report
-------------------
/opt/qa/tools/pjd-fstest-20080816/tests/mkfifo/00.t  (Wstat: 0 Tests: 36 Failed: 5)
  Failed tests:  30-34
/opt/qa/tools/pjd-fstest-20080816/tests/rename/00.t  (Wstat: 0 Tests: 79 Failed: 36)
  Failed tests:  4-11, 13, 15, 24-31, 33, 35, 41, 43-44
                46, 56-65, 68, 79
/opt/qa/tools/pjd-fstest-20080816/tests/rename/01.t  (Wstat: 0 Tests: 8 Failed: 2)
  Failed tests:  2-3
/opt/qa/tools/pjd-fstest-20080816/tests/rename/02.t  (Wstat: 0 Tests: 14 Failed: 2)
  Failed tests:  6-7
/opt/qa/tools/pjd-fstest-20080816/tests/rename/04.t  (Wstat: 0 Tests: 18 Failed: 2)
  Failed tests:  7-8
/opt/qa/tools/pjd-fstest-20080816/tests/rename/05.t  (Wstat: 0 Tests: 17 Failed: 7)
  Failed tests:  7-8, 10, 12, 15-17
/opt/qa/tools/pjd-fstest-20080816/tests/rename/09.t  (Wstat: 0 Tests: 56 Failed: 27)
  Failed tests:  8-9, 11-13, 15-17, 19, 24-25, 27-29, 31-33
                35, 40-41, 43-45, 47-49, 51
/opt/qa/tools/pjd-fstest-20080816/tests/rename/10.t  (Wstat: 0 Tests: 188 Failed: 99)
  Failed tests:  9-13, 15-19, 22, 24-28, 30-34, 37, 39-43
                45-49, 52, 85-89, 91-95, 98, 100-104, 106-110
                113, 115-119, 121-125, 128, 137-141, 143-147
                150, 152-156, 158-162, 165, 167-171, 173-177
                180
/opt/qa/tools/pjd-fstest-20080816/tests/symlink/00.t (Wstat: 0 Tests: 14 Failed: 2)
  Failed tests:  11-12
/opt/qa/tools/pjd-fstest-20080816/tests/unlink/00.t  (Wstat: 0 Tests: 55 Failed: 1)
  Failed test:  30
Files=184, Tests=1954, 120 wallclock secs ( 1.31 usr  0.66 sys +  9.52 cusr 17.82 csys = 29.31 CPU)
Result: FAIL

real	2m0.215s
user	0m10.949s
sys	0m18.526s
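To compare runs like the one above without counting by hand, the prove summary can be parsed into per-file sets of failed test numbers. The sketch below assumes the summary layout shown in this report (a header line per test file, followed by one or more "Failed tests:" lines); the function names are made up for illustration and the sample is a two-file excerpt of the output above.

```python
import re

HEADER = re.compile(r"\s*(\S+\.t)\s+\(Wstat: \d+ Tests: \d+ Failed: (\d+)\)")
FAILED = re.compile(r"\s*(?:Failed tests?:)?\s*(\d[\d,\s-]*)$")


def expand_ranges(spec):
    """Expand a list such as '7-8, 10, 12' into a set of integers."""
    numbers = set()
    for part in spec.replace(",", " ").split():
        low, _, high = part.partition("-")
        numbers.update(range(int(low), int(high or low) + 1))
    return numbers


def parse_summary(text):
    """Map each test file to (declared failure count, set of failed tests)."""
    results = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            results[current] = (int(header.group(2)), set())
            continue
        failed = FAILED.match(line)
        if failed and current is not None:
            # Continuation lines carry only numbers, so they match as well.
            results[current][1].update(expand_ranges(failed.group(1)))
    return results


sample = """\
/opt/qa/tools/pjd-fstest-20080816/tests/rename/00.t  (Wstat: 0 Tests: 79 Failed: 36)
  Failed tests:  4-11, 13, 15, 24-31, 33, 35, 41, 43-44
                46, 56-65, 68, 79
/opt/qa/tools/pjd-fstest-20080816/tests/unlink/00.t  (Wstat: 0 Tests: 55 Failed: 1)
  Failed test:  30
"""
summary = parse_summary(sample)
for test_file, (declared, failed) in summary.items():
    print(test_file, declared, len(failed))
```

Comparing the declared count with the size of the parsed set is a cheap check that no continuation line was missed.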

from nfs.log,
[2013-08-20 02:46:39.425340] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: cba63c3e: /fstest_7dc0ebfc567aa2a4459d3888c1aeb8b1/fstest_4ea878d8c2a8d0b3fc6dcb59f34ebb28/fstest_eb7fd8e4f3835440bdc4e81918d1bbf6 => -1 (Permission denied)
[2013-08-20 02:46:44.705541] W [client-rpc-fops.c:188:client3_3_symlink_cbk] 0-dist-rep2-client-2: remote operation failed: Permission denied. Path: (/fstest_fc8910c829a60ebafe66f83ae8d9ce58/fstest_d686c8cf0f16e61be851520bde685688/fstest_4d12ef10fa50f6ba3651746baefd246e to test)
[2013-08-20 02:46:44.705745] W [client-rpc-fops.c:188:client3_3_symlink_cbk] 0-dist-rep2-client-3: remote operation failed: Permission denied. Path: (/fstest_fc8910c829a60ebafe66f83ae8d9ce58/fstest_d686c8cf0f16e61be851520bde685688/fstest_4d12ef10fa50f6ba3651746baefd246e to test)
[2013-08-20 02:46:44.705798] W [nfs3.c:2890:nfs3svc_symlink_cbk] 0-nfs: 3a83c3e: /fstest_fc8910c829a60ebafe66f83ae8d9ce58/fstest_d686c8cf0f16e61be851520bde685688/fstest_4d12ef10fa50f6ba3651746baefd246e => -1 (Permission denied)
[2013-08-20 02:46:55.248185] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-0: remote operation failed: Permission denied
[2013-08-20 02:46:55.248353] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-1: remote operation failed: Permission denied
[2013-08-20 02:46:55.248404] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 76a93c3e: /fstest_545a1b2e3ad372daeffe83c58bf3c881/fstest_a0695b4cd7a9dbbd2adca88c361326df => -1 (Permission denied)
[2013-08-20 02:46:56.321346] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-0: remote operation failed: Permission denied
[2013-08-20 02:46:56.322133] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-1: remote operation failed: Permission denied
[2013-08-20 02:46:56.322244] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 7ea93c3e: /fstest_545a1b2e3ad372daeffe83c58bf3c881/fstest_a0695b4cd7a9dbbd2adca88c361326df => -1 (Permission denied)
[2013-08-20 02:47:03.955455] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-4: remote operation failed: Permission denied
[2013-08-20 02:47:03.956172] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-5: remote operation failed: Permission denied
[2013-08-20 02:47:03.956269] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 92aa3c3e: /fstest_ff55ce6f14c8b423f5623303680d6063/fstest_0f89762c394b384dab99fea000e6db49/fstest_788c59aff3c29638badf322aba88fb7c => -1 (Permission denied)
[2013-08-20 02:48:51.246415] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-8: remote operation failed: No data available
[2013-08-20 02:48:51.248257] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-9: remote operation failed: No data available
[2013-08-20 02:48:51.248472] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: f4aa3c3e: /fstest_21c2009cadd63e20b08f277629702ab8/fstest_3983234f590c77f1f508de35bd3686f8/fstest_9e2a445a2b506cc3c9a95cfddca039b8 => -1 (No data available)
[2013-08-20 02:48:51.248558] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: f4aa3c3e, REMOVE: NFS: 10006(Error occurred on the server or IO Error), POSIX: 61(No data available)
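To see which callbacks report the errors, the warnings can be grouped by the function name in the bracketed source location. This is a small sketch that assumes the line layout visible in the log excerpt above (timestamp, level, file:line:function, translator name, message); the regular expression is my own and is not taken from any GlusterFS tool.

```python
import re
from collections import Counter

LOG_LINE = re.compile(
    r"\[(?P<timestamp>\S+ \S+)\] (?P<level>\w) "
    r"\[(?P<file>[^:]+):(?P<line>\d+):(?P<function>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<message>.*)"
)

# Three lines copied from the excerpt above.
sample = """\
[2013-08-20 02:46:55.248185] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-0: remote operation failed: Permission denied
[2013-08-20 02:46:55.248353] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-1: remote operation failed: Permission denied
[2013-08-20 02:46:55.248404] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 76a93c3e: /fstest_545a1b2e3ad372daeffe83c58bf3c881/fstest_a0695b4cd7a9dbbd2adca88c361326df => -1 (Permission denied)
"""

records = [m.groupdict() for m in map(LOG_LINE.match, sample.splitlines()) if m]
by_function = Counter(record["function"] for record in records)
print(by_function)
```

In this excerpt the unlink fails on both bricks of a replica pair (client-0 and client-1) before the NFS server returns the error for the REMOVE request.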



Expected results:
The POSIX compliance test is supposed to pass.

Additional info:

Comment 1 Saurabh 2013-08-20 05:39:56 UTC
Created attachment 788307
nfs logs

Comment 3 Raghavendra Bhat 2013-08-20 13:22:47 UTC
Are the clocks on the brick machines in sync?

/opt/qa/tools/pjd-fstest-20080816/tests/mkfifo/00.t  (Wstat: 0 Tests: 36 Failed: 5)
  Failed tests:  30-34
/opt/qa/tools/pjd-fstest-20080816/tests/symlink/00.t (Wstat: 0 Tests: 14 Failed: 2)
  Failed tests:  11-12

The above failures might be caused by out-of-sync clocks between the brick machines (this happens even without quota, and also on a FUSE mount).

For the rename and unlink related issues, a patch has been sent for review (http://review.gluster.org/#/c/5668/).

Also, some POSIX compliance tests are known to fail on NFS, so please make sure the failed tests are not just those.

Comment 4 Raghavendra Bhat 2013-09-23 10:26:29 UTC
https://code.engineering.redhat.com/gerrit/11621 handles the issue.

Comment 5 Raghavendra Bhat 2013-09-24 09:36:25 UTC
https://code.engineering.redhat.com/gerrit/12179 rhs-2.1 patch fixes the issue.

Comment 6 Gowrishankar Rajaiyan 2013-09-24 11:23:13 UTC
"Fixed in version" please.

Comment 7 shylesh 2013-10-06 13:32:07 UTC
Running the POSIX compliance tests, I could see some extra test failures with quota enabled.

prove -r /opt/qa/tools/posix-testsuite/tests/


without quota
-------
Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t   (Wstat: 0 Tests: 171 Failed: 3)
  Failed tests:  77, 84, 88
/opt/qa/tools/posix-testsuite/tests/rename/05.t  (Wstat: 0 Tests: 17 Failed: 5)
  Failed tests:  10, 12, 14, 16-17
Files=185, Tests=1962, 92 wallclock secs ( 1.19 usr  0.44 sys + 10.40 cusr 10.63 csys = 22.66 CPU)
Result: FAIL



with quota
-------------
Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t   (Wstat: 0 Tests: 171 Failed: 3)
  Failed tests:  77, 84, 88
/opt/qa/tools/posix-testsuite/tests/rename/00.t  (Wstat: 0 Tests: 79 Failed: 4)
  Failed tests:  64-65, 68, 79
/opt/qa/tools/posix-testsuite/tests/rename/05.t  (Wstat: 0 Tests: 17 Failed: 5)
  Failed tests:  10, 12, 14, 16-17
Files=185, Tests=1962, 93 wallclock secs ( 1.19 usr  0.44 sys + 10.50 cusr 10.67 csys = 22.80 CPU)
Result: FAIL
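The difference between the two runs above can be isolated with a set difference per test file. The numbers below are copied from the two summaries; the helper name is made up for illustration.

```python
without_quota = {
    "chown/00.t": {77, 84, 88},
    "rename/05.t": {10, 12, 14, 16, 17},
}
with_quota = {
    "chown/00.t": {77, 84, 88},
    "rename/00.t": {64, 65, 68, 79},
    "rename/05.t": {10, 12, 14, 16, 17},
}


def extra_failures(baseline, candidate):
    """Return the failures present in candidate but not in baseline."""
    extra = {}
    for test_file, failed in candidate.items():
        new = failed - baseline.get(test_file, set())
        if new:
            extra[test_file] = sorted(new)
    return extra


quota_only = extra_failures(without_quota, with_quota)
print(quota_only)  # {'rename/00.t': [64, 65, 68, 79]}
```

So the only failures that appear with quota enabled are tests 64-65, 68 and 79 of rename/00.t; the chown/00.t and rename/05.t failures are identical in both runs.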


This test was run on an NFS mount.
NFS logs
--------

[2013-10-06 13:28:23.305071] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-2: remote operation failed: Permission denied
[2013-10-06 13:28:23.305183] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-3: remote operation failed: Permission denied
[2013-10-06 13:28:23.305228] W [dht-rename.c:365:dht_rename_unlink_cbk] 0-dr-dht: /fstest_d9762303927458ebbc17aef634371c88/fstest_eb61a05a4a)
[2013-10-06 13:28:25.388261] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-4: remote operation failed: Permission denied
[2013-10-06 13:28:25.388349] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-5: remote operation failed: Permission denied
[2013-10-06 13:28:25.388386] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: a85a9972: rename /fstest_d9762303927458ebbc17aef634371c88/fstest_eb61)
[2013-10-06 13:28:26.429227] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-4: remote operation failed: Permission denied
[2013-10-06 13:28:26.429306] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-5: remote operation failed: Permission denied
[2013-10-06 13:28:26.429341] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: b15a9972: rename /fstest_d9762303927458ebbc17aef634371c88/fstest_eb61)
[2013-10-06 13:28:26.458371] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: b75a9972: /fstest_d9762303927458ebbc17aef634371c88 => -1 (Directory no)
[2013-10-06 13:28:28.162879] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: f75b9972: /fstest_0880cd50fc83aaafc68da6e61f8edd80/fstest_3ed6b6a07947)
[2013-10-06 13:28:28.174497] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: fa5b9972: /fstest_0880cd50fc83aaafc68da6e61f8edd80 => -1 (Directory no)
[2013-10-06 13:28:32.489394] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: fe5d9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_)
[2013-10-06 13:28:32.489471] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: fe5d9972, RENAME: NFS: 66(Directory not empty), P)
[2013-10-06 13:28:32.524794] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: 85e9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_b)
[2013-10-06 13:28:32.524852] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: 85e9972, RENAME: NFS: 66(Directory not empty), PO)
[2013-10-06 13:28:32.563173] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: 125e9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_)
[2013-10-06 13:28:32.563247] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: 125e9972, RENAME: NFS: 66(Directory not empty), P)
[2013-10-06 13:28:32.594655] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: 1b5e9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_)
[2013-10-06 13:28:32.594712] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: 1b5e9972, RENAME: NFS: 66(Directory not empty), P)

Comment 8 Raghavendra Bhat 2013-10-07 09:34:45 UTC
This failure occurs because, when a rename happens, distribute splits the call into separate calls (mknod of the linkfile, link of the new name, unlink of the old name). So the rename call as a whole does not go through the posix-acl xlator, which prevents some of the rename checks from happening.

I think this failure happens only on the NFS client. (Can you please check whether it happens on the FUSE client?) That is because, when the tests are run on a FUSE mount point, the kernel itself does the ACL check before the rename call reaches glusterfs and returns the appropriate error to the application (if the rename is not allowed). Unfortunately, for the NFS client the kernel does not do ACL checks and sends the call to glusterfs; when the call reaches the distribute xlator, it is split into multiple calls (as explained in the paragraph above), which prevents the actual ACL check for the rename.

Loading a posix-acl xlator below the nfs xlator in the nfs-server volfile solves the issue (but I am not sure whether loading the posix-acl xlator below the nfs xlator is a good idea).
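The effect described above can be illustrated with a toy model. It is only a sketch of the idea, not GlusterFS code: crc32 stands in for the real distribute hash, the "same brick means a single rename fop" rule is a simplification, and all function names are invented for this example.

```python
from zlib import crc32

# Sequence of calls as described in the comment above.
SPLIT_FOPS = ["mknod", "link", "unlink"]


def brick_for(name, brick_count):
    """Toy placement: hash a file name onto one of brick_count bricks."""
    return crc32(name.encode()) % brick_count


def fops_for_rename(old_name, new_name, brick_count):
    """Return the fops a distribute-like layer sends down for a rename."""
    if brick_for(old_name, brick_count) == brick_for(new_name, brick_count):
        return ["rename"]
    return list(SPLIT_FOPS)


def rename_check_runs(old_name, new_name, brick_count):
    """True if a check hooked only on the 'rename' fop would be executed."""
    return "rename" in fops_for_rename(old_name, new_name, brick_count)


# With a single brick both names always land together, so the check runs.
print(rename_check_runs("old", "new", 1))  # True

# With several bricks, list the renames for which the check is skipped.
names = [f"file{i}" for i in range(20)]
skipped = [
    (old, new)
    for old in names
    for new in names
    if old != new and not rename_check_runs(old, new, 6)
]
print(len(skipped), "of", len(names) * (len(names) - 1), "renames skip the check")
```

In this model a layer that enforces a rule only in its rename handler never sees the split sequence, which matches the explanation that the missing checks show up only when the new name hashes to a different brick.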

Comment 9 Vivek Agarwal 2013-11-14 11:26:18 UTC
Moving the known issues to Doc team, to be documented in release notes for U1

Comment 10 Vivek Agarwal 2013-11-14 11:28:30 UTC
Moving the known issues to Doc team, to be documented in release notes for U1

Comment 11 Vivek Agarwal 2013-11-14 11:29:02 UTC
Moving the known issues to Doc team, to be documented in release notes for U1

Comment 14 Vijaikumar Mallikarjuna 2015-01-12 09:21:48 UTC
Patch submitted upstream: http://review.gluster.org/#/c/9419/

Comment 15 Manikandan 2015-10-26 08:30:32 UTC
The above mentioned test cases are passing in 3.7 on a fuse mount. So, closing the bug.