Description of problem:
The POSIX compliance test fails over an NFS mount. Removing the residual directories after the POSIX test also results in an I/O error. Everything goes off after the second time.

Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-fuse-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-3.4.0.20rhsquota1-1.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. create a 6x2 volume, start it
2. enable quota
3. set some limit on the root of the volume
4. mount the volume over NFS
5. execute the POSIX compliance test

Actual results:

Test Summary Report
-------------------
/opt/qa/tools/pjd-fstest-20080816/tests/mkfifo/00.t (Wstat: 0 Tests: 36 Failed: 5)
  Failed tests:  30-34
/opt/qa/tools/pjd-fstest-20080816/tests/rename/00.t (Wstat: 0 Tests: 79 Failed: 36)
  Failed tests:  4-11, 13, 15, 24-31, 33, 35, 41, 43-44
                46, 56-65, 68, 79
/opt/qa/tools/pjd-fstest-20080816/tests/rename/01.t (Wstat: 0 Tests: 8 Failed: 2)
  Failed tests:  2-3
/opt/qa/tools/pjd-fstest-20080816/tests/rename/02.t (Wstat: 0 Tests: 14 Failed: 2)
  Failed tests:  6-7
/opt/qa/tools/pjd-fstest-20080816/tests/rename/04.t (Wstat: 0 Tests: 18 Failed: 2)
  Failed tests:  7-8
/opt/qa/tools/pjd-fstest-20080816/tests/rename/05.t (Wstat: 0 Tests: 17 Failed: 7)
  Failed tests:  7-8, 10, 12, 15-17
/opt/qa/tools/pjd-fstest-20080816/tests/rename/09.t (Wstat: 0 Tests: 56 Failed: 27)
  Failed tests:  8-9, 11-13, 15-17, 19, 24-25, 27-29, 31-33
                35, 40-41, 43-45, 47-49, 51
/opt/qa/tools/pjd-fstest-20080816/tests/rename/10.t (Wstat: 0 Tests: 188 Failed: 99)
  Failed tests:  9-13, 15-19, 22, 24-28, 30-34, 37, 39-43
                45-49, 52, 85-89, 91-95, 98, 100-104, 106-110
                113, 115-119, 121-125, 128, 137-141, 143-147
                150, 152-156, 158-162, 165, 167-171, 173-177
                180
/opt/qa/tools/pjd-fstest-20080816/tests/symlink/00.t (Wstat: 0 Tests: 14 Failed: 2)
  Failed tests:  11-12
/opt/qa/tools/pjd-fstest-20080816/tests/unlink/00.t (Wstat: 0 Tests: 55 Failed: 1)
  Failed test:  30
Files=184, Tests=1954, 120 wallclock secs ( 1.31 usr 0.66 sys + 9.52 cusr 17.82 csys = 29.31 CPU)
Result: FAIL

real 2m0.215s
user 0m10.949s
sys 0m18.526s

from nfs.log,

[2013-08-20 02:46:39.425340] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: cba63c3e: /fstest_7dc0ebfc567aa2a4459d3888c1aeb8b1/fstest_4ea878d8c2a8d0b3fc6dcb59f34ebb28/fstest_eb7fd8e4f3835440bdc4e81918d1bbf6 => -1 (Permission denied)
[2013-08-20 02:46:44.705541] W [client-rpc-fops.c:188:client3_3_symlink_cbk] 0-dist-rep2-client-2: remote operation failed: Permission denied. Path: (/fstest_fc8910c829a60ebafe66f83ae8d9ce58/fstest_d686c8cf0f16e61be851520bde685688/fstest_4d12ef10fa50f6ba3651746baefd246e to test)
[2013-08-20 02:46:44.705745] W [client-rpc-fops.c:188:client3_3_symlink_cbk] 0-dist-rep2-client-3: remote operation failed: Permission denied. Path: (/fstest_fc8910c829a60ebafe66f83ae8d9ce58/fstest_d686c8cf0f16e61be851520bde685688/fstest_4d12ef10fa50f6ba3651746baefd246e to test)
[2013-08-20 02:46:44.705798] W [nfs3.c:2890:nfs3svc_symlink_cbk] 0-nfs: 3a83c3e: /fstest_fc8910c829a60ebafe66f83ae8d9ce58/fstest_d686c8cf0f16e61be851520bde685688/fstest_4d12ef10fa50f6ba3651746baefd246e => -1 (Permission denied)
[2013-08-20 02:46:55.248185] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-0: remote operation failed: Permission denied
[2013-08-20 02:46:55.248353] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-1: remote operation failed: Permission denied
[2013-08-20 02:46:55.248404] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 76a93c3e: /fstest_545a1b2e3ad372daeffe83c58bf3c881/fstest_a0695b4cd7a9dbbd2adca88c361326df => -1 (Permission denied)
[2013-08-20 02:46:56.321346] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-0: remote operation failed: Permission denied
[2013-08-20 02:46:56.322133] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-1: remote operation failed: Permission denied
[2013-08-20 02:46:56.322244] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 7ea93c3e: /fstest_545a1b2e3ad372daeffe83c58bf3c881/fstest_a0695b4cd7a9dbbd2adca88c361326df => -1 (Permission denied)
[2013-08-20 02:47:03.955455] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-4: remote operation failed: Permission denied
[2013-08-20 02:47:03.956172] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-5: remote operation failed: Permission denied
[2013-08-20 02:47:03.956269] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: 92aa3c3e: /fstest_ff55ce6f14c8b423f5623303680d6063/fstest_0f89762c394b384dab99fea000e6db49/fstest_788c59aff3c29638badf322aba88fb7c => -1 (Permission denied)
[2013-08-20 02:48:51.246415] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-8: remote operation failed: No data available
[2013-08-20 02:48:51.248257] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-9: remote operation failed: No data available
[2013-08-20 02:48:51.248472] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: f4aa3c3e: /fstest_21c2009cadd63e20b08f277629702ab8/fstest_3983234f590c77f1f508de35bd3686f8/fstest_9e2a445a2b506cc3c9a95cfddca039b8 => -1 (No data available)
[2013-08-20 02:48:51.248558] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: f4aa3c3e, REMOVE: NFS: 10006(Error occurred on the server or IO Error), POSIX: 61(No data available)

Expected results:
The POSIX compliance test is supposed to pass.

Additional info:
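For triage, the nfs.log warnings above can be tallied per callback and error string. The following is only a minimal sketch: the regular expression is inferred from the log lines quoted in this report, not from any GlusterFS log format specification, and the names LOG_LINE, ERRORS, SAMPLE and tally are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the quoted lines (an assumption, not an official format):
# [timestamp] LEVEL [file:line:function] xlator: message
LOG_LINE = re.compile(
    r"\[(?P<ts>[\d\-: .]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)"
)

# Error strings that appear in the excerpt quoted in this report.
ERRORS = ("Permission denied", "No data available")


def tally(lines):
    """Count log lines per (callback function, error string)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a log line
        err = next((e for e in ERRORS if e in m["msg"]), "other")
        counts[(m["func"], err)] += 1
    return counts


# A few lines copied from the excerpt above.
SAMPLE = """\
[2013-08-20 02:46:39.425340] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: cba63c3e: /fstest_7dc0ebfc567aa2a4459d3888c1aeb8b1/fstest_4ea878d8c2a8d0b3fc6dcb59f34ebb28/fstest_eb7fd8e4f3835440bdc4e81918d1bbf6 => -1 (Permission denied)
[2013-08-20 02:48:51.246415] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-8: remote operation failed: No data available
[2013-08-20 02:48:51.248257] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dist-rep2-client-9: remote operation failed: No data available
[2013-08-20 02:48:51.248472] W [nfs3.c:3342:nfs3svc_remove_cbk] 0-nfs: f4aa3c3e: /fstest_21c2009cadd63e20b08f277629702ab8/fstest_3983234f590c77f1f508de35bd3686f8/fstest_9e2a445a2b506cc3c9a95cfddca039b8 => -1 (No data available)
"""

counts = tally(SAMPLE.splitlines())
for (func, err), n in sorted(counts.items()):
    print(f"{func:24s} {err:20s} {n}")
```

Running the same function over the full attached log would show at a glance which callbacks return "Permission denied" and which return "No data available".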
Created attachment 788307 [details] nfs logs
Are the clocks on the brick machines in sync?

/opt/qa/tools/pjd-fstest-20080816/tests/mkfifo/00.t (Wstat: 0 Tests: 36 Failed: 5)
  Failed tests:  30-34
/opt/qa/tools/pjd-fstest-20080816/tests/symlink/00.t (Wstat: 0 Tests: 14 Failed: 2)
  Failed tests:  11-12

The above failures might be caused by out-of-sync clocks between the brick machines (this happens even without quota, and on a fuse mount as well).

For the rename- and unlink-related issues, a patch has been sent for review (http://review.gluster.org/#/c/5668/).

Also, some POSIX compliance tests are known to fail on NFS, so please make sure the failed tests are not just those.
https://code.engineering.redhat.com/gerrit/11621 handles the issue.
https://code.engineering.redhat.com/gerrit/12179 rhs-2.1 patch fixes the issue.
"Fixed in version" please.
Running the POSIX compliance tests, I could see some extra test failures with quota.

prove -r /opt/qa/tools/posix-testsuite/tests/

without quota
-------
Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 3)
  Failed tests:  77, 84, 88
/opt/qa/tools/posix-testsuite/tests/rename/05.t (Wstat: 0 Tests: 17 Failed: 5)
  Failed tests:  10, 12, 14, 16-17
Files=185, Tests=1962, 92 wallclock secs ( 1.19 usr 0.44 sys + 10.40 cusr 10.63 csys = 22.66 CPU)
Result: FAIL

with quota
-------------
Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 3)
  Failed tests:  77, 84, 88
/opt/qa/tools/posix-testsuite/tests/rename/00.t (Wstat: 0 Tests: 79 Failed: 4)
  Failed tests:  64-65, 68, 79
/opt/qa/tools/posix-testsuite/tests/rename/05.t (Wstat: 0 Tests: 17 Failed: 5)
  Failed tests:  10, 12, 14, 16-17
Files=185, Tests=1962, 93 wallclock secs ( 1.19 usr 0.44 sys + 10.50 cusr 10.67 csys = 22.80 CPU)
Result: FAIL

This test was run on an NFS mount.

NFS logs
--------
[2013-10-06 13:28:23.305071] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-2: remote operation failed: Permission denied
[2013-10-06 13:28:23.305183] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-3: remote operation failed: Permission denied
[2013-10-06 13:28:23.305228] W [dht-rename.c:365:dht_rename_unlink_cbk] 0-dr-dht: /fstest_d9762303927458ebbc17aef634371c88/fstest_eb61a05a4a)
[2013-10-06 13:28:25.388261] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-4: remote operation failed: Permission denied
[2013-10-06 13:28:25.388349] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-5: remote operation failed: Permission denied
[2013-10-06 13:28:25.388386] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: a85a9972: rename /fstest_d9762303927458ebbc17aef634371c88/fstest_eb61)
[2013-10-06 13:28:26.429227] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-4: remote operation failed: Permission denied
[2013-10-06 13:28:26.429306] W [client-rpc-fops.c:638:client3_3_unlink_cbk] 0-dr-client-5: remote operation failed: Permission denied
[2013-10-06 13:28:26.429341] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: b15a9972: rename /fstest_d9762303927458ebbc17aef634371c88/fstest_eb61)
[2013-10-06 13:28:26.458371] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: b75a9972: /fstest_d9762303927458ebbc17aef634371c88 => -1 (Directory no)
[2013-10-06 13:28:28.162879] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: f75b9972: /fstest_0880cd50fc83aaafc68da6e61f8edd80/fstest_3ed6b6a07947)
[2013-10-06 13:28:28.174497] W [nfs3.c:3508:nfs3svc_rmdir_cbk] 0-nfs: fa5b9972: /fstest_0880cd50fc83aaafc68da6e61f8edd80 => -1 (Directory no)
[2013-10-06 13:28:32.489394] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: fe5d9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_)
[2013-10-06 13:28:32.489471] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: fe5d9972, RENAME: NFS: 66(Directory not empty), P)
[2013-10-06 13:28:32.524794] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: 85e9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_b)
[2013-10-06 13:28:32.524852] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: 85e9972, RENAME: NFS: 66(Directory not empty), PO)
[2013-10-06 13:28:32.563173] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: 125e9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_)
[2013-10-06 13:28:32.563247] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: 125e9972, RENAME: NFS: 66(Directory not empty), P)
[2013-10-06 13:28:32.594655] W [nfs3.c:3663:nfs3svc_rename_cbk] 0-nfs: 1b5e9972: rename /fstest_a5ac720430f17a8c6af3a085567f7439 -> /fstest_)
[2013-10-06 13:28:32.594712] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: 1b5e9972, RENAME: NFS: 66(Directory not empty), P)
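To separate the quota-specific failures from the ones that occur either way, the two prove summaries above can be diffed mechanically. This is a rough sketch: the helper names expand and parse_summary are made up here, the parser assumes the "Test Summary Report" layout that prove prints, and the input strings are copied from the two runs in this comment.

```python
import re


def expand(spec):
    """Expand a prove range list such as '64-65, 68, 79' into a set of ints."""
    nums = set()
    for part in re.split(r"[,\s]+", spec.strip()):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        nums.update(range(int(lo), int(hi or lo) + 1))
    return nums


def parse_summary(text):
    """Map each test file in a prove summary to its set of failed test numbers."""
    failures = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"(\S+\.t)\s+\(Wstat:", line)
        if m:
            current = m.group(1)
            failures[current] = set()
            continue
        m = re.match(r"Failed tests?:\s*(.*)", line)
        if m and current:
            failures[current] |= expand(m.group(1))
            continue
        # Continuation line of a wrapped "Failed tests:" list (digits, commas, dashes only).
        if current and re.fullmatch(r"[\d,\s-]+", line):
            failures[current] |= expand(line)
    return failures


WITHOUT_QUOTA = """\
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 3)
  Failed tests:  77, 84, 88
/opt/qa/tools/posix-testsuite/tests/rename/05.t (Wstat: 0 Tests: 17 Failed: 5)
  Failed tests:  10, 12, 14, 16-17
"""

WITH_QUOTA = """\
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 3)
  Failed tests:  77, 84, 88
/opt/qa/tools/posix-testsuite/tests/rename/00.t (Wstat: 0 Tests: 79 Failed: 4)
  Failed tests:  64-65, 68, 79
/opt/qa/tools/posix-testsuite/tests/rename/05.t (Wstat: 0 Tests: 17 Failed: 5)
  Failed tests:  10, 12, 14, 16-17
"""

base = parse_summary(WITHOUT_QUOTA)
quota = parse_summary(WITH_QUOTA)

# Failures that appear only when quota is enabled.
extra = {}
for test_file, failed in quota.items():
    only_with_quota = failed - base.get(test_file, set())
    if only_with_quota:
        extra[test_file] = only_with_quota

for test_file, failed in extra.items():
    print(test_file, sorted(failed))
```

On these two runs the only difference is rename/00.t tests 64, 65, 68 and 79; the chown/00.t and rename/05.t failures are common to both.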
This failure occurs because, when a rename happens, distribute splits the call into separate calls (mknod of the linkfile, link of the new name, unlink of the old name). The rename call as a whole therefore never goes through the posix-acl xlator, so some of the checks for rename are skipped.

I think this failure happens only on the NFS client. (Can you please check whether it happens on the fuse client?) When the tests are run on a fuse mount point, the kernel itself does the ACL check before the rename call reaches glusterfs, and returns the appropriate error to the application if the rename is not allowed. For the NFS client, unfortunately, the kernel does not do ACL checks and sends the call to glusterfs. When the call reaches the distribute xlator it is split into multiple calls (as explained in the paragraph above), which prevents the actual ACL check for the rename.

Loading a posix-acl xlator below the nfs xlator in the nfs-server volfile solves the issue (but I am not sure whether loading the posix-acl xlator below the nfs xlator is a good idea).
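The effect described above can be illustrated with a toy model. This is not GlusterFS code: AclCheck and Distribute are hypothetical stand-ins for the posix-acl and distribute xlators, and the model only records which operations each layer gets to see.

```python
class AclCheck:
    """Toy access-control layer: records every operation passed through it."""

    def __init__(self, below=None):
        self.seen = []
        self.below = below

    def call(self, fop, *paths):
        self.seen.append(fop)
        if self.below is not None:
            self.below.call(fop, *paths)


class Distribute:
    """Toy distribute layer: splits rename into the three calls named in this comment."""

    def __init__(self, below):
        self.below = below

    def call(self, fop, *paths):
        if fop == "rename":
            old, new = paths
            self.below.call("mknod", new)      # mknod of the linkfile
            self.below.call("link", old, new)  # link the new name
            self.below.call("unlink", old)     # unlink the old name
        else:
            self.below.call(fop, *paths)


# Current layout: the ACL check sits below distribute, so it never sees "rename".
brick_acl = AclCheck()
Distribute(brick_acl).call("rename", "/dir/old", "/dir/new")
print(brick_acl.seen)

# Suggested layout: an extra ACL check above distribute sees the rename as a whole.
lower_acl = AclCheck()
top_acl = AclCheck(below=Distribute(lower_acl))
top_acl.call("rename", "/dir/old", "/dir/new")
print(top_acl.seen)
```

In the first layout the access-control layer is only asked about mknod, link and unlink, so any rename-specific check has nothing to act on. In the second layout the upper layer sees the single rename, which is what loading posix-acl below the nfs xlator is meant to achieve.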
Moving the known issues to Doc team, to be documented in release notes for U1
I've documented this known issue in the BB U1 Release Notes. Here is the link: http://documentation-devel.engineering.redhat.com/docs/en-US/Red_Hat_Storage/2.1/html/2.1_Update_1_Release_Notes/chap-Documentation-2.1_Update_1_Release_Notes-Known_Issues.html
Patch submitted upstream: http://review.gluster.org/#/c/9419/
The above mentioned test cases are passing in 3.7 on a fuse mount. So, closing the bug.