Description of problem:
The Connectathon (Cthon) basic test suite fails intermittently in test7 (link and rename) with the error below.

Error:
./test7: link and rename
	./test7: (/mnt/nfs-test/rhsauto056.test) file.0 exists after rename to newfile.0
basic tests failed
Tests failed, leaving /mnt/nfs-test mounted

Version-Release number of selected component (if applicable):
glusterfs.x86_64 0:3.4.0.69rhs-1.el6rhs

How reproducible:
Intermittent

Steps to Reproduce:
1. Create a 6x2 volume
2. Enable quota on it
3. Start the volume
4. On the client side, run Cthon (the test failed intermittently with both RHEL 6.6 and RHEL 7.0 as client)
   command: server -b -o vers=3 -p <Volume> -m /mnt/nfs-test <RHS Node>

Actual results:
Starting BASIC tests: test directory /mnt/nfs-test/rhsauto056.test (arg: -t)
./test1: File and directory creation test
	created 155 files 62 directories 5 levels deep in 1.32 seconds
	./test1 ok.
./test2: File and directory removal test
	removed 155 files 62 directories 5 levels deep in 1.8 seconds
	./test2 ok.
./test3: lookups across mount point
	500 getcwd and stat calls in 0.0 seconds
	./test3 ok.
./test4: setattr, getattr, and lookup
	1000 chmods and stats on 10 files in 1.33 seconds
	./test4 ok.
./test5: read and write
	wrote 1048576 byte file 10 times in 2.3 seconds (5146387 bytes/sec)
	read 1048576 byte file 10 times in 0.1 seconds (891418855 bytes/sec)
	./test5 ok.
./test6: readdir
	20500 entries read, 200 files in 5.14 seconds
	./test6 ok.
./test7: link and rename
	./test7: (/mnt/nfs-test/rhsauto056.test) file.0 exists after rename to newfile.0
basic tests failed
Tests failed, leaving /mnt/nfs-test mounted

Expected results:
The test should not fail.

Additional info:
Beaker Job: https://beaker.engineering.redhat.com/jobs/781261
Cthon failure on the client side (grep for "fail"):
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2014/10/7812/781261/1628404/25400270/TESTOUT.log

NFS log from the server side:
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2014/10/7812/781261/1628401/25400231/nfs.log
I have never seen this failure in previous releases, hence marking this as a regression.
I was unable to reproduce this with cthon/test7 running 1000x in a loop on a 6x2 distribute-replicate volume mounted over NFS (2 servers running glusterfs-server-3.4.0.69rhs-1.el6rhs.x86_64, NFS client running RHEL-6.5). After enabling quota (without setting any limits), I hit the issue immediately. Even on a new 6x2 volume, the issue happens in the first run of test7.
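For anyone who wants to repeat the loop run described above, here is a minimal sketch of a "run until first failure" helper. It only assumes the test binary exits non-zero on failure; the `./test7` path in the usage comment and the working directory are hypothetical and depend on where the Cthon suite was built and mounted.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=1000, cwd=None):
    """Run cmd up to max_runs times.

    Returns the 1-based index of the first run that exits non-zero,
    or None if every run exits with status 0.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(
            cmd, cwd=cwd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        if result.returncode != 0:
            # Show the output of the failing run only.
            sys.stdout.write(result.stdout.decode(errors="replace"))
            return attempt
    return None


# Hypothetical usage (paths are assumptions, adjust to the local setup):
#   failed_at = run_until_failure(["./test7"], max_runs=1000,
#                                 cwd="/path/to/cthon04/basic")
#   print("first failing run:", failed_at)
```

Comparing the returned run index with quota disabled and enabled is one way to see the difference reported in this comment.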
This also reproduces on a volume that consists of a single brick. That should be easier to debug :)
We will hit this issue with the test case below as well. I will look at the test case mentioned in the description to check whether it does something similar.

1) Create a volume and enable quota with a 1GB usage limit
2) Create a 600MB file
3) Rename this file. The command reports success, but the rename fails with quota exceeded.

Patch http://review.gluster.org/#/c/8940/ fixes the issue if the ancestor (which has the limit set) of the src and dest files is the same. I think the problem here is that even though the rename failed at the back end, we are returning success to the client. I will investigate further and update the bug again.
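The client-side part of the test case above (steps 2 and 3) can be sketched roughly as follows. This is only an illustration: the directory, file names and the helper `create_and_rename` are hypothetical, and the quota setup from step 1 is assumed to have been done on the server beforehand. The file is written with real data rather than created sparse, on the assumption that only written bytes count against the limit.

```python
import os
import tempfile

CHUNK = 1024 * 1024  # write in 1 MiB pieces


def create_and_rename(directory, size_bytes,
                      src_name="bigfile", dst_name="bigfile.renamed"):
    """Create a file of size_bytes in directory, rename it, and report
    (source_still_exists, destination_exists) as seen by the client."""
    src = os.path.join(directory, src_name)
    dst = os.path.join(directory, dst_name)
    remaining = size_bytes
    with open(src, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(b"\0" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
    os.rename(src, dst)  # raises OSError if the client is told it failed
    return os.path.exists(src), os.path.exists(dst)


if __name__ == "__main__":
    # Local dry run in a temporary directory with a small file.
    # On the real setup, pass the mount point and 600 * 1024 * 1024 instead.
    with tempfile.TemporaryDirectory() as d:
        print(create_and_rename(d, 4 * CHUNK))
```

On a correctly behaving filesystem the result is `(False, True)`. If `os.rename` returns without raising but the result is anything else, the client was told the rename succeeded although it did not take effect, which is the mismatch suspected in this comment.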
The issue happens with the latest 2.1 code but not with the latest upstream code. I am investigating it in the 2.1 code and will update the bug once I find something.
Root cause: with quota enabled, unlink fails on a file that has more than one link.
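One plausible way a failed unlink turns into the "file.0 exists after rename to newfile.0" message: per POSIX, rename() does nothing and returns success when the old and new names are hard links to the same file. The sketch below shows this on a local filesystem. The sequence (rename, link the old name back, unlink the new name, repeat) is an approximation of test7 inferred from its name and error message, not a copy of the Cthon source, and the `simulate_failed_unlink` flag merely stands in for the quota-related unlink failure.

```python
import os
import tempfile


def old_name_survives_rename(workdir, simulate_failed_unlink):
    """Run two link-and-rename passes on file.0 / newfile.0.

    Returns True if file.0 still exists after the second rename.
    """
    old = os.path.join(workdir, "file.0")
    new = os.path.join(workdir, "newfile.0")
    with open(old, "w"):
        pass

    # Pass 1: rename, then recreate the old name as a hard link.
    os.rename(old, new)
    os.link(new, old)
    if not simulate_failed_unlink:
        os.unlink(new)  # normal case: drop the extra link again
    # If the unlink "failed", both names still refer to the same inode.

    # Pass 2: rename again and check whether the old name is gone.
    os.rename(old, new)
    return os.path.exists(old)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ok_dir:
        print("unlink worked:", old_name_survives_rename(ok_dir, False))
    with tempfile.TemporaryDirectory() as bad_dir:
        print("unlink failed:", old_name_survives_rename(bad_dir, True))
```

When the unlink is skipped, the second rename is a no-op and file.0 is still present, which matches the symptom in the report.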
Patch submitted: https://code.engineering.redhat.com/gerrit/35825
Executed the Cthon test suite with quota enabled and limits set. The issue did not occur on the build glusterfs-3.4.0.70rhs-1.el6rhs.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2014-1853.html