Bug 1157705

Summary: BVT: Connectathon i.e. Cthon basic test fails over NFS when quota is enabled
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Lalatendu Mohanty <lmohanty>
Component: quota Assignee: Vijaikumar Mallikarjuna <vmallika>
Status: CLOSED ERRATA QA Contact: Saurabh <saujain>
Severity: urgent Docs Contact:
Priority: high    
Version: 2.1 CC: asrivast, fharshav, mzywusko, ndevos, rcyriac, rgowdapp, rhs-bugs, smohan, ssamanta, storage-qa-internal, vagarwal, vmallika
Target Milestone: --- Keywords: Regression, ZStream
Target Release: RHGS 2.1.5   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0.70rhs-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-11-13 12:23:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1147095, 1158435    

Description Lalatendu Mohanty 2014-10-27 14:26:53 UTC
Description of problem:

The Connectathon (Cthon) basic test suite fails intermittently in the link and rename test (test7) with the error below.

Error:

./test7: link and rename
	./test7: (/mnt/nfs-test/rhsauto056.test) file.0 exists after rename to newfile.0
basic tests failed
Tests failed, leaving /mnt/nfs-test mounted

Version-Release number of selected component (if applicable):
glusterfs.x86_64 0:3.4.0.69rhs-1.el6rhs

How reproducible:
Intermittent

Steps to Reproduce:
1. Create 6X2 volume
2. Enable quota on it
3. Start the volume 
4. On the client side, run Cthon (the test failed intermittently with both RHEL 6.6 and RHEL 7.0 clients)

command: server -b -o vers=3 -p <Volume> -m /mnt/nfs-test <RHS Node>

Actual results:


Starting BASIC tests: test directory /mnt/nfs-test/rhsauto056.test (arg: -t)

./test1: File and directory creation test
	created 155 files 62 directories 5 levels deep in 1.32 seconds
	./test1 ok.

./test2: File and directory removal test
	removed 155 files 62 directories 5 levels deep in 1.8  seconds
	./test2 ok.

./test3: lookups across mount point
	500 getcwd and stat calls in 0.0  seconds
	./test3 ok.

./test4: setattr, getattr, and lookup
	1000 chmods and stats on 10 files in 1.33 seconds
	./test4 ok.

./test5: read and write
	wrote 1048576 byte file 10 times in 2.3  seconds (5146387 bytes/sec)
	read 1048576 byte file 10 times in 0.1  seconds (891418855 bytes/sec)
	./test5 ok.

./test6: readdir
	20500 entries read, 200 files in 5.14 seconds
	./test6 ok.

./test7: link and rename
	./test7: (/mnt/nfs-test/rhsauto056.test) file.0 exists after rename to newfile.0
basic tests failed
Tests failed, leaving /mnt/nfs-test mounted


Expected results:

The test should not fail

Additional info:

Beaker Job: https://beaker.engineering.redhat.com/jobs/781261

Comment 2 Lalatendu Mohanty 2014-10-27 14:43:31 UTC
I have never seen this failure in previous releases, hence marking this as a regression.

Comment 3 Niels de Vos 2014-10-29 11:04:00 UTC
I was unable to reproduce this with the cthon/test7 running 1000x in a loop on a 6x2 distribute-replicate volume mounted over nfs (2 servers running glusterfs-server-3.4.0.69rhs-1.el6rhs.x86_64, NFS-client running RHEL-6.5).

Now, after enabling quota (without setting any limits), I actually hit the issue immediately. Even on a new 6x2 volume, the issue happens in the 1st run of test7.

Comment 4 Niels de Vos 2014-10-29 12:23:12 UTC
This also reproduces with a volume that consists of a single brick. That should be easier to debug :)

Comment 5 Vijaikumar Mallikarjuna 2014-10-29 12:47:17 UTC
We will hit this issue with the below test-case as well.
I will look at the test-case mentioned in the description to check if it does something similar to the test-case below:

1) Create a volume and enable quota with a 1GB usage limit
2) Create a 600MB file
3) Rename this file.
   The command reports success, but the rename fails with quota exceeded.
   Patch http://review.gluster.org/#/c/8940/ fixes the issue if the ancestor (which has the limit set) of the src and dest files is the same.
   

I think the problem here is that even though the rename failed at the back end, we are returning success to the client. I will investigate this further and update the bug again.
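
A client-side check that does not rely on the return code alone is to stat both names after the rename, which is also what the test7 error message ("file.0 exists after rename to newfile.0") suggests the test does. The following is a minimal sketch, not the Cthon source: the helper name rename_and_verify is hypothetical, the file names are borrowed from the test output, and the directory argument is assumed to be a path on the quota-enabled mount.

```python
import os
import tempfile


def rename_and_verify(directory, size=1024):
    """Create a file, rename it, and report what the client sees afterwards.

    Hypothetical helper for illustration only. Returns a dict saying
    whether the old and the new name exist after the rename call returned.
    """
    old = os.path.join(directory, "file.0")
    new = os.path.join(directory, "newfile.0")

    # Create the source file with `size` zero bytes.
    with open(old, "wb") as f:
        f.write(b"\0" * size)

    # Raises OSError if the failure is reported to the client.
    os.rename(old, new)

    # Do not trust the return code: look at both names again.
    return {
        "old_exists": os.path.exists(old),
        "new_exists": os.path.exists(new),
    }


if __name__ == "__main__":
    # A temporary local directory stands in for the NFS mount here.
    with tempfile.TemporaryDirectory() as d:
        print(rename_and_verify(d))
```

If the rename succeeded, the old name should be gone and the new name present; any other combination after a call that returned without error would point at the mismatch described above.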

Comment 8 Vijaikumar Mallikarjuna 2014-10-30 09:46:13 UTC
The issue occurs in the latest 2.1 code, but not in the latest upstream code.
I am investigating it in 2.1 code and will update the bug once I find something.

Comment 9 Vijaikumar Mallikarjuna 2014-10-30 12:10:24 UTC
Unlink on a file that has more than one link failed when quota is enabled, hence the problem.
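
The operation in question can be exercised in isolation with a short script. This is a sketch under assumptions, not the fix or the Cthon code: the helper name unlink_one_of_two_links is hypothetical, the file names are borrowed from the test output, and the directory argument is assumed to be a path on the quota-enabled mount.

```python
import os
import tempfile


def unlink_one_of_two_links(directory):
    """Hard-link a file, unlink one of the two names, and report link counts.

    Hypothetical helper for illustration only. Returns a tuple
    (nlink_before, nlink_after, removed_name_still_exists).
    """
    first = os.path.join(directory, "file.0")
    second = os.path.join(directory, "newfile.0")

    with open(first, "w") as f:
        f.write("data")

    # Give the file a second name, so its link count becomes 2.
    os.link(first, second)
    nlink_before = os.stat(first).st_nlink

    # Remove one name; the file itself must survive under the other name.
    os.unlink(second)
    nlink_after = os.stat(first).st_nlink

    return nlink_before, nlink_after, os.path.exists(second)


if __name__ == "__main__":
    # A temporary local directory stands in for the NFS mount here.
    with tempfile.TemporaryDirectory() as d:
        print(unlink_one_of_two_links(d))
```

On a filesystem that handles hard links correctly, the link count drops from 2 to 1 and the removed name no longer exists.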

Comment 10 Vijaikumar Mallikarjuna 2014-10-31 09:47:20 UTC
Patch submitted: https://code.engineering.redhat.com/gerrit/35825

Comment 11 Saurabh 2014-11-10 12:52:58 UTC
Executed the Cthon test suite with quota enabled and limits set.

The issue did not occur on the build glusterfs-3.4.0.70rhs-1.el6rhs.x86_64.

Comment 13 errata-xmlrpc 2014-11-13 12:23:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1853.html