Description of problem:

The quota hard limit is not being enforced correctly in upstream GlusterFS 3.5 and the git master branch: Gluster does not stop writing after the hard quota limit is reached.

This shows up when running tests/basic/quota.t as a failure in test 64 or 65 (depending on the version of the test script). The problem isn't consistent though. Sometimes several runs of quota.t (e.g. 3 or 4 runs) will work perfectly, then the next few will fail, then it will start working again for a while, then start failing again, and so on.

Modifying the normal quota.t so it shows each iteration of the problem loop makes things very obvious.

Here's an example of things working as they should:

**************************
$ sudo tests/basic/quota.t
<snip>
ok 61 gluster --mode=script --wignore volume add-brick patchy justin-f19-test1:/d/backends/brick3 justin-f19-test1:/d/backends/brick4
ok 62 gluster --mode=script --wignore volume rebalance patchy start
ok 63
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile1
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile2
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile3
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile4
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile5
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile6
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile7
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile8
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile9
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile10
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile11
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile12
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile13
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile14
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile15
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile16
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile17
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile18
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile19
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile20
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile21
<snipped further failures until end of loop>
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile194
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile195
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile196
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile197
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile198
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile199
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile200
[ 1024 -le 1126 ]
ok 64
<snip>
**************************

When the limit doesn't work correctly, this happens:

**************************
$ sudo tests/basic/quota.t
<snip>
ok 61 gluster --mode=script --wignore volume add-brick patchy justin-f19-test1:/d/backends/brick3 justin-f19-test1:/d/backends/brick4
ok 62 gluster --mode=script --wignore volume rebalance patchy start
ok 63
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile1
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile2
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile3
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile4
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile5
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile6
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile7
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile8
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile9
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile10
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile11
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile12
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile13
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile14
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile15
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile16
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile17
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile18
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile19
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile20
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile21
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile22
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile23
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile24
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile25
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile26
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile27
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile28
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile29
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile30
<snipping lots more success/failures>
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile192
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile193
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile194
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile195
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile196
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile197
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile198
WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile199
WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile200
[ 1664 -le 1126 ]
not ok 64
<snip>
**************************

I've created a Fedora 19 server in Rackspace which shows up the problem, so Varun and the team can investigate as needed (details sent in private email already).
Version-Release number of selected component (if applicable):
Upstream GlusterFS git release-3.5 and master branches, as of Mon 17th March 2014.

How reproducible:
Inconsistent, but running the modified quota.t file on either the release-3.5 or master branch in a loop (say 10 times) will exhibit the problem occasionally.

Steps to Reproduce:
1. Build RPMs from release-3.5 or master.
2. Copy the attached modified quota.t test file over the existing tests/basic/quota.t.
3. From the root of the checked-out glusterfs git repo, run:
   $ sudo tests/basic/quota.t
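Because the failure is intermittent, it helps to run the test repeatedly and tally outcomes rather than trusting a single pass. A minimal sketch of such a harness (the helper name and run count are illustrative, not part of the test suite):

```python
import subprocess

def tally_runs(cmd, runs=10):
    """Run cmd `runs` times; return (passes, failures) judged by exit status."""
    passes = failures = 0
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode == 0:
            passes += 1
        else:
            failures += 1
    return passes, failures

# Example invocation (requires a built gluster tree):
# passes, failures = tally_runs(["sudo", "tests/basic/quota.t"], runs=10)
```

A mix of passes and failures across the loop is what points at a race rather than a deterministic bug.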
Created attachment 875447 [details] quota.t from master branch, modified to show the output from each iteration of the hard limit testing loop
Confirmed in email from Varun:

***************************************************************
Hey Justin,

I found the cause of why it happens. The problem lies in the DHT/rebalance code.

Cause: trusted.glusterfs.quota.limit-set is the quota configuration xattr; quota stores it on the directories that have quota limits. When there is an add-brick (as in my test case, quota.t), it is DHT's responsibility to copy the xattrs to the directory created on the newly added brick. But DHT/rebalance fails to heal them for some reason, which I suspect is md-cache or readdirp suppressing lookups on the directory. I need to contact the DHT guys for further queries/solutions on this.

Cause of the intermittent failures and successes: whichever file hashes to the newly added brick, which doesn't have the configuration xattr, is allowed to write, while the old bricks deny the writes.

- Varun Shastry

PS: I tried the same without the add-brick around 10 times and it works fine.
***************************************************************
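The interleaved SUCCEEDED/FAILED pattern in the logs follows directly from that explanation: DHT hashes each file name to a brick, and only bricks carrying the limit-set xattr reject writes past the limit. A toy simulation of that effect (brick count, the stand-in hash, and the sizes are all made up for illustration; this is not Gluster's actual hashing or enforcement code):

```python
# Toy model: 4 bricks. Bricks 0-1 are old (quota xattr present); bricks 2-3
# were just added and are missing trusted.glusterfs.quota.limit-set.
FILE_MB = 10
LIMIT_MB = 120

def brick_for(name, nbricks=4):
    # Stand-in for DHT's name hash: any deterministic spread works here.
    return sum(map(ord, name)) % nbricks

def simulate(enforcing_bricks, nfiles=20):
    """Write nfiles files; only enforcing bricks honour the hard limit."""
    used = 0
    results = []
    for i in range(1, nfiles + 1):
        name = f"10MBfile{i}"
        if brick_for(name) in enforcing_bricks and used + FILE_MB > LIMIT_MB:
            results.append((name, "FAILED"))
        else:
            results.append((name, "SUCCEEDED"))
            used += FILE_MB
    return used, results

# All bricks enforcing: usage stops exactly at the limit.
healthy_used, _ = simulate(enforcing_bricks={0, 1, 2, 3})
# New bricks missing the xattr: writes hashing there sail past the limit.
broken_used, _ = simulate(enforcing_bricks={0, 1})
```

In the broken case the total climbs well past the limit while individual writes still fail intermittently, which matches the `[ 1664 -le 1126 ]` check failing in the test output.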
As a data point, this problem doesn't seem to exist in 3.4. I tested the quota.t checking code in a loop, and no instances of the problem showed up.
Comment #2 is based on the assumption that DHT heals the xattrs in its fop call path. After reading the code, I found that only the rebalance code heals the xattrs. So, in the quota.t test script, the rebalance completes before the further writes only some of the time. Whenever it fails to heal the xattrs before the next write instructions in the script, we see the quota exceeding the limit. The ideal solution is to wait for the rebalance to complete before starting further writes, so I'm modifying the script to wait for the rebalance process to complete.
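The fix is a poll-until-complete wait: keep checking rebalance status until it reports completion before issuing the next writes, instead of assuming rebalance already finished. A generic sketch of that kind of wait (the helper name, timeout, and the `rebalance_status` call in the comment are illustrative, not the actual test-framework code):

```python
import time

def wait_until(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns True or the timeout elapses.
    Returns True on success, False if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one final check at the deadline

# In the real test the predicate would parse
# `gluster volume rebalance <vol> status` for a "completed" state, e.g.:
# wait_until(lambda: rebalance_status("patchy") == "completed", timeout=120)
```

Bounding the wait with a timeout keeps the test from hanging forever if rebalance itself is broken.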
REVIEW: http://review.gluster.org/7380 (tests/quota: Wait till the rebalance is complete) posted (#1) for review on master by Varun Shastry (vshastry)
COMMIT: http://review.gluster.org/7380 committed in master by Vijay Bellur (vbellur)

------
commit 1c1b8269d994c0885d753c8f0da8d5154876c7ae
Author: Varun Shastry <vshastry>
Date:   Tue Mar 25 09:36:45 2014 +0530

    tests/quota: Wait till the rebalance is complete

    Change-Id: Ia6f0c81fb1542ce1de965a69a61535691df056c3
    BUG: 1077159
    Signed-off-by: Varun Shastry <vshastry>
    Reviewed-on: http://review.gluster.org/7380
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Justin Clift <justin>
    Tested-by: Justin Clift <justin>
    Reviewed-by: Vijay Bellur <vbellur>
We do not seem to have a backport to the release-3.5 branch yet? Moving to glusterfs-3.5.1 and setting to ASSIGNED.
Justin, I do not think this issue applies to 3.5. The quota.t test in 3.5 does not seem to have the add-brick scenario. If you really have this problem with quota.t on 3.5, please file a separate bug; this one is now only for the master branch. Thanks!
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether this release solves this bug report for you.

If the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users