Bug 1077159 - Quota hard limit not being enforced consistently
Summary: Quota hard limit not being enforced consistently
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: quota
Version: 3.5.0
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nagaprasad Sathyanarayana
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-03-17 11:39 UTC by Justin Clift
Modified: 2016-02-18 00:20 UTC (History)
5 users

Fixed In Version: glusterfs-3.6.0beta1
Clone Of:
Environment:
Last Closed: 2014-11-11 08:28:47 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
quota.t from master branch, modified to show the output from each iteration of the hard limit testing loop (5.15 KB, application/x-shellscript)
2014-03-17 11:42 UTC, Justin Clift

Description Justin Clift 2014-03-17 11:39:09 UTC
Description of problem:

  The quota hard limit is not being enforced correctly in upstream GlusterFS 3.5
  and git master branch.

  The bug means Gluster does not reliably stop writes after the hard quota
  limit is reached. :(

  This shows up when running tests/basic/quota.t as a failure in test 64 or 65
  (depending on version of test script).

  The problem isn't consistent though.  Sometimes several runs of quota.t (e.g.
  3 or 4 runs) will work perfectly, then the next few will fail, then it'll
  start working again for a while, then start failing again, and so on.

  Modifying the normal quota.t so it shows the result of each iteration of the
  problem loop makes things very obvious (a sketch of the modified loop follows).
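
  A minimal sketch of the modified loop (sizes, count, and paths mirror the
  output below; the exact dd invocation used by the real script may differ):

    # Write 200 x 10 MB files under the quota-limited directory and report
    # each result, instead of only the final pass/fail.
    for i in $(seq 1 200); do
        file=/mnt/glusterfs/0/addbricktest/dir1/10MBfile$i
        if dd if=/dev/zero of=$file bs=1M count=10 >/dev/null 2>&1; then
            echo "WRITING SUCCEEDED: $file"
        else
            echo "WRITING FAILED: $file"
        fi
    done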

  Here's an example of things working as they should:

  **************************
  $ sudo tests/basic/quota.t
  <snip>
  ok 61
  gluster --mode=script --wignore volume add-brick patchy justin-f19-test1:/d/backends/brick3 justin-f19-test1:/d/backends/brick4
  ok 62
  gluster --mode=script --wignore volume rebalance patchy start
  ok 63
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile1
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile2
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile3
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile4
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile5
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile6
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile7
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile8
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile9
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile10
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile11
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile12
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile13
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile14
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile15
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile16
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile17
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile18
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile19
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile20
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile21
  <snipped further failures until end of loop>
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile194
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile195
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile196
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile197
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile198
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile199
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile200
  [ 1024 -le 1126 ]
  ok 64
  <snip>
  **************************

  When the limit isn't enforced correctly, this happens:

  **************************
  $ sudo tests/basic/quota.t
  <snip>
  ok 61
  gluster --mode=script --wignore volume add-brick patchy justin-f19-test1:/d/backends/brick3 justin-f19-test1:/d/backends/brick4
  ok 62
  gluster --mode=script --wignore volume rebalance patchy start
  ok 63
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile1
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile2
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile3
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile4
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile5
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile6
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile7
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile8
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile9
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile10
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile11
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile12
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile13
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile14
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile15
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile16
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile17
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile18
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile19
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile20
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile21
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile22
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile23
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile24
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile25
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile26
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile27
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile28
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile29
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile30
  <snipping lots more success/failures>
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile192
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile193
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile194
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile195
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile196
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile197
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile198
  WRITING SUCCEEDED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile199
  WRITING FAILED: /mnt/glusterfs/0/addbricktest/dir1/10MBfile200
  [ 1664 -le 1126 ]
  not ok 64
  <snip>
  **************************
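
  The "[ 1024 -le 1126 ]" / "[ 1664 -le 1126 ]" lines are the test's closing
  assertion: the directory's measured usage in MB must stay at or below the
  1 GB hard limit plus a small allowance.  Roughly (the way the real script
  measures usage may differ):

    # Hard limit is 1 GB; allow a small overshoot margin (1126 MB here).
    USAGE_MB=$(du -sm /mnt/glusterfs/0/addbricktest/dir1 | awk '{print $1}')
    [ "$USAGE_MB" -le 1126 ]    # 1024 in the good run, 1664 in the bad run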

  I've created a Fedora 19 server in Rackspace which shows up the problem, so
  Varun and the team can investigate as needed.  (Details sent in private
  email already.)


Version-Release number of selected component (if applicable):

  Upstream GlusterFS git release-3.5 and master branches, as of Mon 17th
  March 2014.


How reproducible:

  Inconsistent.  But running the modified quota.t file on either the release-3.5
  or master branch in a loop (say 10 times) will exhibit the problem
  occasionally (a loop along those lines is sketched after the reproduction
  steps below).


Steps to Reproduce:
1. Build RPMs from release-3.5 or master.
2. Copy the attached modified quota.t test file over the existing tests/basic/quota.t.
3. From the root of the checked out glusterfs git repo, run:

   $ sudo tests/basic/quota.t
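
   Since the failure is intermittent, running the test repeatedly makes it
   easier to catch, for example:

   # Intermittent failures show up as "not ok 64" (or 65, depending on the
   # version of the test script).
   for i in $(seq 1 10); do
       echo "=== run $i ==="
       sudo tests/basic/quota.t
   done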

Comment 1 Justin Clift 2014-03-17 11:42:24 UTC
Created attachment 875447 [details]
quota.t from master branch, modified to show the output from each iteration of the hard limit testing loop

Comment 2 Justin Clift 2014-03-19 10:55:53 UTC
Confirmed in email from Varun:

***************************************************************

Hey Justin,

I found the cause of why it happens.  The problem lies in the DHT/rebalance code.

Cause:
trusted.glusterfs.quota.limit-set is the quota configuration xattr, which is stored on the directories that have quota limits.  When there is an add-brick (as in my quota.t test case), it is DHT's responsibility to copy this xattr to the directory created on the newly added brick.  But DHT/rebalance fails to heal it for some reason, which I suspect is md-cache or readdirp suppressing lookups on the directory.  I need to contact the DHT guys for further queries/solutions on this.

Cause of the intermittent failures and successes:
Whichever file hashes to the newly added brick, which doesn't have the configuration xattr, is allowed to be written; the old bricks deny the writes.

- Varun Shastry

PS: I tried the same without the add-brick around 10 times and it's working fine.

***************************************************************
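
Whether the limit-set xattr actually made it onto the new brick's copy of the directory can be checked directly on the brick backends; for example (the brick and directory paths here match the test output above, but are otherwise illustrative):

    # trusted.* xattrs are only visible to root.
    sudo getfattr -d -m . -e hex /d/backends/brick3/addbricktest/dir1
    # A healthy directory shows trusted.glusterfs.quota.limit-set among the
    # xattrs; on an unhealed brick the entry is missing.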

Comment 3 Justin Clift 2014-03-21 13:40:08 UTC
As a data point, this problem doesn't seem to exist in 3.4.  Tested the quota.t checking code in a loop, and no instances of the problem showed up.

Comment 4 vpshastry 2014-04-02 07:53:20 UTC
Comment #2 is based on the assumption that DHT heals the xattrs in its fop call path.  After reading the code, I found that only the rebalance code heals the xattrs.
So, in the quota.t test script, the rebalance completes before the subsequent writes only some of the time.  Whenever it fails to heal the xattrs before the next write instructions in the script, we see writes exceeding the quota limit.  The right fix is to wait for the rebalance to complete before starting further writes.
So I'm modifying the script to wait for the rebalance process to complete.
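
A minimal sketch of that kind of wait (the exact helper used in the test framework may differ; this simply polls the rebalance status):

    # Poll until the rebalance on the given volume reports "completed",
    # giving up after roughly 60 seconds.
    wait_for_rebalance () {
        local vol=$1 tries=60
        while [ $tries -gt 0 ]; do
            gluster volume rebalance "$vol" status | grep -q completed && return 0
            sleep 1
            tries=$((tries - 1))
        done
        return 1
    }

    wait_for_rebalance patchy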

Comment 5 Anand Avati 2014-04-02 07:53:54 UTC
REVIEW: http://review.gluster.org/7380 (tests/quota: Wait till the rebalance is complete) posted (#1) for review on master by Varun Shastry (vshastry)

Comment 6 Anand Avati 2014-04-02 14:53:24 UTC
COMMIT: http://review.gluster.org/7380 committed in master by Vijay Bellur (vbellur) 
------
commit 1c1b8269d994c0885d753c8f0da8d5154876c7ae
Author: Varun Shastry <vshastry>
Date:   Tue Mar 25 09:36:45 2014 +0530

    tests/quota: Wait till the rebalance is complete
    
    Change-Id: Ia6f0c81fb1542ce1de965a69a61535691df056c3
    BUG: 1077159
    Signed-off-by: Varun Shastry <vshastry>
    Reviewed-on: http://review.gluster.org/7380
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Justin Clift <justin>
    Tested-by: Justin Clift <justin>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 7 Niels de Vos 2014-04-17 14:38:20 UTC
We do not seem to have a backport to the release-3.5 branch yet? Moving to glusterfs-3.5.1 and setting to ASSIGNED.

Comment 8 Niels de Vos 2014-05-20 15:32:44 UTC
Justin, I do not think this issue applies to 3.5.  The quota.t test in 3.5 does not seem to have the add-brick scenario.  If you really have this problem with quota.t on 3.5, please file a separate bug; this one is now only for the master branch.  Thanks!

Comment 10 Niels de Vos 2014-09-22 12:36:51 UTC
A beta release for GlusterFS 3.6.0 has been released [1].  Please verify if the release solves this bug report for you.  In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED.  If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future.  Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 11 Niels de Vos 2014-11-11 08:28:47 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future.  Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

