Description of problem:
If I set a small hard limit, say 1MB, on a directory and then start creating data in a for loop, the limit can be crossed within that same loop.

[root@rhsauto032 ~]# gluster volume info dist-rep3

Volume Name: dist-rep3
Type: Distributed-Replicate
Volume ID: 6aaeda5c-b6f6-42c2-8003-b4035f62085b
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d1r1-3
Brick2: rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d1r2-3
Brick3: rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3
Brick4: rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3
Brick5: rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d3r1-3
Brick6: rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d3r2-3
Brick7: rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d4r1-3
Brick8: rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d4r2-3
Brick9: rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d5r1-3
Brick10: rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d5r2-3
Brick11: rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d6r1-3
Brick12: rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d6r2-3
Options Reconfigured:
features.alert-time: 10s
features.quota: on

[root@rhsauto032 ~]# gluster volume status dist-rep3
Status of volume: dist-rep3
Gluster process                                              Port    Online  Pid
------------------------------------------------------------------------------
Brick rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d1r1-3   49167   Y       11530
Brick rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d1r2-3   49170   Y       31979
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3   49170   Y       13829
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3   49170   Y       13832
Brick rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d3r1-3   49168   Y       11541
Brick rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d3r2-3   49171   Y       31990
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d4r1-3   49171   Y       13840
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d4r2-3   49171   Y       13843
Brick rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d5r1-3   49169   Y       11552
Brick rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d5r2-3   49172   Y       32001
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d6r1-3   49172   Y       13851
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d6r2-3   49172   Y       13854
NFS Server on localhost                                      2049    Y       13561
Self-heal Daemon on localhost                                N/A     Y       11574
Quota Daemon on localhost                                    N/A     Y       15121
NFS Server on rhsauto034.lab.eng.blr.redhat.com              2049    Y       15301
Self-heal Daemon on rhsauto034.lab.eng.blr.redhat.com        N/A     Y       13871
Quota Daemon on rhsauto034.lab.eng.blr.redhat.com            N/A     Y       16194
NFS Server on rhsauto033.lab.eng.blr.redhat.com              2049    Y       1024
Self-heal Daemon on rhsauto033.lab.eng.blr.redhat.com        N/A     Y       32021
Quota Daemon on rhsauto033.lab.eng.blr.redhat.com            N/A     Y       2024
NFS Server on rhsauto035.lab.eng.blr.redhat.com              2049    Y       15293
Self-heal Daemon on rhsauto035.lab.eng.blr.redhat.com        N/A     Y       13881
Quota Daemon on rhsauto035.lab.eng.blr.redhat.com            N/A     Y       16315

There are no active volume tasks

Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-fuse-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-3.4.0.20rhsquota1-1.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a volume of type 6x2 and start it.
2. Enable quota.
3. Mount the volume over NFS.
4. Create a directory.
5. Set a limit of 1MB on the directory.
6. Start creating data in the directory:
   # for i in {21..1040}; do dd if=/dev/urandom of=file$i bs=100KB count=1; done

Actual results:
[root@rhsauto032 ~]# gluster volume quota dist-rep3 list /dir1
                  Path                   Hard-limit Soft-limit   Used  Available
--------------------------------------------------------------------------------
/dir1                                      1.0MB       80%      2.0MB   0Bytes

Expected results:
The limit should not be crossed, for any value of hard limit set.

Additional info:
The same behaviour has not been seen so far for larger values such as 1GB.
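For reference, a condensed command sketch of the Steps to Reproduce, using the volume layout shown above (the mount point /mnt/dist-rep3, the choice of NFS server and the mount options are placeholders; exact quota CLI syntax may differ slightly between builds):

    gluster volume create dist-rep3 replica 2 \
        rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d1r1-3 rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d1r2-3 \
        rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3 rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3 \
        rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d3r1-3 rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d3r2-3 \
        rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d4r1-3 rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d4r2-3 \
        rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d5r1-3 rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d5r2-3 \
        rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d6r1-3 rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d6r2-3
    gluster volume start dist-rep3
    gluster volume quota dist-rep3 enable
    mount -t nfs -o vers=3 rhsauto032.lab.eng.blr.redhat.com:/dist-rep3 /mnt/dist-rep3   # gluster NFS is v3
    mkdir /mnt/dist-rep3/dir1
    gluster volume quota dist-rep3 limit-usage /dir1 1MB        # 1MB hard limit on the directory
    cd /mnt/dist-rep3/dir1
    for i in {21..1040}; do dd if=/dev/urandom of=file$i bs=100KB count=1; done   # keep writing ~100KB files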
xattrs from node1:

[root@rhsauto032 ~]# getfattr -m . -d -e hex /rhs/bricks/d1r1-3/dir1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d1r1-3/dir1/
trusted.afr.dist-rep3-client-0=0x000000000000000000000000
trusted.afr.dist-rep3-client-1=0x000000000000000000000000
trusted.gfid=0x4ec9311df3a8434db0c17c823555cdc3
trusted.glusterfs.dht=0x00000001000000007ffffffeaaaaaaa7
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000018800
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000001000000000000000000000
trusted.glusterfs.quota.size=0x0000000000018800

[root@rhsauto032 ~]# getfattr -m . -d -e hex /rhs/bricks/d3r1-3/dir1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d3r1-3/dir1/
trusted.afr.dist-rep3-client-4=0x000000000000000000000000
trusted.afr.dist-rep3-client-5=0x000000000000000000000000
trusted.gfid=0x4ec9311df3a8434db0c17c823555cdc3
trusted.glusterfs.dht=0x0000000100000000d5555552ffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x00000000000e5800
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000001000000000000000000000
trusted.glusterfs.quota.size=0x00000000000e5800

[root@rhsauto032 ~]# getfattr -m . -d -e hex /rhs/bricks/d5r1-3/dir1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d5r1-3/dir1/
trusted.gfid=0x4ec9311df3a8434db0c17c823555cdc3
trusted.glusterfs.dht=0x00000001000000002aaaaaaa55555553
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000001000000000000000000000
trusted.glusterfs.quota.size=0x0000000000000000

xattrs from node3:

[root@rhsauto034 bricks]# getfattr -m . -d -e hex /rhs/bricks/d2r1-3/dir1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d2r1-3/dir1/
trusted.afr.dist-rep3-client-2=0x000000000000000000000000
trusted.afr.dist-rep3-client-3=0x000000000000000000000000
trusted.gfid=0x4ec9311df3a8434db0c17c823555cdc3
trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x000000000010d800
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000001000000000000000000000
trusted.glusterfs.quota.size=0x000000000010d800

[root@rhsauto034 bricks]# getfattr -m . -d -e hex /rhs/bricks/d4r1-3/dir1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d4r1-3/dir1/
trusted.gfid=0x4ec9311df3a8434db0c17c823555cdc3
trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000001000000000000000000000
trusted.glusterfs.quota.size=0x0000000000000000

[root@rhsauto034 bricks]# getfattr -m . -d -e hex /rhs/bricks/d6r1-3/dir1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d6r1-3/dir1/
trusted.gfid=0x4ec9311df3a8434db0c17c823555cdc3
trusted.glusterfs.dht=0x0000000100000000555555547ffffffd
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000001000000000000000000000
trusted.glusterfs.quota.size=0x0000000000000000
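To relate these xattrs to the 2.0MB "Used" value in the report, the trusted.glusterfs.quota.size values can be decoded from hex. A minimal sketch, using the three non-zero values from the dumps above (bash printf accepts 0x-prefixed hex for %d):

    for v in 0x0000000000018800 0x00000000000e5800 0x000000000010d800; do
        printf '%s -> %d bytes\n' "$v" "$v"     # decode each quota.size xattr
    done
    # 0x18800  -> 100352  bytes  (d1r1-3)
    # 0xe5800  -> 940032  bytes  (d3r1-3)
    # 0x10d800 -> 1103872 bytes  (d2r1-3)
    # total    =  2144256 bytes, i.e. roughly the 2.0MB "Used" that
    # 'gluster volume quota dist-rep3 list /dir1' reports above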
Varun,

Can you check whether this behaviour is seen:
1. in a single-brick setup, and
2. with the hard and soft timeouts of quota set to zero?

The reason for the above two tests:
1. In a distributed setup, there is a time window between the moment a brick reports its size and the moment the aggregated size of the directory from all bricks reaches the quota enforcer. Writes that happen on the other bricks (apart from the one where the enforcer requested the aggregated size) during this window are not accounted for by the enforcer. Test 1 helps us confirm whether we are hitting this issue.
2. Because the enforcer caches sizes for the timeout period, it can miss writes happening on other nodes. Tests 1 and 2 together will help us find out whether this is the cause.

regards,
Raghavendra.
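For reference, the two checks above could be run roughly as follows (hostname, brick path and the volume name "single" are placeholders; the timeout option names are the ones that appear under "Options Reconfigured" in the verification run further below):

    # Test 1: single-brick volume, then repeat the reproduction steps against it
    gluster volume create single rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/single1
    gluster volume start single
    gluster volume quota single enable

    # Test 2: on the original volume, drop the quota cache timeouts to zero
    gluster volume set dist-rep3 features.hard-timeout 0
    gluster volume set dist-rep3 features.soft-timeout 0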
Is this issue still seen in Build 3? Some fixes to accounting have gone in Build 3.
When I tried the same steps with soft-timeout and hard-timeout set to 0, the test case works as expected. Output from my run:

root@pranith-vm2 - /mnt/r2/dir1
03:36:03 :) ⚡ for i in {21..1040}; do dd if=/dev/urandom of=file$i bs=100KB count=1; done
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0642977 s, 1.6 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.061704 s, 1.6 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0615557 s, 1.6 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0660609 s, 1.5 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0660485 s, 1.5 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0617839 s, 1.6 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0645328 s, 1.5 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0661677 s, 1.5 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0650025 s, 1.5 MB/s
1+0 records in
1+0 records out
100000 bytes (100 kB) copied, 0.0658353 s, 1.5 MB/s
dd: closing output file `file31': Disk quota exceeded
....... the rest also fail with "Disk quota exceeded".

root@pranith-vm2 - /mnt/r2/dir1
03:36:25 :( ⚡ gluster volume quota r2 list
                  Path                   Hard-limit Soft-limit   Used  Available
--------------------------------------------------------------------------------
/dir1                                      1.0MB       80%      1.0MB   0Bytes

root@pranith-vm2 - /mnt/r2/dir1
03:36:28 :) ⚡ gluster volume info

Volume Name: r2
Type: Distributed-Replicate
Volume ID: c2f28d51-ca0c-4e3a-b1bb-16d39ea74fa7
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.42.237:/brick/r2_0
Brick2: 10.70.42.237:/brick/r2_1
Brick3: 10.70.42.237:/brick/r2_2
Brick4: 10.70.42.237:/brick/r2_3
Brick5: 10.70.42.237:/brick/r2_4
Brick6: 10.70.42.237:/brick/r2_5
Brick7: 10.70.42.237:/brick/r2_6
Brick8: 10.70.42.237:/brick/r2_7
Brick9: 10.70.42.237:/brick/r2_8
Brick10: 10.70.42.237:/brick/r2_9
Brick11: 10.70.42.237:/brick/r2_10
Brick12: 10.70.42.237:/brick/r2_11
Options Reconfigured:
features.soft-timeout: 0
features.hard-timeout: 0
features.quota: on

We need to document this in the Admin Guide to set the right expectation for the user.
Based on my discussion with Pranith and KP, the required information has been added as a note in the Admin Guide, section 12.5:
http://documentation-devel.engineering.redhat.com/docs/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/ch12s05.html
After a conversation with Du and Saurabh, I am updating the doc text for the known issue.
Moving the known issues to the Doc team, to be documented in the release notes for U1.
I've documented this as a known issue in the BB U1 Release Notes. Here is the link: http://documentation-devel.engineering.redhat.com/docs/en-US/Red_Hat_Storage/2.1/html/2.1_Update_1_Release_Notes/chap-Documentation-2.1_Update_1_Release_Notes-Known_Issues.html