Bug 985203 - quota: xattrs corrupted
Summary: quota: xattrs corrupted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: vpshastry
QA Contact: Saurabh
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-07-17 05:47 UTC by Saurabh
Modified: 2016-01-19 06:12 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.4.0.34rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-10-16 03:47:39 UTC
Embargoed:


Attachments

Description Saurabh 2013-07-17 05:47:16 UTC
Description of problem:
Created data of around 10000 dirs, each dir having 100 subdirs with 1 file in each of the subdirs.

The quota list reports the used space as "zero".

Later found that the xattrs are corrupted on one of the nodes.

Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 56c96b65-b18a-4a1c-95fa-e82be6d2f0a0
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.37.180:/rhs/bricks/d1r1
Brick2: 10.70.37.80:/rhs/bricks/d1r2
Brick3: 10.70.37.216:/rhs/bricks/d2r1
Brick4: 10.70.37.139:/rhs/bricks/d2r2
Brick5: 10.70.37.180:/rhs/bricks/d3r1
Brick6: 10.70.37.80:/rhs/bricks/d3r2
Brick7: 10.70.37.216:/rhs/bricks/d4r1
Brick8: 10.70.37.139:/rhs/bricks/d4r2
Brick9: 10.70.37.180:/rhs/bricks/d5r1
Brick10: 10.70.37.80:/rhs/bricks/d5r2
Brick11: 10.70.37.216:/rhs/bricks/d6r1
Brick12: 10.70.37.139:/rhs/bricks/d6r2
Options Reconfigured:
features.quota: on


[root@nfs1 ~]# gluster volume status
Volume santosh-dist-rep is not started
 
Status of volume: dist-rep
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.180:/rhs/bricks/d1r1                     49159   Y       3312
Brick 10.70.37.80:/rhs/bricks/d1r2                      49159   Y       760
Brick 10.70.37.216:/rhs/bricks/d2r1                     49159   Y       19955
Brick 10.70.37.139:/rhs/bricks/d2r2                     49160   Y       1024
Brick 10.70.37.180:/rhs/bricks/d3r1                     49160   Y       3323
Brick 10.70.37.80:/rhs/bricks/d3r2                      49160   Y       772
Brick 10.70.37.216:/rhs/bricks/d4r1                     49160   Y       19966
Brick 10.70.37.139:/rhs/bricks/d4r2                     49161   Y       1035
Brick 10.70.37.180:/rhs/bricks/d5r1                     49161   Y       3334
Brick 10.70.37.80:/rhs/bricks/d5r2                      49161   Y       784
Brick 10.70.37.216:/rhs/bricks/d6r1                     49161   Y       19977
Brick 10.70.37.139:/rhs/bricks/d6r2                     49162   Y       1048
NFS Server on localhost                                 2049    Y       3346
Self-heal Daemon on localhost                           N/A     Y       3355
NFS Server on 10.70.37.139                              2049    Y       1061
Self-heal Daemon on 10.70.37.139                        N/A     Y       1068
NFS Server on 10.70.37.216                              2049    Y       19989
Self-heal Daemon on 10.70.37.216                        N/A     Y       19995
NFS Server on 10.70.37.80                               2049    Y       796
Self-heal Daemon on 10.70.37.80                         N/A     Y       803
 
There are no active volume tasks


Pointing to the .cmd_log_history:

[2013-07-16 01:17:01.964485]  : volume stop dist-rep : SUCCESS
[2013-07-16 01:17:11.437215]  : volume delete dist-rep : SUCCESS
[2013-07-16 01:18:39.591012]  : volume create dist-rep replica 2 10.70.37.180:/rhs/bricks/d1r1 10.70.37.80:/rhs/bricks/d1r2 10.70.37.216:/rhs/bricks/d2r1 10.70.37.139:/rhs/bricks/d2r2 10.70.37.180:/rhs/bricks/d3r1 10.70.37.80:/rhs/bricks/d3r2 10.70.37.216:/rhs/bricks/d4r1 10.70.37.139:/rhs/bricks/d4r2 10.70.37.180:/rhs/bricks/d5r1 10.70.37.80:/rhs/bricks/d5r2 10.70.37.216:/rhs/bricks/d6r1 10.70.37.139:/rhs/bricks/d6r2 : SUCCESS
[2013-07-16 01:19:45.523649]  : volume start dist-rep : SUCCESS
[2013-07-16 01:20:02.942292]  : volume status dist-rep : SUCCESS
[2013-07-16 01:20:44.614904]  : volume quota dist-rep limit-usage / 50GB : SUCCESS
[2013-07-16 01:20:50.237246]  : volume quota dist-rep enable : SUCCESS
[2013-07-16 01:20:51.845526]  : volume quota dist-rep limit-usage / 50GB : SUCCESS
[2013-07-16 01:20:55.914705]  : volume quota dist-rep list : SUCCESS
[2013-07-16 01:21:06.478489]  : volume quota dist-rep list : SUCCESS
[2013-07-16 01:54:56.155853]  : volume quota dist-rep list : SUCCESS
[2013-07-16 18:08:41.988000]  : volume status dist-rep : SUCCESS
[2013-07-16 18:09:07.594404]  : volume quota dist-rep list : SUCCESS
[2013-07-16 18:09:13.136238]  : volume quota dist-rep list : SUCCESS
[2013-07-16 18:09:55.717328]  : volume quota dist-rep list : SUCCESS
[2013-07-16 18:10:10.031726]  : volume quota dist-rep list : SUCCESS
[2013-07-16 18:13:36.136602]  : volume quota dist-rep list : SUCCESS


Version-Release number of selected component (if applicable):
[root@nfs1 ~]# rpm -qa | grep glusterfs
glusterfs-3.4.0.12rhs.beta4-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.12rhs.beta4-1.el6rhs.x86_64
glusterfs-server-3.4.0.12rhs.beta4-1.el6rhs.x86_64

How reproducible:
Tried to create this kind of data and found this issue.

Steps to Reproduce:
1. create a volume, start it 
2. enable quota on it
3. set limit of 50 GB
4. mount volume over nfs
5. start creating data using data-create.py
[root@rhel6 ~]# cat data-create.py 
#!/usr/bin/python
# Creates top-level dirs numbered 88..9999 on the NFS mount, each with 99
# numbered subdirs containing one empty file, to exercise quota accounting.

import os
import commands

def create_data(mount_path_nfs):
    for i in range(88, 10000):
        top_dir = os.path.join(mount_path_nfs, "%d" % i)
        os.mkdir(top_dir)
        for j in range(1, 100):
            sub_dir = os.path.join(top_dir, "%d" % j)
            os.mkdir(sub_dir)
            # create an empty file in each subdir
            commands.getoutput("touch " + os.path.join(sub_dir, "%d.file" % j))

def main():
    mount_path_nfs = "/mnt/nfs-test"
    create_data(mount_path_nfs)

if __name__ == "__main__":
    main()
[root@rhel6 ~]# 
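
As a quick cross-check before comparing against the gluster volume quota list output, the generated tree can be counted from the client mount. A minimal sketch (not part of the original report), assuming the same /mnt/nfs-test mount point used by data-create.py:

#!/usr/bin/python
# Hedged sketch: count what data-create.py produced, as seen from the client
# mount (assumed to be /mnt/nfs-test, as in the script above).

import os

def count_tree(mount_path_nfs):
    dirs = files = 0
    for _root, dirnames, filenames in os.walk(mount_path_nfs):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files

if __name__ == "__main__":
    print("dirs=%d files=%d" % count_tree("/mnt/nfs-test"))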

Actual results:

[root@nfs1 ~]# gluster volume quota dist-rep list
                  Path                   Hard-limit Soft-limit   Used  Available
--------------------------------------------------------------------------------
/                                           50GB       90%      0Bytes  50.0GB
[root@nfs1 ~]# 


Result from the node with the corrupted xattrs:

[root@nfs2 bricks]# cd
[root@nfs2 ~]# getfattr -m . -d /rhs/bricks/d
d1r2/ d3r2/ d5r2/ dr/   
[root@nfs2 ~]# getfattr -m . -d /rhs/bricks/d1r2/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d1r2/
security.selinux="unconfined_u:object_r:file_t:s0"
trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
trusted.glusterfs.dht=0sAAAAAQAAAABVVVVUf////Q==
trusted.glusterfs.quota.dirty=0sMAA=
trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
trusted.glusterfs.volume-id=0sVslrZbGKShyV+ugr5tLwoA==

[root@nfs2 ~]# getfattr -m . -d /rhs/bricks/d13r2/
getfattr: /rhs/bricks/d13r2/: No such file or directory
[root@nfs2 ~]# getfattr -m . -d /rhs/bricks/d3r2/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d3r2/
security.selinux="unconfined_u:object_r:file_t:s0"
trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
trusted.glusterfs.dht=0sAAAAAQAAAACqqqqo1VVVUQ==
trusted.glusterfs.quota.dirty=0sMAA=
trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
trusted.glusterfs.volume-id=0sVslrZbGKShyV+ugr5tLwoA==

[root@nfs2 ~]# getfattr -m . -d /rhs/bricks/d5r2/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/d5r2/
security.selinux="unconfined_u:object_r:file_t:s0"
trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
trusted.glusterfs.dht=0sAAAAAQAAAAAAAAAAKqqqqQ==
trusted.glusterfs.quota.dirty=0sMAA=
trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
trusted.glusterfs.volume-id=0sVslrZbGKShyV+ugr5tLwoA==

[root@nfs2 ~]# 
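
For reference, the values printed by getfattr above with a leading "0s" are base64-encoded. A minimal decoding sketch (not part of the original report), assuming trusted.glusterfs.quota.size is a single 64-bit network-byte-order byte count and trusted.gfid is a 16-byte UUID, as in glusterfs 3.4:

#!/usr/bin/python
# Hedged sketch: decode the base64 ("0s"-prefixed) xattr values shown above.

import base64
import struct
import uuid

def decode_xattr(value):
    # getfattr prints base64-encoded values with a leading "0s" marker
    return base64.b64decode(value[2:])

quota_size = decode_xattr("0sAAAAAAAAAAA=")
gfid       = decode_xattr("0sAAAAAAAAAAAAAAAAAAAAAQ==")

print("quota.size = %d bytes" % struct.unpack(">Q", quota_size)[0])  # 0
print("gfid       = %s" % uuid.UUID(bytes=gfid))  # 00000000-0000-0000-0000-000000000001

Decoded this way, trusted.glusterfs.quota.size on each of the listed bricks comes out to 0 bytes, consistent with the 0Bytes reported by quota list.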


Expected results:
Corruption of xattrs is unacceptable.

Additional info:

Comment 3 vpshastry 2013-09-04 07:36:57 UTC
I couldn't reproduce this. Can you verify this with the current build v3.4.0qa8?

Comment 6 Scott Haines 2013-10-23 03:07:35 UTC
Since the problem described in this bug report should be resolved in a recent
advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow
the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

