Description of problem:
In DHT, if one or more bricks are down when a directory is created, then when those bricks come back up a lookup on the directory should self-heal it on the previously down brick. Right now the directory itself is healed, but not all quota-related xattrs are healed. When the directory is created by self-heal on the previously down brick, it has the following quota xattrs:
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri
trusted.glusterfs.quota.dirty
trusted.glusterfs.quota.size
The trusted.glusterfs.quota.limit-set xattr is missing. As a result, after rebalance this directory has a layout but no quota limit set, so it will violate the quota limit.

Bug 1002885 might depend on this bug (not sure, but it seems so).

Version-Release number of selected component (if applicable):
Big Bend RHS ISO + glusterfs-*-3.4.0.35rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a DHT volume, fuse mount it, and enable quota.
[root@rhs-client22 rpm]# mount -t glusterfs 10.70.37.204:/xattr /mnt/xattr
# gluster volume quota xattr enable
2. Bring one or more subvolumes down by killing the brick process, then create a directory from the mount point.
[root@rhs-client22 xattr]# mkdir down
3. Set a quota limit on the newly created directory.
# gluster volume quota xattr limit-usage /down 50MB 50%
4. Bring all bricks back up with gluster volume start <volname> force.
5. Perform a lookup on the newly created directory from the mount point, which should self-heal the directory on all bricks.
6. Check the xattrs for that directory on the previously down brick.

Down brick:
[root@4VM5 rpm]# getfattr -d -m . -e hex /rhs/brick1/x1/down
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/x1/down
trusted.gfid=0x48f5e5bcd0a9417aa53723fcedec6893
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000000000000

Up brick:
# file: rhs/brick1/x2/down
trusted.gfid=0x48f5e5bcd0a9417aa53723fcedec6893
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000032000000000000000000032   <--------
trusted.glusterfs.quota.size=0x0000000000000000

Actual results:
Not all quota-related xattrs are healed; trusted.glusterfs.quota.limit-set is not healed. As a result, after rebalance this directory has a layout but no quota limit set, so it will violate the quota limit.

Expected results:
All xattrs should be copied.

Additional info:
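For step 6 above, a minimal verification sketch (assuming the brick paths used in this setup, /rhs/brick1/x1 and /rhs/brick1/x2; adjust the paths for other layouts). Run it on the brick server after the lookup-triggered heal; it dumps only the quota xattrs on each local brick so a missing trusted.glusterfs.quota.limit-set stands out:

# loop over the two local brick roots and show only quota xattrs on the healed directory
for brick in /rhs/brick1/x1 /rhs/brick1/x2; do
    echo "== $brick/down =="
    getfattr -d -m 'trusted.glusterfs.quota' -e hex "$brick/down" 2>/dev/null
done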
After discussing the issue with Shishir, we came to the conclusion that a plain distribute setup never guaranteed data availability when one of the bricks is down, because DHT does not keep track of which subvolume is the 'source' and which subvolume is stale. The quota xattrs other than 'quota-limit' are not 'healed' by DHT; they are created by the quota/marker xlators.
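Given that DHT will not heal trusted.glusterfs.quota.limit-set on its own, a possible interim workaround (an untested suggestion, not a verified fix): once the previously down bricks are back up and the directory has been healed, re-apply the same limit from the CLI so the quota layer writes the limit-set xattr again, for example:

# re-apply the limit that was set while the brick was down
gluster volume quota <volname> limit-usage /down 50MB 50%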
Reason for reopening the same bug: I think the root cause is the same (DHT is not healing the xattr), so the same bug also occurs on dist-rep volumes. Since the bug is now against a dist-rep volume, the sequence of actions stays the same with a few modifications (bringing one replica pair down instead of one brick), so the steps are written out again below.

Description of problem:
In a dist-rep volume, if one or more replica pairs are down when the user creates a directory, then when those bricks come back up a lookup on the directory should self-heal it on the previously down replica pair. Right now the directory itself is healed, but not all quota-related xattrs are healed. When the directory is created by self-heal on the previously down bricks (replica pair), it has the following quota xattrs:
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri
trusted.glusterfs.quota.dirty
trusted.glusterfs.quota.size
The trusted.glusterfs.quota.limit-set xattr is missing. As a result, after rebalance this directory has a layout but no quota limit set, so it will violate the quota limit.

Bug 1002885 might depend on this bug (not sure, but it seems so). In bug 1002885 we add a new replica pair, and self-heal should create the directory and its xattrs; since it does not, the xattr is still missing even after rebalance.

Version-Release number of selected component (if applicable):
Big Bend RHS ISO + glusterfs-*-3.4.0.35rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a dist-rep volume (2x2), fuse mount it, and enable quota.
[root@rhs-client22 rpm]# mount -t glusterfs 10.70.37.204:/dist-rept /mnt/dist-rept
# gluster volume quota dist-rept enable
2. Bring one replica pair down by killing the brick processes:
[root@4VM5 rpm]# gluster volume status dist-rept
Status of volume: dist-rept
Gluster process                              Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.204:/rhs/brick1/r1            49160   Y       1078
Brick 10.70.37.142:/rhs/brick1/r1            49159   Y       872
Brick 10.70.37.204:/rhs/brick1/r2            49162   Y       1089
Brick 10.70.37.142:/rhs/brick1/r2            49160   Y       883
NFS Server on localhost                      2049    Y       1101
Self-heal Daemon on localhost                N/A     Y       1110
Quota Daemon on localhost                    N/A     Y       1377
NFS Server on 10.70.37.142                   2049    Y       896
Self-heal Daemon on 10.70.37.142             N/A     Y       904
Quota Daemon on 10.70.37.142                 N/A     Y       1188

There are no active volume tasks

Server 1:
[root@4VM5 rpm]# kill -9 1078
Server 2:
[root@4VM6 rpm]# kill -9 872

Then create a directory from the mount point:
[root@rhs-client22 dist-rep]# cd /mnt/dist-rept
[root@rhs-client22 dist-rept]# mkdir down
3. Set a quota limit on the newly created directory:
[root@4VM6 rpm]# gluster volume quota dist-rept limit-usage /down 50MB 50%
volume quota : success
4. Bring all bricks back up with gluster volume start <volname> force:
[root@4VM5 rpm]# gluster volume start dist-rept force
volume start: dist-rept: success
5. Perform a lookup on the newly created directory from the mount point, which should self-heal the directory on all bricks.
6. Also run the heal command for the volume:
[root@4VM5 rpm]# gluster volume heal dist-rept
Launching heal operation to perform index self heal on volume dist-rept has been successful
Use heal info commands to check status
And a full heal as well:
[root@4VM5 rpm]# gluster volume heal dist-rept full
Launching heal operation to perform full self heal on volume dist-rept has been successful
Use heal info commands to check status
7. Check the xattrs for the directory on the previously down replica pair.

Down bricks / down replica pair:
[root@4VM6 rpm]# getfattr -d -m . -e hex /rhs/brick1/r1/down
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/r1/down
trusted.afr.dist-rept-client-0=0x000000000000000000000000
trusted.afr.dist-rept-client-1=0x000000000000000000000000
trusted.gfid=0x010f34d3b5104a25b516d050adc6d01b
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000000000000

[root@4VM5 rpm]# getfattr -d -m . -e hex /rhs/brick1/r1/down
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/r1/down
trusted.afr.dist-rept-client-0=0x000000000000000000000000
trusted.afr.dist-rept-client-1=0x000000000000000000000000
trusted.gfid=0x010f34d3b5104a25b516d050adc6d01b
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000000000000

Up bricks / up replica pair:
[root@4VM6 rpm]# getfattr -d -m . -e hex /rhs/brick1/r2/down
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/r2/down
trusted.gfid=0x010f34d3b5104a25b516d050adc6d01b
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000032000000000000000000032
trusted.glusterfs.quota.size=0x0000000000000000

[root@4VM5 rpm]# getfattr -d -m . -e hex /rhs/brick1/r2/down
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/r2/down
trusted.gfid=0x010f34d3b5104a25b516d050adc6d01b
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x00000000032000000000000000000032
trusted.glusterfs.quota.size=0x0000000000000000

Actual results:
Not all quota-related xattrs are healed; trusted.glusterfs.quota.limit-set is not healed.

Expected results:
All xattrs should be copied.
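For step 7, a short check sketch (assuming the brick paths from this setup; run it on each of the two servers). It flags any local brick where the limit-set xattr is absent after healing:

# print MISSING for bricks where the quota limit xattr was not healed onto the directory
for brick in /rhs/brick1/r1 /rhs/brick1/r2; do
    if getfattr -n trusted.glusterfs.quota.limit-set "$brick/down" >/dev/null 2>&1; then
        echo "$brick/down: limit-set present"
    else
        echo "$brick/down: limit-set MISSING"
    fi
done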
Pranith, can you please review the doc text for technical accuracy?
Moving the known issue to the Doc team, to be documented in the release notes for U1.
This is documented as a known issue in the Big Bend Update 1 Release Notes. Here is the link: http://documentation-devel.engineering.redhat.com/docs/en-US/Red_Hat_Storage/2.1/html/2.1_Update_1_Release_Notes/chap-Documentation-2.1_Update_1_Release_Notes-Known_Issues.html