+++ This bug was initially created as a clone of Bug #1286036 +++
+++ This bug was initially created as a clone of Bug #863100 +++

Description of problem:
DHT: inconsistent custom extended attributes, uid/gid, and access permissions (for directories) if the user sets or modifies them after bringing one or more sub-volumes down.

Version-Release number of selected component (if applicable):
3.3.0.3rhs-32.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed volume having 3 or more sub-volumes on multiple servers and start that volume.
2. FUSE-mount the volume from client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>".
3. From the mount point, create some directories and some files inside them.
4. Bring one sub-volume down:

[root@Rhs3 t1]# gluster volume status test
Status of volume: test
Gluster process                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.81:/home/t1              24009   Y       18564
Brick 10.70.35.85:/home/t1              24211   Y       16174
Brick 10.70.35.86:/home/t1              24212   Y       2360
NFS Server on localhost                 38467   Y       2366
NFS Server on 10.70.35.81               38467   Y       12929
NFS Server on 10.70.35.85               38467   Y       10226
[root@Rhs3 t1]# kill -9 2360
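Steps 1-4 can be consolidated into a script along the following lines. This is a hypothetical sketch, not taken from the report: the volume name, brick hosts, and brick path mirror the status output above, while the mount source, mount point, and directory names are assumptions.

```shell
# Sketch of the reproduction setup (hypothetical; hosts and paths
# mirror the "gluster volume status" output in this report).

# On one of the servers: create and start a 3-brick distributed volume.
gluster volume create test \
    10.70.35.81:/home/t1 10.70.35.85:/home/t1 10.70.35.86:/home/t1
gluster volume start test

# On the client: FUSE-mount the volume and create test directories.
mount -t glusterfs 10.70.35.81:/test /mnt/test
mkdir /mnt/test/d1 /mnt/test/d2

# On server 10.70.35.86: kill the brick process to take one
# sub-volume down (PID 2360 in the status output above).
kill -9 2360
```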
5. Custom extended attribute: from the mount point, set a custom attribute on a directory and verify it on all servers and on the client.

client:
[root@client test]# setfattr -n user.foo -v bar2 d1
[root@client test]# getfattr -n user.foo d1
# file: d1
user.foo="bar2"

server1:
[root@Rhs1 t1]# getfattr -n user.foo d1
# file: d1
user.foo="bar2"

server2:
[root@Rhs2 t1]# getfattr -n user.foo d1
# file: d1
user.foo="bar2"

server3:
[root@Rhs3 t1]# getfattr -n user.foo d1
d1: user.foo: No such attribute

6. From the mount point, verify the owner and group of the directory, then modify them.

[root@client test]# stat d1
  File: `d1'
  Size: 12   Blocks: 2   IO Block: 131072   directory
Device: 15h/21d   Inode: 10442536925251715313   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:57.006871636 +0530
Modify: 2012-10-04 12:10:57.006871636 +0530
Change: 2012-10-04 12:10:57.007864913 +0530
[root@client test]# chown u1 d1
[root@client test]# chgrp t1 d1
[root@client test]# stat d1
  File: `d1'
  Size: 12   Blocks: 2   IO Block: 131072   directory
Device: 15h/21d   Inode: 10442536925251715313   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (  500/      u1)   Gid: (  500/      t1)
Access: 2012-10-04 12:10:57.006871636 +0530
Modify: 2012-10-04 12:10:57.006871636 +0530
Change: 2012-10-04 12:13:05.168865621 +0530

7. Verify that the change is reflected on all sub-volumes except the one that is down.

server1:
[root@Rhs1 t1]# stat d1
  File: `d1'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 403116740   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (  500/ UNKNOWN)   Gid: (  500/ UNKNOWN)
Access: 2012-10-04 12:10:57.006871636 +0530
Modify: 2012-10-04 12:10:57.006871636 +0530
Change: 2012-10-04 12:13:05.168865621 +0530

server2:
[root@Rhs2 t1]# stat d1
  File: `d1'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 134423062   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (  500/ UNKNOWN)   Gid: (  500/ UNKNOWN)
Access: 2012-10-04 12:10:56.807630951 +0530
Modify: 2012-10-04 12:10:56.807630951 +0530
Change: 2012-10-04 12:13:04.970323409 +0530

server3:
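The per-server checks in steps 5-7 can be gathered into one loop. This is a sketch under stated assumptions: it presumes passwordless ssh to the brick hosts and uses the brick path /home/t1 from the report; it is not part of the original procedure.

```shell
# Compare the custom xattr and ownership of d1 as stored on every
# brick (assumes passwordless ssh to the brick hosts; brick path
# /home/t1 is taken from the report).
for host in 10.70.35.81 10.70.35.85 10.70.35.86; do
    echo "== $host =="
    ssh "$host" "getfattr --absolute-names -n user.foo /home/t1/d1; \
                 stat -c 'mode=%a uid=%u gid=%g' /home/t1/d1"
done
```

A brick whose glusterfsd was down during the setfattr/chown will report the stale values, matching the server3 output shown above.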
[root@Rhs3 t1]# stat d1
  File: `d1'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 402655089   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:51.424531338 +0530
Modify: 2012-10-04 12:10:51.424531338 +0530
Change: 2012-10-04 12:10:51.610007436 +0530

8. From the mount point, verify the directory permissions and modify them.

client:
[root@client test]# stat d2
  File: `d2'
  Size: 12   Blocks: 2   IO Block: 131072   directory
Device: 15h/21d   Inode: 9860248238918728119   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 12:10:57.282865329 +0530
[root@client test]# chmod 444 d2
[root@client test]# stat d2
  File: `d2'
  Size: 12   Blocks: 2   IO Block: 131072   directory
Device: 15h/21d   Inode: 9860248238918728119   Links: 2
Access: (0444/dr--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 15:39:42.509942805 +0530
9. Verify that the change is reflected on all sub-volumes except the one that is down.

server1:
[root@Rhs1 t1]# stat d2
  File: `d2'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 134694359   Links: 2
Access: (0444/dr--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 15:39:42.509942805 +0530

server2:
[root@Rhs1 t1]# stat d2
  File: `d2'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 134694359   Links: 2
Access: (0444/dr--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 15:39:42.509942805 +0530

server3:
[root@Rhs3 t1]# stat d1
  File: `d1'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 402655089   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:51.424531338 +0530
Modify: 2012-10-04 12:10:51.424531338 +0530
Change: 2012-10-04 12:10:51.610007436 +0530

10. Now bring all sub-volumes up and perform a lookup from the client. Verify the updated custom extended attributes, uid/gid, and access permissions from the client.

11. Verify the access permissions on the sub-volume that was previously down.

[root@Rhs3 t1]# stat d2
  File: `d2'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 134219638   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:51.700461206 +0530
Modify: 2012-10-04 12:10:51.700461206 +0530
Change: 2012-10-04 12:10:51.759989493 +0530

Verify the uid and gid on the sub-volume that was previously down:

[root@Rhs3 t1]# stat d1
  File: `d1'
  Size: 6   Blocks: 8   IO Block: 4096   directory
Device: fc05h/64517d   Inode: 402655089   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-10-04 12:10:51.424531338 +0530
Modify: 2012-10-04 12:10:51.424531338 +0530
Change: 2012-10-04 12:10:51.610007436 +0530

Verify the custom extended attributes for that directory.
server3:
[root@Rhs3 t1]# getfattr -n user.foo d1
d1: user.foo: No such attribute

Actual results:
The mount point shows the modified values, but the values on the sub-volumes are inconsistent.

Expected results:
Once the sub-volume comes back up, it should update the values for custom extended attributes, uid/gid, and access permissions (for directories).

Additional info:

--- Additional comment from RHEL Product and Program Management on 2012-10-04 17:55:42 MVT ---

Since this issue was entered in Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from shishir gowda on 2012-10-19 09:21:57 MVT ---

There are multiple issues in this bug.

1. For user-related xattrs, we cannot handle healing in DHT, as we would not be able to identify the correct copy. The workaround for this is a subsequent setxattr for the same key, which will fix the xattr mismatch.
2. UID/GID: a fix is in progress (862967).
3. Mismatching permissions: will investigate and respond.

--- Additional comment from shishir gowda on 2012-10-23 14:07:48 MVT ---

Fix @ https://code.engineering.redhat.com/gerrit/#/c/150/

--- Additional comment from Amar Tumballi on 2013-02-15 17:11:59 MVT ---

https://code.engineering.redhat.com/gerrit/#/c/1895/

--- Additional comment from Rachana Patel on 2013-03-19 18:09:46 MVT ---

(In reply to comment #2)
> There are multiple issues in this bug.
> 1. For user related xattrs, we can not handle healing as in dht, we would
> not be able to identify the correct copy. The work around for this is a
> subsequent setxattr for the same key, which will fix the xattr mis-match
> 2. UID/GID: A fix is in progress (862967)
> 3. mis-matching permission: Will investigate it respond back

1. If that is the case, it should be documented.
2. This depends on bug 862967, whose fixed-in version is glusterfs-3.4.0qa5. Is the same fix available in the latest build?
3. Could you please give an update on the third issue? What was the decision?
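The workaround described in comment 2 for user xattrs (a subsequent setxattr for the same key, issued once all bricks are back up) would look like this from the mount point; the directory and value are the ones used in the reproduction steps above.

```shell
# From the client mount point, after all bricks are back up:
# re-issue the setxattr so DHT propagates the value to every
# sub-volume, including the one that missed the original call.
setfattr -n user.foo -v bar2 d1

# Re-check; the value should now be consistent across bricks.
getfattr -n user.foo d1
```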
--- Additional comment from Scott Haines on 2013-09-27 22:07:27 MVT ---

Targeting for 3.0.0 (Denali) release.

--- Additional comment from errata-xmlrpc on 2014-04-10 05:20:34 MVT ---

This bug has been dropped from advisory RHEA-2014:17485 by Scott Haines (shaines).

--- Additional comment from Susant Kumar Palai on 2014-05-22 11:04:45 MVT ---

Here is an observation on a side effect of the upstream patch http://review.gluster.org/#/c/6983/.

The patch works for all cases except one corner case. Currently we take the "permission info" for healing from a brick only if that brick has a layout xattr. Say we add a new brick to a volume (a newly added brick will not have a layout xattr) and every brick except the newly added one goes down. Then we change the permissions of the root directory, so only the new brick has witnessed the new permissions. If we now bring all the bricks back up, the older permissions will be healed across all bricks, because we do not take permission info from a brick that has no layout xattr.

Here is the demo:

        brick1                      brick2 (newly added)
        initial permission: 755     755
t0      UP                          ADDED BRICK
t1      CHILD_DOWN
t2                                  permission changed to, say, 777 on the mount point
t3      CHILD_UP
t4      755 healed to all bricks, as only brick1 has a layout xattr
t5      755                         755   -----> should have been 777

$ Why does it help that the root of a newly added brick has no layout xattr?

# If we assigned a layout to the root of the newly added brick, then, since it would have the latest ctime, it could corrupt the permissions for the volume. Example:

        brick1         brick2         brick3 (newly added)
t0      777 (perm*)    777 (perm*)    added as a brick; gets perm 755 by default
t1      if brick3 had a layout xattr and a higher ctime, 755 would be healed to all bricks
t2      755            755            755   (final permission 755 instead of 777 -- bad)

Hence, creating a zeroed layout for the root would create the above problem.

$ Why was healing on the revalidate path chosen?

# The reason we don't do metadata healing on the fresh-lookup path is that directory self-heal is carried out on the lookup path.
Hence, once directory self-heal has run on a fresh lookup, we follow revalidate_cbk, and we would not be able to heal the permissions for that directory there. So healing on the revalidate path was chosen.

--- Additional comment from RHEL Product and Program Management on 2014-05-26 14:01:14 MVT ---

This bug report previously had all acks and the release flag approved. However, since at least one of its acks has been changed, the release flag has been reset to ? by the bugbot (pm-rhel). The ack needs to become approved before the release flag can become approved again.

--- Additional comment from Scott Haines on 2014-06-08 23:35:12 MVT ---

Per engineering management on 06/06/2014, moving back to the rhs-future backlog.

--- Additional comment from errata-xmlrpc on 2014-06-20 14:20:27 MVT ---

This bug has been dropped from advisory RHEA-2014:17485 by Vivek Agarwal (vagarwal).

--- Additional comment from Nagaprasad Sathyanarayana on 2015-03-26 17:32:49 MVT ---

After triage, all leads agreed that this BZ cannot be fixed for the 3.1.0 release.

--- Additional comment from John Skeoch on 2015-04-20 05:22:54 MVT ---

User racpatel's account has been closed.

--- Additional comment from John Skeoch on 2015-04-20 05:25:23 MVT ---

User racpatel's account has been closed.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-11-27 05:15:34 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'. If this bug should be proposed for a different release, please manually change the proposed release flag.
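As a side note on Susant's analysis above: the heal logic keys off whether a brick root carries a DHT layout xattr, and that can be inspected directly on a brick. This is a sketch; it assumes root access on the brick host and the brick path /home/t1 from this report (trusted.* xattrs are only visible to root).

```shell
# On a brick host, as root: dump the DHT layout xattr of the brick
# root, if present. trusted.glusterfs.dht is the key DHT stores its
# layout under; a newly added brick root will show no such entry.
getfattr --absolute-names -m trusted.glusterfs.dht -d -e hex /home/t1
```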
--- Additional comment from Atin Mukherjee on 2017-02-23 09:11:17 EST ---

Upstream patch: http://review.gluster.org/15468

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-03-02 02:27:03 EST ---

Since this bug has been approved for the RHGS 3.3.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.3.0+' and through the Internal Whiteboard entry of '3.3.0', the Target Release is being automatically set to 'RHGS 3.3.0'.

--- Additional comment from Atin Mukherjee on 2017-06-28 06:36:48 EDT ---

At the rhgs-3.3.0 pre-devel-freeze status meeting, all stakeholders agreed to defer this bug from the rhgs-3.3.0 release. More details at http://post-office.corp.redhat.com/archives/gluster-storage-release-team/2017-June/msg00123.html
REVIEW: https://review.gluster.org/19157 (cluster/dht : User xattrs are not healed after brick stop/start) posted (#1) for review on release-3.12 by MOHIT AGRAWAL
The fix has been included in GlusterFS 3.13 and later releases. This bug is being closed because the patch that fixes the issue is large and does not qualify for backporting to a stable release.