DHT: inconsistent custom extended attributes, UID/GID, and access permissions (for directories) when the user sets/modifies them after bringing one or more sub-volumes down
+++ This bug was initially created as a clone of Bug #1286036 +++
+++ This bug was initially created as a clone of Bug #863100 +++
Description of problem:
DHT: inconsistent custom extended attributes, UID/GID, and access permissions (for directories) when the user sets/modifies them after bringing one or more sub-volumes down
Version-Release number of selected component (if applicable):
3.3.0.3rhs-32.el6rhs.x86_64
How reproducible:
always
Steps to Reproduce:
1. Create a distributed volume with 3 or more sub-volumes across multiple servers and start that volume.
2. FUSE mount the volume on client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>".
3. From the mount point, create some directories and files inside it (a minimal sketch follows).
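A minimal sketch of step 3, assuming the mount point is /mnt/test on client-1 and using the directory names d1 and d2 that appear in the later steps (the file names are hypothetical):
cd /mnt/test          # hypothetical mount point of the "test" volume
mkdir d1 d2           # directories used in the verification steps below
touch d1/f1 d2/f1     # a few plain files inside each directory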
4. Bring one sub-volume down:
[root@Rhs3 t1]# gluster volume status test
Status of volume: test
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.81:/home/t1                      24009   Y       18564
Brick 10.70.35.85:/home/t1                      24211   Y       16174
Brick 10.70.35.86:/home/t1                      24212   Y       2360
NFS Server on localhost                         38467   Y       2366
NFS Server on 10.70.35.81                       38467   Y       12929
NFS Server on 10.70.35.85                       38467   Y       10226
[root@Rhs3 t1]# kill -9 2360
5. Custom extended attribute: from the mount point, set a custom attribute on a directory and verify it on all servers.
client
[root@client test]# setfattr -n user.foo -v bar2 d1
[root@client test]# getfattr -n user.foo d1
# file: d1
user.foo="bar2"
server1:-
[root@Rhs1 t1]# getfattr -n user.foo d1
# file: d1
user.foo="bar2"
server2:-
[root@Rhs2 t1]# getfattr -n user.foo d1
# file: d1
user.foo="bar2"
server3:-
[root@Rhs3 t1]# getfattr -n user.foo d1
d1: user.foo: No such attribute
6. From the mount point, verify the owner and group of the directory, then modify them:
[root@client test]# stat d1
File: `d1'
Size: 12 Blocks: 2 IO Block: 131072 directory
Device: 15h/21d Inode: 10442536925251715313 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:57.006871636 +0530
Modify: 2012-10-04 12:10:57.006871636 +0530
Change: 2012-10-04 12:10:57.007864913 +0530
[root@client test]# chown u1 d1
[root@client test]# chgrp t1 d1
[root@client test]# stat d1
File: `d1'
Size: 12 Blocks: 2 IO Block: 131072 directory
Device: 15h/21d Inode: 10442536925251715313 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 500/ u1) Gid: ( 500/ t1)
Access: 2012-10-04 12:10:57.006871636 +0530
Modify: 2012-10-04 12:10:57.006871636 +0530
Change: 2012-10-04 12:13:05.168865621 +0530
7. Verify that the change is reflected on all sub-volumes except the one that is down:
server1:-
[root@Rhs1 t1]# stat d1
File: `d1'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 403116740 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 500/ UNKNOWN) Gid: ( 500/ UNKNOWN)
Access: 2012-10-04 12:10:57.006871636 +0530
Modify: 2012-10-04 12:10:57.006871636 +0530
Change: 2012-10-04 12:13:05.168865621 +0530
server2:-
[root@Rhs2 t1]# stat d1
File: `d1'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 134423062 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 500/ UNKNOWN) Gid: ( 500/ UNKNOWN)
Access: 2012-10-04 12:10:56.807630951 +0530
Modify: 2012-10-04 12:10:56.807630951 +0530
Change: 2012-10-04 12:13:04.970323409 +0530
server3:-
[root@Rhs3 t1]# stat d1
File: `d1'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 402655089 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:51.424531338 +0530
Modify: 2012-10-04 12:10:51.424531338 +0530
Change: 2012-10-04 12:10:51.610007436 +0530
8. From the mount point, verify the directory permissions and modify them:
client
[root@client test]# stat d2
File: `d2'
Size: 12 Blocks: 2 IO Block: 131072 directory
Device: 15h/21d Inode: 9860248238918728119 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 12:10:57.282865329 +0530
[root@client test]# chmod 444 d2
[root@client test]# stat d2
File: `d2'
Size: 12 Blocks: 2 IO Block: 131072 directory
Device: 15h/21d Inode: 9860248238918728119 Links: 2
Access: (0444/dr--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 15:39:42.509942805 +0530
9. Verify that the change is reflected on all sub-volumes except the one that is down:
server1
[root@Rhs1 t1]# stat d2
File: `d2'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 134694359 Links: 2
Access: (0444/dr--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 15:39:42.509942805 +0530
server2
[root@Rhs1 t1]# stat d2
File: `d2'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 134694359 Links: 2
Access: (0444/dr--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:57.281276686 +0530
Modify: 2012-10-04 12:10:57.281276686 +0530
Change: 2012-10-04 15:39:42.509942805 +0530
server3:-
[root@Rhs3 t1]# stat d1
File: `d1'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 402655089 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:51.424531338 +0530
Modify: 2012-10-04 12:10:51.424531338 +0530
Change: 2012-10-04 12:10:51.610007436 +0530
10. Now bring all sub-volumes up and perform a lookup from the client (as sketched below). Verify the updated custom extended attributes, UID/GID, and access permissions from the client.
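A minimal sketch of step 10, assuming the volume name "test" from the status output above and the hypothetical mount point /mnt/test; "gluster volume start ... force" restarts offline brick processes, and a recursive listing triggers lookups from the client:
gluster volume start test force     # restarts the brick process that was killed
ls -lR /mnt/test                    # lookups from the mount point touch every directory
getfattr -n user.foo /mnt/test/d1   # re-check the custom xattr from the client
stat /mnt/test/d1 /mnt/test/d2      # re-check UID/GID and permissions from the client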
11. Verify the access permissions on the sub-volume that was previously down:
[root@Rhs3 t1]# stat d2
File: `d2'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 134219638 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:51.700461206 +0530
Modify: 2012-10-04 12:10:51.700461206 +0530
Change: 2012-10-04 12:10:51.759989493 +0530
Verify the UID and GID on the sub-volume that was previously down:
[root@Rhs3 t1]# stat d1
File: `d1'
Size: 6 Blocks: 8 IO Block: 4096 directory
Device: fc05h/64517d Inode: 402655089 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-10-04 12:10:51.424531338 +0530
Modify: 2012-10-04 12:10:51.424531338 +0530
Change: 2012-10-04 12:10:51.610007436 +0530
Verify the custom extended attributes for that directory on the sub-volume that was previously down:
server3:-
[root@Rhs3 t1]# getfattr -n user.foo d1
d1: user.foo: No such attribute
Actual results:
The mount point shows the modified values, but the values on the sub-volumes are not consistent.
Expected results:
Once the sub-volume is back up, it should update the values of custom extended attributes, UID/GID, and access permissions (for directories).
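A hypothetical verification loop for the expected result, assuming the brick path /home/t1 on each server from the status output above:
for h in 10.70.35.81 10.70.35.85 10.70.35.86; do
    echo "== $h =="
    ssh root@$h "stat -c '%a %u:%g %n' /home/t1/d1 /home/t1/d2; getfattr -n user.foo /home/t1/d1"
done
# after healing, every brick should report the same mode, UID/GID and user.foo value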
Additional info:
--- Additional comment from RHEL Product and Program Management on 2012-10-04 17:55:42 MVT ---
Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.
--- Additional comment from shishir gowda on 2012-10-19 09:21:57 MVT ---
There are multiple issues in this bug.
1. For user xattrs we cannot handle healing, because in DHT we would not be able to identify the correct copy. The workaround is a subsequent setxattr for the same key, which fixes the xattr mismatch (see the sketch after this list).
2. UID/GID: A fix is in progress (862967)
3. Mismatching permissions: will investigate and respond back.
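A minimal sketch of the workaround in item 1, assuming the hypothetical mount point /mnt/test and the directory and key from the reproduction steps above:
# re-issue the same setxattr from the mount point once all bricks are up;
# DHT sends the setxattr to every sub-volume, clearing the mismatch
setfattr -n user.foo -v bar2 /mnt/test/d1
# then confirm directly on each brick (brick path as in the steps above)
getfattr -n user.foo /home/t1/d1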
--- Additional comment from shishir gowda on 2012-10-23 14:07:48 MVT ---
Fix @ https://code.engineering.redhat.com/gerrit/#/c/150/
--- Additional comment from Amar Tumballi on 2013-02-15 17:11:59 MVT ---
https://code.engineering.redhat.com/gerrit/#/c/1895/
--- Additional comment from Rachana Patel on 2013-03-19 18:09:46 MVT ---
(In reply to comment #2)
> There are multiple issues in this bug.
> 1. For user related xattrs, we can not handle healing as in dht, we would
> not be able to identify the correct copy. The work around for this is a
> subsequent setxattr for the same key, which will fix the xattr mis-match
>
> 2. UID/GID: A fix is in progress (862967)
> 3. mis-matching permission: Will investigate it respond back
1. If that is the case, then it should be documented.
2. It depends on bug 86296, and the fixed-in version for that bug is glusterfs-3.4.0qa5. Is the same fix available in the latest build?
3. Could you please provide an update on the third issue? What is the decision?
--- Additional comment from Scott Haines on 2013-09-27 22:07:27 MVT ---
Targeting for 3.0.0 (Denali) release.
--- Additional comment from errata-xmlrpc on 2014-04-10 05:20:34 MVT ---
This bug has been dropped from advisory RHEA-2014:17485 by Scott Haines (shaines)
--- Additional comment from Susant Kumar Palai on 2014-05-22 11:04:45 MVT ---
Here is an observation on a side effect of the upstream patch http://review.gluster.org/#/c/6983/. This patch works for all cases except one corner case.
Currently we take the "permission info" for healing from a brick only if it has a layout xattr. Let's say we add a new brick to a volume (the newly added brick will not have a layout xattr) and all bricks except the newly added one go down. Then we change the permissions of the root directory, so only the new brick has witnessed the new permissions. If we bring all the bricks back up, the old permissions will be healed across all bricks, because we do not take permission info from a brick that has no layout xattr.
Here is the demo:
                        brick1                      newly added brick
permission (initial)    755                         755
t0                      UP                          ADDED BRICK
t1                      CHILD_DOWN
t2                                                  CHANGE PERMISSION, let's say
                                                    777, on the mount point
t3                      CHILD_UP
t4                      Heal 755 to all bricks,
                        as only this brick has
                        the layout xattr
Final permission after healing
t5                      755                         755    -----> should have 777
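A hypothetical way to check whether a brick root carries the DHT layout xattr referred to above (a freshly added brick should show no trusted.glusterfs.dht key until a layout is assigned to it):
# run on a brick server, assuming the brick path /home/t1 from the steps above
getfattr -m . -d -e hex /home/t1 | grep trusted.glusterfs.dht
# an existing brick prints a trusted.glusterfs.dht=0x... line;
# a newly added brick prints nothing for this key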
$ Why does not having a layout xattr on the root of a newly added brick help?
# If we assign a layout on the root of the newly added brick, then, since it will have the latest ctime, it may corrupt the permissions for the volume.
example:
        brick1          brick2          brick3
t0      777 (perm*)     777 (perm*)     added as a brick; will have
                                        perm* 755 by default
t1                                      if it has a layout xattr and a
                                        higher ctime, 755 will be
                                        healed to all
        Final permission will be 755 instead of 777 (bad)
t2      755             755             755
Hence, creating a zeroed layout for root will create the above problem.
$ Why is healing on the revalidate path chosen?
# We do not do metadata healing on the fresh-lookup path because directory self-heal is carried out on the lookup path; once we self-heal the directory on a fresh lookup, we follow revalidate_cbk and would not be able to heal the permissions for that directory. So healing on the revalidate path was chosen.
--- Additional comment from RHEL Product and Program Management on 2014-05-26 14:01:14 MVT ---
This bug report previously had all acks and release flag approved.
However since at least one of its acks has been changed, the
release flag has been reset to ? by the bugbot (pm-rhel). The
ack needs to become approved before the release flag can become
approved again.
--- Additional comment from Scott Haines on 2014-06-08 23:35:12 MVT ---
Per engineering management on 06/06/2014, moving back to rhs-future backlog.
--- Additional comment from errata-xmlrpc on 2014-06-20 14:20:27 MVT ---
This bug has been dropped from advisory RHEA-2014:17485 by Vivek Agarwal (vagarwal)
--- Additional comment from Nagaprasad Sathyanarayana on 2015-03-26 17:32:49 MVT ---
After triage, all leads agreed that this BZ cannot be fixed for the 3.1.0 release.
--- Additional comment from John Skeoch on 2015-04-20 05:22:54 MVT ---
User racpatel's account has been closed
--- Additional comment from John Skeoch on 2015-04-20 05:25:23 MVT ---
User racpatel's account has been closed
--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-11-27 05:15:34 EST ---
This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.
If this bug should be proposed for a different release, please manually change the proposed release flag.
--- Additional comment from Atin Mukherjee on 2017-02-23 09:11:17 EST ---
upstream patch : http://review.gluster.org/15468
--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-03-02 02:27:03 EST ---
Since this bug has been approved for the RHGS 3.3.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.3.0+', and through the Internal Whiteboard entry of '3.3.0', the Target Release is being automatically set to 'RHGS 3.3.0'
--- Additional comment from Atin Mukherjee on 2017-06-28 06:36:48 EDT ---
At the rhgs-3.3.0 pre-devel-freeze status meeting, all stakeholders agreed to defer this bug from the rhgs-3.3.0 release. More details at http://post-office.corp.redhat.com/archives/gluster-storage-release-team/2017-June/msg00123.html
REVIEW: https://review.gluster.org/19157 (cluster/dht : User xattrs are not healed after brick stop/start) posted (#1) for review on release-3.12 by MOHIT AGRAWAL
The fix has been included in GlusterFS 3.13 and later releases. This bug is being closed, as the patch that fixes this issue is large and does not qualify for backporting to a stable release.