+++ This bug was initially created as a clone of Bug #1584098 +++
+++ This bug was initially created as a clone of Bug #1582119 +++

Description of problem:
=======================
Custom extended attributes set on a directory are not healed after bringing back the down sub-volumes.

Client:
=======
getfattr -n user.foo c
# file: c
user.foo="bar1"

Backend bricks:
===============
[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/distrepx3-b0/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000007ffffffc9ffffffa

[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick1/distrepx3-b1/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000009ffffffbbffffff9

Version-Release number of selected component (if applicable):
3.12.2-11.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a distributed-replicated volume and start it.
2) FUSE mount it on a client.
3) From the client, create a few directories of depth 3.
4) Bring down a few DHT sub-volumes using the gf_attach command. (I brought down 2 DHT sub-volumes and 1 brick in another replica pair.)
5) Make metadata changes to the directories, e.g. uid, gid, permissions, setxattr.
6) Bring back the down sub-volumes.
7) Check all the bricks for consistency.
(A rough shell sketch of these steps is given after the automated comment below.)

Actual results:
===============
Custom extended attributes are not healed after bringing back the down sub-volumes.

Expected results:
=================
No inconsistencies.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-05-24 05:31:08 EDT ---

This bug is automatically being proposed for the release of Red Hat Gluster Storage 3 under active development and open for bug fixes, by setting the release flag 'rhgs-3.4.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.
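For readers who want to retry the scenario locally, here is a minimal shell sketch of the steps above. The volume name, host names, brick paths and mount point are illustrative assumptions, and the brick processes are simply killed here rather than detached with gf_attach as in the original run.

# Assumptions: replica-3 volume "testvol" across hosts h1/h2/h3, FUSE mount at /mnt/testvol.
gluster volume create testvol replica 3 \
    h1:/bricks/brick0/testvol h2:/bricks/brick0/testvol h3:/bricks/brick0/testvol \
    h1:/bricks/brick1/testvol h2:/bricks/brick1/testvol h3:/bricks/brick1/testvol
gluster volume start testvol
mount -t glusterfs h1:/testvol /mnt/testvol

# Step 3: directories of depth 3
mkdir -p /mnt/testvol/a/b/c

# Step 4: bring down one DHT sub-volume (all three bricks of a replica set);
# the reporter used gf_attach, killing the brick PIDs is a simpler stand-in
kill <pid-of-brick0-on-h1> <pid-of-brick0-on-h2> <pid-of-brick0-on-h3>

# Step 5: metadata changes while the sub-volume is down
chown user1:user1 /mnt/testvol/a
chmod 755 /mnt/testvol/a
setfattr -n user.foo -v bar1 /mnt/testvol/a

# Steps 6-7: bring the bricks back and compare xattrs on every brick (run on each server)
gluster volume start testvol force
getfattr -d -e hex -m . /bricks/brick0/testvol/a /bricks/brick1/testvol/a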
--- Additional comment from Prasad Desala on 2018-05-24 05:36:37 EDT ---

sosreports @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/
Collected xattrs from all the backend bricks @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/

volname: distrepx3
protocol: FUSE

[root@dhcp42-143 /]# gluster v info distrepx3

Volume Name: distrepx3
Type: Distributed-Replicate
Volume ID: 4bd32cc9-0020-400f-9b48-af4bb50210d2
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 3 = 24
Transport-type: tcp
Bricks:
Brick1: 10.70.42.143:/bricks/brick0/distrepx3-b0
Brick2: 10.70.43.41:/bricks/brick0/distrepx3-b0
Brick3: 10.70.43.35:/bricks/brick0/distrepx3-b0
Brick4: 10.70.43.37:/bricks/brick0/distrepx3-b0
Brick5: 10.70.42.143:/bricks/brick1/distrepx3-b1
Brick6: 10.70.43.41:/bricks/brick1/distrepx3-b1
Brick7: 10.70.43.35:/bricks/brick1/distrepx3-b1
Brick8: 10.70.43.37:/bricks/brick1/distrepx3-b1
Brick9: 10.70.42.143:/bricks/brick2/distrepx3-b2
Brick10: 10.70.43.41:/bricks/brick2/distrepx3-b2
Brick11: 10.70.43.35:/bricks/brick2/distrepx3-b2
Brick12: 10.70.43.37:/bricks/brick2/distrepx3-b2
Brick13: 10.70.42.143:/bricks/brick3/distrepx3-b3
Brick14: 10.70.43.41:/bricks/brick3/distrepx3-b3
Brick15: 10.70.43.35:/bricks/brick3/distrepx3-b3
Brick16: 10.70.43.37:/bricks/brick3/distrepx3-b3
Brick17: 10.70.42.143:/bricks/brick4/distrepx3-b4
Brick18: 10.70.43.41:/bricks/brick4/distrepx3-b4
Brick19: 10.70.43.35:/bricks/brick4/distrepx3-b4
Brick20: 10.70.43.37:/bricks/brick4/distrepx3-b4
Brick21: 10.70.42.143:/bricks/brick5/distrepx3-b5
Brick22: 10.70.43.41:/bricks/brick5/distrepx3-b5
Brick23: 10.70.43.35:/bricks/brick5/distrepx3-b5
Brick24: 10.70.43.37:/bricks/brick5/distrepx3-b5
Options Reconfigured:
diagnostics.client-log-level: TRACE
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Brought down bricks:
====================
[root@dhcp42-143 ~]# gluster v status
Status of volume: distrepx3
Gluster process                                 TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3-b0  N/A       N/A        N       N/A
Brick 10.70.43.41:/bricks/brick0/distrepx3-b0   N/A       N/A        N       N/A
Brick 10.70.43.35:/bricks/brick0/distrepx3-b0   N/A       N/A        N       N/A
Brick 10.70.43.37:/bricks/brick0/distrepx3-b0   N/A       N/A        N       N/A
Brick 10.70.42.143:/bricks/brick1/distrepx3-b1  N/A       N/A        N       N/A
Brick 10.70.43.41:/bricks/brick1/distrepx3-b1   N/A       N/A        N       N/A
Brick 10.70.43.35:/bricks/brick1/distrepx3-b1   N/A       N/A        N       N/A
Brick 10.70.43.37:/bricks/brick1/distrepx3-b1   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3-b2  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-b2   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-b2   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick2/distrepx3-b2   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3-b3  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-b3   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-b3   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick3/distrepx3-b3   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3-b4  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-b4   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-b4   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick4/distrepx3-b4   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3-b5  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-b5   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-b5   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick5/distrepx3-b5   49153     0          Y       31268
Self-heal Daemon on localhost                   N/A       N/A        Y       14510
Self-heal Daemon on 10.70.43.41                 N/A       N/A        Y       21427
Self-heal Daemon on 10.70.43.37                 N/A       N/A        Y       6899
Self-heal Daemon on 10.70.43.35                 N/A       N/A        Y       9416

Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp42-143 /]# gluster v status distrepx3
Status of volume: distrepx3
Gluster process                                 TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3-b0  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick0/distrepx3-b0   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick0/distrepx3-b0   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick0/distrepx3-b0   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick1/distrepx3-b1  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick1/distrepx3-b1   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick1/distrepx3-b1   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick1/distrepx3-b1   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3-b2  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-b2   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-b2   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick2/distrepx3-b2   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3-b3  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-b3   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-b3   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick3/distrepx3-b3   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3-b4  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-b4   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-b4   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick4/distrepx3-b4   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3-b5  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-b5   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-b5   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick5/distrepx3-b5   49153     0          Y       31268
Self-heal Daemon on localhost                   N/A       N/A        Y       5898
Self-heal Daemon on 10.70.43.41                 N/A       N/A        Y       13498
Self-heal Daemon on 10.70.43.35                 N/A       N/A        Y       809
Self-heal Daemon on 10.70.43.37                 N/A       N/A        Y       30996

Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks

Mount point output:
===================
[root@dhcp37-110 distrepx3_new]# ls -lRt
.:
total 24
drwxr-xr-x. 3 root  root  4096 May 24 11:29 f
drwxrwxrwx. 3 root  root  4096 May 24 11:29 e
drwxr-xr-x. 3 user1 user1 4096 May 24 11:28 d
drwxr-xr-x. 3 user1 root  4096 May 24 11:28 c
drwxr-xr-x. 3 root  user1 4096 May 24 11:28 b
drwxr-xr-x. 3 root  root  4096 May 24 11:28 a

./f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 i

./f/i:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 j

./f/i/j:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 k

./f/i/j/k:
total 0

./e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f

./e/f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 g

./e/f/g:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 h

./e/f/g/h:
total 0

./d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e

./d/e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f

./d/e/f:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 g

./d/e/f/g:
total 0

./c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d

./c/d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e

./c/d/e:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 f

./c/d/e/f:
total 0

./b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c

./b/c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d

./b/c/d:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 e

./b/c/d/e:
total 0

./a:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 b

./a/b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c

./a/b/c:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 d

./a/b/c/d:
total 0

--- Additional comment from Mohit Agrawal on 2018-05-29 04:56:08 EDT ---

Hi,

I have analyzed the root cause of why the xattr was not healed. The internal (MDS) xattr was not updated on one of the AFR children because that child was down at the time the xattr was updated. Ideally, AFR should heal it after the down sub-volumes are started. If AFR returns an MDS value of 0 to DHT from the wrong sub-volume, DHT will not take any action to heal the xattr. I was able to reproduce this and have discussed it with Karthik as well, so I am changing the component from DHT to AFR.

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.37/37_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x6261723

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.35/35_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0x00000000
user.foo=0x62617231
user.foo1=0x62617232

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.42.143/143.bug_b2
# file: bricks/brick2/distrepx3-b2/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x62617232

Regards,
Mohit Agrawal

--- Additional comment from Ravishankar N on 2018-05-29 04:59:59 EDT ---

Assigning to Karthik as he's taking a look at this.

--- Additional comment from Mohit Agrawal on 2018-05-30 05:33:29 EDT ---

Hi,

I checked this further with the AFR team. The MDS internal xattr was not healed after the sub-volume was started because posix ignores it, so I am assigning this to myself to resolve.

Regards,
Mohit Agrawal
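As a quick way to spot this kind of inconsistency, the same getfattr dump can be collected from every brick and compared by eye or with diff. A minimal sketch, assuming the four hosts and the brickN/distrepx3-bN layout shown earlier in this report, plus passwordless ssh between the nodes:

# Dump the xattrs of the affected directory from every brick of the volume
for host in 10.70.42.143 10.70.43.41 10.70.43.35 10.70.43.37; do
    for n in 0 1 2 3 4 5; do
        echo "== $host brick$n =="
        ssh "$host" "getfattr -d -e hex -m . /bricks/brick$n/distrepx3-b$n/c" 2>/dev/null
    done
done
# Within a replica set, trusted.glusterfs.dht.mds and the user.* xattrs should match;
# in the dumps above, the mds value on the 10.70.43.35 brick differs from its replica peers.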
--- Additional comment from Worker Ant on 2018-05-30 05:44:53 EDT ---

REVIEW: https://review.gluster.org/20102 (dht: Delete MDS internal xattr from dict in dht_getxattr_cbk) posted (#1) for review on master by MOHIT AGRAWAL

--- Additional comment from Worker Ant on 2018-06-02 23:22:36 EDT ---

COMMIT: https://review.gluster.org/20102 committed in master by "Raghavendra G" <rgowdapp> with a commit message:

dht: Delete MDS internal xattr from dict in dht_getxattr_cbk

Problem: When AFR fetches xattrs in order to heal them, it is not able to fetch the MDS xattr because posix_getxattr has a check that ignores the xattr if its name is MDS.

Solution: Ignore the same xattr with a check in dht_getxattr_cbk instead of having the check in posix_getxattr.

BUG: 1584098
Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1584098
Signed-off-by: Mohit Agrawal <moagrawa>
REVIEW: https://review.gluster.org/20615 (dht: Delete MDS internal xattr from dict in dht_getxattr_cbk) posted (#1) for review on release-4.1 by Kotresh HR
COMMIT: https://review.gluster.org/20615 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with a commit message:

dht: Delete MDS internal xattr from dict in dht_getxattr_cbk

Problem: When AFR fetches xattrs in order to heal them, it is not able to fetch the MDS xattr because posix_getxattr has a check that ignores the xattr if its name is MDS.

Solution: Ignore the same xattr with a check in dht_getxattr_cbk instead of having the check in posix_getxattr.

Backport of:
> BUG: 1584098
> Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
> Signed-off-by: Mohit Agrawal <moagrawa>

Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1611116
Signed-off-by: Mohit Agrawal <moagrawa>
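Once a build containing this fix is installed, a rough way to re-verify the original scenario could look like the following sketch. The volume name, brick path and the expectation that a normal self-heal pass picks up the metadata are assumptions based on this report, not a formal test plan.

# Restart any bricks that are still down and trigger self-heal (assumed volume name: distrepx3)
gluster volume start distrepx3 force
gluster volume heal distrepx3
gluster volume heal distrepx3 info    # should eventually report no pending entries

# On each server, every brick of a replica set should now show identical
# user.* and trusted.glusterfs.dht.mds values for the affected directory
getfattr -d -e hex -m . /bricks/brick1/distrepx3-b1/c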
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-4.1.3, please open a new bug report.

glusterfs-4.1.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-August/000111.html
[2] https://www.gluster.org/pipermail/gluster-users/