Bug 1611116 - 'custom extended attributes' set on a directory are not healed after bringing back the down sub-volumes
Summary: 'custom extended attributes' set on a directory are not healed after bringing...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kotresh HR
QA Contact:
URL:
Whiteboard:
Depends On: 1582119 1584098
Blocks:
 
Reported: 2018-08-02 05:49 UTC by Kotresh HR
Modified: 2018-08-29 12:45 UTC
CC List: 7 users

Fixed In Version: glusterfs-4.1.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1584098
Environment:
Last Closed: 2018-08-29 12:45:07 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Kotresh HR 2018-08-02 05:49:59 UTC
+++ This bug was initially created as a clone of Bug #1584098 +++

+++ This bug was initially created as a clone of Bug #1582119 +++

Description of problem:
=======================
'custom extended attributes' set on a directory are not healed after bringing back the down sub-volumes.
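
For reference, a custom xattr like the one shown in the outputs below would be set from the client mount with setfattr; the mount path here is an assumption for illustration, while the directory name 'c' and the value 'bar1' are taken from the outputs that follow:

# On the FUSE mount (mount path is an assumption for illustration)
setfattr -n user.foo -v bar1 /mnt/distrepx3/c
# Read it back to confirm it is visible from the client
getfattr -n user.foo /mnt/distrepx3/c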

Client:
=======
getfattr -n user.foo c
# file: c
user.foo="bar1"

Backend bricks:
===============
[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/distrepx3-b0/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000007ffffffc9ffffffa

[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick1/distrepx3-b1/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000009ffffffbbffffff9


Version-Release number of selected component (if applicable):
3.12.2-11.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
====================
1) Create a distributed-replicated volume and start it.
2) FUSE mount it on a client.
3) From the client, create a few directories of depth 3.
4) Now, bring down a few DHT sub-volumes using the gf_attach command. (I brought down 2 DHT sub-volumes and 1 brick in another replica pair.)
5) Make metadata changes to the directories, such as uid, gid, permissions, and setxattr.
6) Bring back the down sub-volumes.
7) Check all the bricks for consistency (a command sketch of these steps follows the list).
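
A minimal command sketch of steps 3-7, assuming a volume named distrepx3 mounted at /mnt/distrepx3 and the brick layout from this report (hostnames, paths, PIDs, users and the choice of brick/directory are illustrative):

# 3) Create directories of depth 3 from the client mount
mkdir -p /mnt/distrepx3/a/b/c

# 4) Bring down some DHT sub-volumes. The reporter used gf_attach against the
#    brick process; killing the brick PID reported by 'gluster v status'
#    (placeholder below) has the same effect.
kill -15 <brick-pid>

# 5) Make metadata changes while those sub-volumes are down
chown user1:user1 /mnt/distrepx3/a
chmod 755 /mnt/distrepx3/a
setfattr -n user.foo -v bar1 /mnt/distrepx3/a

# 6) Bring the downed bricks back
gluster volume start distrepx3 force

# 7) Compare the xattrs of the directory on every brick; after heal they
#    should be identical within each replica set
getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/a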

Actual results:
================
'custom extended attributes' are not healed after bringing back the down sub-volumes

Expected results:
=================
No inconsistencies.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-05-24 05:31:08 EDT ---

This bug is automatically being proposed for the release of Red Hat Gluster Storage 3 under active development and open for bug fixes, by setting the release flag 'rhgs-3.4.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Prasad Desala on 2018-05-24 05:36:37 EDT ---

sosreports@ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/
Collected xattr from all the backend bricks @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/

volname: distrepx3
protocol: FUSE

[root@dhcp42-143 /]# gluster v info distrepx3
 
Volume Name: distrepx3
Type: Distributed-Replicate
Volume ID: 4bd32cc9-0020-400f-9b48-af4bb50210d2
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 3 = 24
Transport-type: tcp
Bricks:
Brick1: 10.70.42.143:/bricks/brick0/distrepx3-b0
Brick2: 10.70.43.41:/bricks/brick0/distrepx3-b0
Brick3: 10.70.43.35:/bricks/brick0/distrepx3-b0
Brick4: 10.70.43.37:/bricks/brick0/distrepx3-b0
Brick5: 10.70.42.143:/bricks/brick1/distrepx3-b1
Brick6: 10.70.43.41:/bricks/brick1/distrepx3-b1
Brick7: 10.70.43.35:/bricks/brick1/distrepx3-b1
Brick8: 10.70.43.37:/bricks/brick1/distrepx3-b1
Brick9: 10.70.42.143:/bricks/brick2/distrepx3-b2
Brick10: 10.70.43.41:/bricks/brick2/distrepx3-b2
Brick11: 10.70.43.35:/bricks/brick2/distrepx3-b2
Brick12: 10.70.43.37:/bricks/brick2/distrepx3-b2
Brick13: 10.70.42.143:/bricks/brick3/distrepx3-b3
Brick14: 10.70.43.41:/bricks/brick3/distrepx3-b3
Brick15: 10.70.43.35:/bricks/brick3/distrepx3-b3
Brick16: 10.70.43.37:/bricks/brick3/distrepx3-b3
Brick17: 10.70.42.143:/bricks/brick4/distrepx3-b4
Brick18: 10.70.43.41:/bricks/brick4/distrepx3-b4
Brick19: 10.70.43.35:/bricks/brick4/distrepx3-b4
Brick20: 10.70.43.37:/bricks/brick4/distrepx3-b4
Brick21: 10.70.42.143:/bricks/brick5/distrepx3-b5
Brick22: 10.70.43.41:/bricks/brick5/distrepx3-b5
Brick23: 10.70.43.35:/bricks/brick5/distrepx3-b5
Brick24: 10.70.43.37:/bricks/brick5/distrepx3-b5
Options Reconfigured:
diagnostics.client-log-level: TRACE
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Brought down bricks:
====================
[root@dhcp42-143 ~]# gluster v status
Status of volume: distrepx3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3
-b0                                         N/A       N/A        N       N/A  
Brick 10.70.43.41:/bricks/brick0/distrepx3-
b0                                          N/A       N/A        N       N/A  
Brick 10.70.43.35:/bricks/brick0/distrepx3-
b0                                          N/A       N/A        N       N/A  
Brick 10.70.43.37:/bricks/brick0/distrepx3-
b0                                          N/A       N/A        N       N/A  
Brick 10.70.42.143:/bricks/brick1/distrepx3
-b1                                         N/A       N/A        N       N/A  
Brick 10.70.43.41:/bricks/brick1/distrepx3-
b1                                          N/A       N/A        N       N/A  
Brick 10.70.43.35:/bricks/brick1/distrepx3-
b1                                          N/A       N/A        N       N/A  
Brick 10.70.43.37:/bricks/brick1/distrepx3-
b1                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3
-b2                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-
b2                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-
b2                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick2/distrepx3-
b2                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3
-b3                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-
b3                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-
b3                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick3/distrepx3-
b3                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3
-b4                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-
b4                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-
b4                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick4/distrepx3-
b4                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3
-b5                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-
b5                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-
b5                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick5/distrepx3-
b5                                          49153     0          Y       31268
Self-heal Daemon on localhost               N/A       N/A        Y       14510
Self-heal Daemon on 10.70.43.41             N/A       N/A        Y       21427
Self-heal Daemon on 10.70.43.37             N/A       N/A        Y       6899 
Self-heal Daemon on 10.70.43.35             N/A       N/A        Y       9416 
 
Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks


[root@dhcp42-143 /]# gluster v status distrepx3
Status of volume: distrepx3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3
-b0                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick0/distrepx3-
b0                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick0/distrepx3-
b0                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick0/distrepx3-
b0                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick1/distrepx3
-b1                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick1/distrepx3-
b1                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick1/distrepx3-
b1                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick1/distrepx3-
b1                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3
-b2                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-
b2                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-
b2                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick2/distrepx3-
b2                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3
-b3                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-
b3                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-
b3                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick3/distrepx3-
b3                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3
-b4                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-
b4                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-
b4                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick4/distrepx3-
b4                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3
-b5                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-
b5                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-
b5                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick5/distrepx3-
b5                                          49153     0          Y       31268
Self-heal Daemon on localhost               N/A       N/A        Y       5898 
Self-heal Daemon on 10.70.43.41             N/A       N/A        Y       13498
Self-heal Daemon on 10.70.43.35             N/A       N/A        Y       809  
Self-heal Daemon on 10.70.43.37             N/A       N/A        Y       30996
 
Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks
 
Mount point output:
==================
[root@dhcp37-110 distrepx3_new]# ls -lRt
.:
total 24
drwxr-xr-x. 3 root  root  4096 May 24 11:29 f
drwxrwxrwx. 3 root  root  4096 May 24 11:29 e
drwxr-xr-x. 3 user1 user1 4096 May 24 11:28 d
drwxr-xr-x. 3 user1 root  4096 May 24 11:28 c
drwxr-xr-x. 3 root  user1 4096 May 24 11:28 b
drwxr-xr-x. 3 root  root  4096 May 24 11:28 a

./f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 i

./f/i:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 j

./f/i/j:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 k

./f/i/j/k:
total 0

./e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f

./e/f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 g

./e/f/g:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 h

./e/f/g/h:
total 0

./d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e

./d/e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f

./d/e/f:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 g

./d/e/f/g:
total 0

./c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d

./c/d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e

./c/d/e:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 f

./c/d/e/f:
total 0

./b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c

./b/c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d

./b/c/d:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 e

./b/c/d/e:
total 0

./a:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 b

./a/b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c

./a/b/c:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 d

./a/b/c/d:
total 0

--- Additional comment from Mohit Agrawal on 2018-05-29 04:56:08 EDT ---

Hi,

  I have analyzed the root cause of why the xattr was not healed. The internal
  MDS xattr was not updated on one of the AFR children because that child was
  down at the time the xattr was updated. Ideally, AFR should heal it after the
  down sub-volumes are started. If AFR returns an MDS value of 0 to DHT from the
  wrong sub-volume, DHT will not take any action to heal the xattr. I was able
  to reproduce this and discussed it with Karthik as well, so I am changing the
  component from DHT to AFR.
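
The mismatch can be seen by comparing the DHT MDS xattr of the same directory across its replica bricks; a sketch using the brick path from this report (run as root on each brick node):

# Query only the MDS xattr for directory 'c' on this brick
getfattr -n trusted.glusterfs.dht.mds -e hex /bricks/brick1/distrepx3-b1/c
# In the dumps below, one replica still carries 0x00000000 while another
# carries 0xfffffffd, i.e. the value was never brought back in sync.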

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.37/37_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x6261723


http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.35/35_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0x00000000
user.foo=0x62617231
user.foo1=0x62617232


http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.42.143/143.bug_b2
# file: bricks/brick2/distrepx3-b2/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x62617232


Regards
Mohit Agrawal

--- Additional comment from Ravishankar N on 2018-05-29 04:59:59 EDT ---

Assigning to Karthik as he's taking a look at this.

--- Additional comment from Mohit Agrawal on 2018-05-30 05:33:29 EDT ---

Hi,

I have checked this further with the AFR team. The MDS internal xattr was not healed
after the sub-volume was started back because posix ignores it (posix_getxattr filters it out), so I am assigning this to myself to resolve it.

Regards
Mohit Agrawal
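
For context, once the downed bricks are back, metadata heal can be triggered and inspected with the standard tooling (volume name is the one from this report; the mount path is an assumption); before this fix, the custom xattrs on the directory still did not converge even after heal completed:

# Ask the self-heal daemon to process pending heals
gluster volume heal distrepx3
# List entries that are still pending heal
gluster volume heal distrepx3 info
# A lookup from the client also triggers heal on that directory
stat /mnt/distrepx3/c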

--- Additional comment from Worker Ant on 2018-05-30 05:44:53 EDT ---

REVIEW: https://review.gluster.org/20102 (dht: Delete MDS internal xattr from dict in dht_getxattr_cbk) posted (#1) for review on master by MOHIT AGRAWAL

--- Additional comment from Worker Ant on 2018-06-02 23:22:36 EDT ---

COMMIT: https://review.gluster.org/20102 committed in master by "Raghavendra G" <rgowdapp> with a commit message- dht: Delete MDS internal xattr from dict in dht_getxattr_cbk

Problem: At the time of fetching xattr to heal xattr by afr
         it is not able to fetch xattr because posix_getxattr
         has a check to ignore if xattr name is MDS

Solution: To ignore same xattr update a check in dht_getxattr_cbk
          instead of having a check in posix_getxattr

BUG: 1584098
Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1584098
Signed-off-by: Mohit Agrawal <moagrawa>
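
With this fix, one way to verify is to repeat the reproduction steps and then confirm that, within each replica set, the bricks report identical xattrs for the directory and the custom user.* xattrs are present on all of them; a sketch, with the brick hosts and path taken from this report:

# Dump all xattrs of directory 'c' as stored on brick0 of each host
for h in 10.70.42.143 10.70.43.41 10.70.43.35 10.70.43.37; do
    echo "== $h =="
    ssh root@"$h" getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/c
done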

Comment 1 Worker Ant 2018-08-02 06:02:41 UTC
REVIEW: https://review.gluster.org/20615 (dht: Delete MDS internal xattr from dict in dht_getxattr_cbk) posted (#1) for review on release-4.1 by Kotresh HR

Comment 2 Worker Ant 2018-08-16 14:32:04 UTC
COMMIT: https://review.gluster.org/20615 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with a commit message- dht: Delete MDS internal xattr from dict in dht_getxattr_cbk

Problem: At the time of fetching xattr to heal xattr by afr
         it is not able to fetch xattr because posix_getxattr
         has a check to ignore if xattr name is MDS

Solution: To ignore same xattr update a check in dht_getxattr_cbk
          instead of having a check in posix_getxattr

Backport of:
 > BUG: 1584098
 > Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
 > Signed-off-by: Mohit Agrawal <moagrawa>

Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1611116
Signed-off-by: Mohit Agrawal <moagrawa>

Comment 3 Shyamsundar 2018-08-29 12:45:07 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.1.3, please open a new bug report.

glusterfs-4.1.3 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-August/000111.html
[2] https://www.gluster.org/pipermail/gluster-users/

