Description of problem:
While replacing a data brick in an arbiter volume, once the brick is replaced I see that there are pending entries reported from the replaced brick, and on checking the xattrs of the file listed in the heal info output of the replaced brick, I see that it accuses itself.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-5.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install HC.
2. Create an arbiter volume with granular-entry-heal on.
3. Replace one of the data bricks in the vmstore volume (a command sketch appears under Additional info below).
4. Note that no I/O is running on the volume.

Actual results:
All the bricks, including the brick which was just replaced, have entries that need to be healed.

Expected results:
The brick which was just replaced should not have any entries listed.

Additional info:
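As referenced in step 3, a hedged sketch of the replace-brick invocation for this setup (bash); the path of the original brick on 10.70.36.80 is not recorded in this report, so <old-brick-path> below is only a placeholder:

# Replace the retired data brick on 10.70.36.80 with the new brick path.
# <old-brick-path> is a placeholder; the new brick is the one named below.
gluster volume replace-brick vmstore \
    10.70.36.80:<old-brick-path> 10.70.36.80:/rhgs/brick1/vmstore2 \
    commit force

# Pending heals were then inspected with:
gluster volume heal vmstore info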
Here the brick which has been replaced is 10.70.36.80:/rhgs/brick1/vmstore2.

gluster volume info:
==========================================
[root@rhsqa-grafton1 ~]# gluster volume info vmstore

Volume Name: vmstore
Type: Replicate
Volume ID: 8f1770b2-fcfa-4b56-80da-e72a3548ece2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.79:/rhgs/brick1/vmstorenew1
Brick2: 10.70.36.80:/rhgs/brick1/vmstore2
Brick3: 10.70.36.81:/rhgs/brick1/vmstore (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
performance.strict-o-direct: on
network.ping-timeout: 30
user.cifs: off
cluster.granular-entry-heal: on

[root@rhsqa-grafton1 ~]# gluster volume status vmstore
Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.36.79:/rhgs/brick1/vmstorenew1  49160     0          Y       10365
Brick 10.70.36.80:/rhgs/brick1/vmstore2     49155     0          Y       28693
Brick 10.70.36.81:/rhgs/brick1/vmstore      49157     0          Y       16824
Self-heal Daemon on localhost               N/A       N/A        Y       22995
Self-heal Daemon on 10.70.36.81             N/A       N/A        Y       19747
Self-heal Daemon on 10.70.36.80             N/A       N/A        Y       28698

Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks

getfattr output from all the nodes for the file which gets listed in the heal info output for the replaced brick:

[root@rhsqa-grafton2 ~]# getfattr -d -m . -e hex /rhgs/brick1/vmstore2/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/vmstore2/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.vmstore-client-1=0x000000000000000300000000
trusted.gfid=0xe47576819666494e888c2961cd7f7b16
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

getfattr -d -m . -e hex /rhgs/brick1/vmstorenew1/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/vmstorenew1/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.vmstore-client-1=0x000000010000000400000001
trusted.gfid=0xe47576819666494e888c2961cd7f7b16
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@rhsqa-grafton3 ~]# getfattr -d -m . -e hex /rhgs/brick1/vmstore/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/vmstore/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.vmstore-client-0=0x000000000000000000000000
trusted.afr.vmstore-client-1=0x000000010000000500000001
trusted.gfid=0xe47576819666494e888c2961cd7f7b16
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

stat of the file on all three backend nodes:
==================================================
[root@rhsqa-grafton2 ~]# stat /rhgs/brick1/vmstore2/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
  File: ‘/rhgs/brick1/vmstore2/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058’
  Size: 49         Blocks: 0          IO Block: 4096   directory
Device: fd0bh/64779d    Inode: 1342178870  Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:glusterd_brick_t:s0
Access: 2016-11-21 17:41:04.304823424 +0530
Modify: 2016-11-21 17:41:57.563167332 +0530
Change: 2016-11-21 17:43:56.368933444 +0530
 Birth: -

[root@rhsqa-grafton1 ~]# stat /rhgs/brick1/vmstorenew1/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
  File: ‘/rhgs/brick1/vmstorenew1/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058’
  Size: 146        Blocks: 0          IO Block: 4096   directory
Device: fd0bh/64779d    Inode: 1140852236  Links: 2
Access: (0755/drwxr-xr-x)  Uid: (   36/    vdsm)   Gid: (   36/     kvm)
Context: system_u:object_r:glusterd_brick_t:s0
Access: 2016-11-21 14:33:56.752987736 +0530
Modify: 2016-11-21 14:33:59.538007759 +0530
Change: 2016-11-21 17:44:56.367664067 +0530
 Birth: -

[root@rhsqa-grafton3 ~]# stat /rhgs/brick1/vmstore/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058
  File: ‘/rhgs/brick1/vmstore/5ba64954-f23a-4c48-b62c-06db29bf4891/images/4c8677e1-8d7e-41e9-b655-9c7c798f5058’
  Size: 146        Blocks: 0          IO Block: 4096   directory
Device: fd0bh/64779d    Inode: 469762572  Links: 2
Access: (0755/drwxr-xr-x)  Uid: (   36/    vdsm)   Gid: (   36/     kvm)
Context: system_u:object_r:glusterd_brick_t:s0
Access: 2016-11-10 15:27:37.200960000 +0530
Modify: 2016-11-10 15:30:37.899157551 +0530
Change: 2016-11-21 17:44:56.368775440 +0530
 Birth: -
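For reference, the 12-byte trusted.afr.* changelog values above can be read as three big-endian 32-bit counters (data, metadata and entry pending operations). A minimal sketch of decoding the value seen on the replaced brick, assuming bash:

# Decode trusted.afr.vmstore-client-1 as reported on the replaced brick
# (10.70.36.80:/rhgs/brick1/vmstore2, i.e. Brick2 / client-1 itself).
val=000000000000000300000000
echo "data=$((16#${val:0:8})) metadata=$((16#${val:8:8})) entry=$((16#${val:16:8}))"
# -> data=0 metadata=3 entry=0: the brick records pending metadata heals
#    against its own index, which is the self-accusation described above.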
Hi Kasturi, I recently sent a fix (https://code.engineering.redhat.com/gerrit/103750) which prevents self-blaming xattrs. Could you check whether you are able to reproduce the issue on a newer release (glusterfs-3.8.4-23 or greater)?