
Bug 812287

Summary: Rebalance while a geo-replication session is active b/w master and slave causes non-uniform file sync on the slave.
Product: [Community] GlusterFS
Reporter: Vijaykumar Koppad <vkoppad>
Component: core
Assignee: Venky Shankar <vshankar>
Status: CLOSED CURRENTRELEASE
QA Contact: Vijaykumar Koppad <vkoppad>
Severity: urgent
Priority: medium
Version: mainline
CC: aavati, bbandari, csaba, gluster-bugs, nsathyan, rwheeler
Hardware: x86_64
OS: Linux
Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-04-17 11:38:27 UTC

Description Vijaykumar Koppad 2012-04-13 10:05:06 UTC
Description of problem:
If there is an active geo-replication session and a rebalance is run on the master volume, the slave volume, or both, the result is a non-uniform file sync on the slave; non-uniform means the arequal checksums of the master and slave mount points do not match.

Version-Release number of selected component (if applicable): Master 3.3.qa33


How reproducible:


Steps to Reproduce:
1. Start a geo-rep session b/w master and slave volumes over ssh.
2. Run a rebalance on the master volume, the slave volume, or both simultaneously.
3. Check the md5sums of both master and slave mount points (see the sketch below).
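
A minimal reproduction sketch of the steps above. The volume names (master,
slave), the slave host, and the mount points are assumptions for illustration,
and the geo-rep CLI syntax varies between releases:

# 1. Start a geo-rep session b/w master and slave over ssh
gluster volume geo-replication master slavehost::slave start

# 2. Run a rebalance on the master volume (repeat on the slave volume
#    to cover the other combinations)
gluster volume rebalance master start

# 3. Once rebalance and sync have settled, compare per-file md5sums of
#    the two mount points
diff <(cd /mnt/master && find . -type f -exec md5sum {} + | sort -k2) \
     <(cd /mnt/slave  && find . -type f -exec md5sum {} + | sort -k2)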
  
Actual results: Checksums don't match.


Expected results: Checksums should match.

Comment 1 Anand Avati 2012-04-16 18:21:57 UTC
CHANGE: http://review.gluster.com/3144 (glusterd/rebalance: Start process with xlator option client-pid -3) merged in master by Vijay Bellur (vijay)

Comment 2 Vijaykumar Koppad 2012-04-17 13:32:35 UTC
xtime of one of the directories on all the bricks BEFORE REBALANCE


[root@gqac005 exportdir]# getfattr -d -m . -e hex m[1-4]/large1/
# file: m1/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d0a7100090a53
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff

# file: m2/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d0a7100090a53
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff

# file: m3/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d0a7100090a53
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe

# file: m4/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d0a7100090a53
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
#######################################################
xtime of one of the directories on all the bricks AFTER REBALANCE
#######################################################


[root@gqac005 exportdir]# getfattr -d -m . -e hex m[1-4]/large1/
# file: m1/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d15310009dfd0
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: m2/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d15310009deca
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: m3/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d152f0005ca50
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: m4/large1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x74453f467b634f8494f800fc44a8614b
trusted.glusterfs.a4ce752e-5eef-4912-a729-96e2dc478722.xtime=0x4f8d152f0005cb6b
trusted.glusterfs.dht=0x00000001000000000000000055555554
###########################################################################

We can see a clear change in the xtimes; the rebalance is altering them.
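
For reference, the 16-hex-digit xtime values above appear to pack two
big-endian 32-bit fields (seconds, then microseconds); this layout is an
inference from the values themselves, not quoted from marker's source. A
small sketch to decode one:

xtime=0x4f8d0a7100090a53          # value taken from the dump above
secs=$(( xtime >> 32 ))           # high 32 bits: seconds since the epoch
usecs=$(( xtime & 0xffffffff ))   # low 32 bits: microseconds
printf '%s.%06d UTC\n' "$(date -u -d "@${secs}" '+%Y-%m-%d %H:%M:%S')" "$usecs"
# prints a timestamp in mid-April 2012, consistent with this report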

Comment 3 Anand Avati 2012-04-18 16:30:49 UTC
CHANGE: http://review.gluster.com/3180 (glusterd/rebalance: Start process with xlator option client-pid -3) merged in master by Vijay Bellur (vijay)

Comment 4 Vijaykumar Koppad 2012-05-25 11:37:54 UTC
################################################################################
xtimes of the backend bricks after the add-brick and before the rebalance 
################################################################################

[root@RHS2 exportdir]# getfattr -d -m . -e hex /exportdir/d*
getfattr: Removing leading '/' from absolute path names
# file: exportdir/d1   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.b42590ae-8ce9-4f55-a906-0bd12c52ea63.xtime=0x4fbf177100012e8e
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

# file: exportdir/d2   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.b42590ae-8ce9-4f55-a906-0bd12c52ea63.xtime=0x4fbf177100012ebd
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

# file: exportdir/d3   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

# file: exportdir/d4   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

################################################################################
xtimes of the backend bricks after the rebalance
################################################################################

[root@RHS2 exportdir]# getfattr -d -m . -e hex /exportdir/d*
getfattr: Removing leading '/' from absolute path names
# file: exportdir/d1   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.b42590ae-8ce9-4f55-a906-0bd12c52ea63.xtime=0x4fbf17cd0002aae1
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

# file: exportdir/d2   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.b42590ae-8ce9-4f55-a906-0bd12c52ea63.xtime=0x4fbf17cd0002aae1
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

# file: exportdir/d3   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.b42590ae-8ce9-4f55-a906-0bd12c52ea63.xtime=0x4fbf17cd0002aae1
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63

# file: exportdir/d4   
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.b42590ae-8ce9-4f55-a906-0bd12c52ea63.xtime=0x4fbf17cd0002aae1
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff
trusted.glusterfs.volume-id=0xb42590ae8ce94f55a9060bd12c52ea63


I saw this on release-3.3, so I am moving the bug back to the ASSIGNED state.

Comment 5 shishir gowda 2012-06-21 08:32:20 UTC
There can be more than one xtime attr (master/slaves). We need a new getxattr key, handled by marker, which returns all the set xtime keys, plus setxattr support in marker to set all of these keys.
DHT self-heal on a directory will call getxattr, followed by a setxattr on this key.
To prevent abuse of this key, frame->root->pid will be set to -3 for this stage of the operation.

Comment 6 Junaid 2012-06-26 11:44:49 UTC
As mentioned by Shishir, a virtual xattr key named "trusted.glusterfs.heal-xtime" will be passed by dht as part of self-heal in a getxattr call. The marker translator on the server side will handle it specially, returning a value in the format

   value = "trusted.glusterfs.vol-id1.xtime:v1,trusted.glusterfs.vol-id2.xtime:v2,.."

Then, when dht sends a setxattr with the above-mentioned key, the marker translator will perform multiple setxattrs by parsing the value (the format is mentioned above).
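
At the command level, the proposal might look roughly as follows. Note that
trusted.glusterfs.heal-xtime was only a design proposal in this comment, not
a shipped interface, and the brick path is borrowed from comment 4:

getfattr -n trusted.glusterfs.heal-xtime -e text /exportdir/d1
# hypothetical reply, in the format described above:
#   trusted.glusterfs.<vol-id1>.xtime:v1,trusted.glusterfs.<vol-id2>.xtime:v2,...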

Comment 7 Anand Avati 2012-07-03 21:35:08 UTC
Is it not sufficient if marker exposes the xattrs to rebalance's special PID and DHT performs the setxattr with XATTR_CREATE flag?

Comment 8 shishir gowda 2012-07-04 03:02:02 UTC
There is a related bug 821710.
The issue crops up in dht's directory self-heal. It can crop up outside of the rebalance process too, so we can't depend on the PID.

Comment 9 Amar Tumballi 2012-07-12 08:21:00 UTC
This needs more thought. I don't think this issue will cause serious problems, hence reducing the priority.

Comment 10 Junaid 2012-09-14 06:09:46 UTC
I had a discussion with Avati some time back. It was decided that it would be enough for marker to expose all the xattrs; there is no need for a new key to get and set the xtime xattrs. Instead, dht should send a getxattr without any key (which fetches all the xattrs on that file/directory), and all the exposed xattrs will be sent back in the reply. It is simple now: translators which do not want to expose their xattrs can filter them.
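
A user-level analogue of that distinction (the design itself concerns the
internal getxattr FOP, not the CLI; the brick path is borrowed from comment 4):

getfattr -d -m . -e hex /exportdir/d1          # keyless dump: every exposed xattr
getfattr -n trusted.gfid -e hex /exportdir/d1  # keyed fetch: one named xattr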

Comment 11 Amar Tumballi 2013-02-18 04:40:51 UTC
Any update on this? Should I be marking this bug as invalid? Please update with the latest findings/RCA, etc.

Comment 12 Venky Shankar 2013-03-14 06:16:24 UTC
Vijaykumar,

Could you please run the test(s) again? As we discussed, there can be xtime updates during an ongoing rebalance operation, but they should not result in md5 differences b/w master and slave.

Before checking the md5s, make sure there are no active rsync jobs running (see the sketch below).
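
One way to follow that advice before comparing, assuming the geo-rep sync
workers appear as plain rsync processes in the process table:

# wait until no rsync workers remain, then it is safe to take md5sums
while pgrep -x rsync > /dev/null; do
    sleep 10
done
echo "no active rsync jobs; safe to compare md5sums"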

Comment 13 Niels de Vos 2014-04-17 11:38:27 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user