1500433 – [geo-rep]: RSYNC throwing internal errors

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	geo-replication
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Kotresh HR
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	1476876
Blocks:	1502104

Reported:	2017-10-10 14:46 UTC by Kotresh HR
Modified:	2017-12-08 17:43 UTC
CC List:	8 users
Fixed In Version:	glusterfs-3.13.0
Clone Of:	1476876
Clones:	1502104
Environment:
Last Closed:	2017-12-08 17:43:44 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description Kotresh HR 2017-10-10 14:46:49 UTC

+++ This bug was initially created as a clone of Bug #1476876 +++

Description of problem:
=======================
Rsync throws internal errors of the form "rsync: get_xattr_data: lgetxattr(...) failed: No data available (61)":

[2017-07-31 09:46:14.352732] W [master(/rhs/brick3/b16):1067:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1501494366
[2017-07-31 09:46:15.125684] E [resource(/rhs/brick2/b10):1044:rsync] SSH: SYNC Error(Rsync): rsync: get_xattr_data: lgetxattr("/proc/3840/cwd/.gfid/00000000-0000-0000-0000-000000000001","trusted.glusterfs.volume-mark.2d516aed-ad11-43cc-8741-32bfc7391b74",0) failed: No data available (61)
[2017-07-31 09:46:15.126796] E [master(/rhs/brick2/b10):1046:process] _GMaster: changelogs CHANGELOG.1501494366 could not be processed completely - moving on...
[2017-07-31 09:46:15.132359] E [resource(/rhs/brick1/b4):1044:rsync] SSH: SYNC Error(Rsync): rsync: get_xattr_data: lgetxattr("/proc/3838/cwd/.gfid/00000000-0000-0000-0000-000000000001","trusted.glusterfs.volume-mark.2d516aed-ad11-43cc-8741-32bfc7391b74",0) failed: No data available (61)
[2017-07-31 09:46:15.133415] E [master(/rhs/brick1/b4):1046:process] _GMaster: changelogs CHANGELOG.1501494366 could not be processed completely - moving on...
[2017-07-31 09:46:15.158014] W [master(/rhs/brick3/b16):1067:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1501494366
[2017-07-31 09:46:16.12286] E [resource(/rhs/brick3/b16):1044:rsync] SSH: SYNC Error(Rsync): rsync: get_xattr_data: lgetxattr("/proc/3839/cwd/.gfid/00000000-0000-0000-0000-000000000001","trusted.glusterfs.volume-mark.2d516aed-ad11-43cc-8741-32bfc7391b74",0) failed: No data available (61)
[2017-07-31 09:46:16.13156] E [master(/rhs/brick3/b16):1046:process] _GMaster: changelogs CHANGELOG.1501494366 could not be processed completely - moving on...
[2017-07-31 09:47:21.598099] I [master(/rhs/brick2/b10):1132:crawl] _GMaster: slave's time: (1501494365, 0)
[2017-07-31 09:47:21.616106] I [master(/rhs/brick1/b4):1132:crawl] _GMaster: slave's time: (1501494365, 0)
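
For triage, the rsync xattr failures in a log like the one above can be counted per brick. This is a minimal sketch, assuming the "[resource(<brick>):...]" line layout seen in these excerpts; summarize_xattr_errors is a hypothetical helper name, not part of GlusterFS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count rsync lgetxattr failures per brick.
# Reads geo-rep log text on stdin and prints "<brick> <count>" lines.
# Assumes the "[resource(<brick>):<line>:rsync]" layout from this report.
summarize_xattr_errors() {
    grep -F 'get_xattr_data: lgetxattr' \
        | sed -E 's/^.*\[resource\(([^)]*)\):.*$/\1/' \
        | sort | uniq -c | awk '{print $2, $1}'
}
```

Usage would be along the lines of "summarize_xattr_errors < geo-rep.log", where geo-rep.log is a placeholder for the worker log file. Brick paths containing spaces are not handled.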

Version-Release number of selected component (if applicable):
==============================================================
mainline

Steps to Reproduce:
=====================
1. Create a 6-node master cluster and a 6-node slave cluster.
2. Create a 9x2 distributed-replicate (DR) master volume and slave volume.
3. Create and start a non-root geo-replication session.
4. Mount the master and slave volumes.
5. Create data on the master mount:

for i in {create,chmod,chown,chgrp,hardlink,symlink,truncate,rename}; do
    echo "------------------- This iteration is for fop $i -----------------" >> /root/result
    crefi --multi -n 5 -b 10 -d 10 --max=10k --min=5k --random -T 10 -t text --fop=$i /mnt/master/ 1>/dev/null 2>&1
    sleep 10
    echo "---Arequal Master for $i---" >> /root/result
    /root/arequal-checksum -p /mnt/master/ >> /root/result
    sleep 600
    echo "---Arequal Slave for $i---" >> /root/result
    /root/arequal-checksum -p /mnt/slave/ >> /root/result
done

All fops (create, chmod, chgrp, chown, hardlink, symlink, truncate, rename) are synced.
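
The master/slave comparison in the loop relies on arequal-checksum. Where that tool is not at hand, a rough stand-in can be sketched with POSIX cksum. This is only an illustration: it does not reproduce arequal's algorithm, it covers regular-file contents and relative paths only (no ownership, modes, or xattrs), and tree_checksum and trees_match are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a tree comparison; NOT arequal's algorithm.
# Prints one checksum over the relative paths and contents of all
# regular files under directory $1.
tree_checksum() {
    ( cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
          printf '%s ' "$f"      # include the relative path
          cksum < "$f"           # CRC and byte count of the contents
      done ) | cksum
}

# Succeeds (exit 0) when both trees produce the same checksum.
trees_match() {
    [ "$(tree_checksum "$1")" = "$(tree_checksum "$2")" ]
}
```

With the mounts from the steps above, this would be called as trees_match /mnt/master /mnt/slave after giving geo-replication time to sync.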
 
How reproducible:
==============
Seen twice in 4 trials on a non-root setup.

Comment 1 Worker Ant 2017-10-10 14:48:22 UTC

REVIEW: https://review.gluster.org/18479 (geo-rep: Filter out volume-mark xattr) posted (#1) for review on master by Kotresh HR (khiremat)

Comment 2 Worker Ant 2017-10-13 16:26:09 UTC

COMMIT: https://review.gluster.org/18479 committed in master by Jeff Darcy (jeff.us) 
------
commit c64fd0d4b0ef313bb44aae68a376ec0c9ee8657a
Author: Kotresh HR <khiremat>
Date:   Tue Oct 10 10:27:01 2017 -0400

    geo-rep: Filter out volume-mark xattr
    
    The volume-mark xattr, maintained at brick root
    of slave volume is specific to geo-replication
    and should be filtered out for all other clients.
    It should also be filtered out from list getxattr
    from all mounts including geo-rep mount as it
    might cause rsync to read and set.
    
    Change-Id: If9eb5a3af18051083c853e70d93b2819e8eea222
    BUG: 1500433
    Signed-off-by: Kotresh HR <khiremat>
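
The filtering idea in the commit message can be illustrated outside GlusterFS. The actual change lives in the GlusterFS sources under review 18479 and is not reproduced here. The sketch below only shows, under that caveat, dropping names that begin with trusted.glusterfs.volume-mark from a list of xattr names given one per line on stdin; filter_volume_mark is a hypothetical name.

```shell
#!/usr/bin/env bash
# Illustration only, not the GlusterFS patch.
# Reads xattr names on stdin (one per line) and prints every name
# except the geo-replication volume-mark xattr, with or without a
# ".<uuid>" suffix.
filter_volume_mark() {
    grep -v -E '^trusted\.glusterfs\.volume-mark(\.|$)'
}
```

With such a filter applied to the listing, a consumer that copies every listed xattr (as rsync does when xattr syncing is enabled) would never try to read the volume-mark key.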

Comment 3 Shyamsundar 2017-12-08 17:43:44 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/
