+++ This bug was initially created as a clone of Bug #1375094 +++

Description of problem:
=======================

While running the automation sanity check, which performs "create, chmod, chown, chgrp, symlink, hardlink, rename, truncate, rm" during changelog, xsync and history crawl, the following worker crash was observed:

[2016-09-11 13:52:43.422640] E [syncdutils(/bricks/brick1/master_brick5):276:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 306, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1267, in Xsyncer
    self.Xcrawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1424, in Xcrawl
    self.Xcrawl(e, xtr_root)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1424, in Xcrawl
    self.Xcrawl(e, xtr_root)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1424, in Xcrawl
    self.Xcrawl(e, xtr_root)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1424, in Xcrawl
    self.Xcrawl(e, xtr_root)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1424, in Xcrawl
    self.Xcrawl(e, xtr_root)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1424, in Xcrawl
    self.Xcrawl(e, xtr_root)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1406, in Xcrawl
    gfid = self.master.server.gfid(e)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1414, in gfid
    return super(brickserver, cls).gfid(e)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 327, in ff
    return f(*a)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 369, in gfid
    buf = Xattr.lgetxattr(path, cls.GFID_XATTR, 16)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 55, in lgetxattr
    return cls._query_xattr(path, siz, 'lgetxattr', attr)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 47, in _query_xattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 37, in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 61] No data available
[2016-09-11 13:52:43.428107] I [syncdutils(/bricks/brick1/master_brick5):220:finalize] <top>: exiting.

Version-Release number of selected component (if applicable):
=============================================================

mainline

How reproducible:
=================

Seen once, although the same test suite has been executed multiple times.

Steps:

Can't be very certain, but it occurs somewhere in the following sequence:

1. Perform rm -rf on master. Let it complete on master
2. Check for files between master and slave
3. Files match on master and slave, and arequal matches
4. Set the change_detector to xsync.

It is between steps 2 and 4.

This was caught via the automation health check, which runs the fops in changelog, xsync and history modes one after another.

Slave Log at the same time:

[2016-09-11 13:52:43.433715] I [fuse-bridge.c:5007:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-MvkqZP
[2016-09-11 13:52:43.436595] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7fa62ba77dc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7fa62d0ef915] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7fa62d0ef78b] ) 0-: received signum (15), shutting down
[2016-09-11 13:52:43.436617] I [fuse-bridge.c:5714:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-MvkqZP'.
REVIEW: https://review.gluster.org/18445 (geo-rep: Add ENODATA to retry list on gfid getxattr) posted (#1) for review on master by Kotresh HR (khiremat)
REVIEW: https://review.gluster.org/18445 (geo-rep: Add ENODATA to retry list on gfid getxattr) posted (#2) for review on master by Kotresh HR (khiremat)
COMMIT: https://review.gluster.org/18445 committed in master by Aravinda VK (avishwan)
------
commit b56bdb34dafd1a87c5bbb2c9a75d1a088d82b1f4
Author: Kotresh HR <khiremat>
Date: Fri Oct 6 22:42:43 2017 -0400

    geo-rep: Add ENODATA to retry list on gfid getxattr

    During xsync crawl, the worker occasionally crashed with ENODATA while getting the gfid from the backend. The error is transient, not persistent. A worker restart involves re-processing a few entries in the changelogs, so add ENODATA to the retry list to avoid the worker restart.

    Change-Id: Ib78d1e925c0a83c78746f28f7c79792a327dfd3e
    BUG: 1499391
    Signed-off-by: Kotresh HR <khiremat>
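The retry idea described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: `retry_on_errno`, `make_flaky_reader`, and the `max_tries` and `delay` parameters are hypothetical names chosen for illustration, and the flaky reader only simulates a getxattr call that transiently fails with ENODATA.

```python
import errno
import os
import time


def retry_on_errno(call, args=(), retry_errnos=(errno.ENODATA,),
                   max_tries=5, delay=0.1):
    """Run call(*args); retry if it raises OSError with a listed errno.

    Hypothetical helper: any other errno, or running out of tries,
    re-raises the original exception to the caller.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return call(*args)
        except OSError as ex:
            if ex.errno not in retry_errnos or attempt >= max_tries:
                raise
            time.sleep(delay)  # give the transient condition time to clear


def make_flaky_reader(failures, err=errno.ENODATA):
    """Build a stand-in for a gfid xattr read that fails `failures` times."""
    state = {"calls": 0}

    def read_gfid(path):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise OSError(err, os.strerror(err))
        return b"\x00" * 16  # placeholder for a 16-byte gfid

    read_gfid.state = state
    return read_gfid
```

With this shape, a transient ENODATA is absorbed by the retry loop, while errors outside the retry list still propagate immediately, so genuine failures are not masked.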
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/