Description of problem:
gsyncd worker crashed in syncdutils with "OSError: [Errno 22] Invalid argument"
The following logs are observed:
[2018-05-22 11:59:52.196463] I [master(/rhgs/brick1/data):83:gmaster_builder] <top>: setting up xsync change detection mode
[2018-05-22 11:59:52.197062] I [master(/rhgs/brick1/data):369:__init__] _GMaster: using 'rsync' as the sync engine
[2018-05-22 11:59:52.197985] I [master(/rhgs/brick1/data):83:gmaster_builder] <top>: setting up changelog change detection mode
[2018-05-22 11:59:52.198193] I [master(/rhgs/brick1/data):369:__init__] _GMaster: using 'rsync' as the sync engine
[2018-05-22 11:59:52.198868] I [master(/rhgs/brick1/data):83:gmaster_builder] <top>: setting up changeloghistory change detection mode
[2018-05-22 11:59:52.199088] I [master(/rhgs/brick1/data):369:__init__] _GMaster: using 'rsync' as the sync engine
[2018-05-22 11:59:54.232836] I [monitor(monitor):274:monitor] Monitor: ------------------------------------------------------------
[2018-05-22 11:59:54.233115] I [monitor(monitor):275:monitor] Monitor: starting gsyncd worker
[2018-05-22 11:59:54.273042] I [master(/rhgs/brick1/data):1253:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/upprodvarum/ssh%3A%2F%2Fgeoaccount%4010.127.28.25%3Agluster%3A%2F%2F127.0.0.1%3Aupprodvarum_rep/19b6ede62a50cc554027d6a4416c4fef/xsync
[2018-05-22 11:59:54.273315] I [resource(/rhgs/brick1/data):1533:service_loop] GLUSTER: Register time: 1526990394
[2018-05-22 11:59:54.276341] I [master(/rhgs/brick1/data):512:crawlwrap] _GMaster: primary master with volume id 2fbee41f-9473-47c4-83f7-12b617ef5a4b ...
[2018-05-22 11:59:54.286515] I [master(/rhgs/brick1/data):521:crawlwrap] _GMaster: crawl interval: 1 seconds
[2018-05-22 11:59:54.289128] I [master(/rhgs/brick1/data):468:mgmt_lock] _GMaster: Got lock : /rhgs/brick1/data : Becoming ACTIVE
[2018-05-22 11:59:54.294075] I [master(/rhgs/brick1/data):1167:crawl] _GMaster: starting history crawl... turns: 1, stime: (1512658469, 0), etime: 1526990394
[2018-05-22 11:59:54.346426] I [gsyncd(/rhgs/brick2/data):747:main_i] <top>: syncing: gluster://localhost:upprodvarum -> ssh://geoaccount@gluster11:gluster://localhost:upprodvarum_rep
[2018-05-22 11:59:54.346404] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
[2018-05-22 11:59:55.316481] I [master(/rhgs/brick1/data):1196:crawl] _GMaster: slave's time: (1512658469, 0)
[2018-05-22 11:59:57.385174] E [syncdutils(/rhgs/brick1/data):296:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 204, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 757, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1539, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 573, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1205, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1111, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 994, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 887, in process_change
    rl = errno_wrap(os.readlink, [en], [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 495, in errno_wrap
    return call(*arg)
OSError: [Errno 22] Invalid argument: '.gfid/b2464acd-855c-42f0-8a4a-f8dffad6cae9/269
Version-Release number of selected component (if applicable):
RHGS 3.2 on RHEL-7 Async
Actual results:
The gsyncd worker crashes in syncdutils with OSError [Errno 22], leaving the geo-replication session in a Faulty state.
Expected results:
The worker should handle the error gracefully instead of crashing, and the session should not go Faulty.
Additional info:
Package Version : glusterfs-3.8.4-18.el7rhgs.x86_64
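The EINVAL most likely comes from os.readlink() being called on a path that exists but is not a symlink: on Linux, readlink(2) returns EINVAL for such a path, and the errno_wrap call in the traceback only tolerates ENOENT and ESTALE, so EINVAL propagates and kills the worker. A minimal sketch of this behavior and of a tolerant wrapper follows; safe_readlink and its ignore list are illustrative, not the actual gsyncd fix:

```python
import errno
import os
import tempfile

# Reproduce the errno from the traceback: readlink() on a regular
# (non-symlink) file raises OSError with errno EINVAL on Linux.
with tempfile.NamedTemporaryFile() as f:
    try:
        os.readlink(f.name)
    except OSError as e:
        assert e.errno == errno.EINVAL

# A wrapper in the spirit of syncdutils.errno_wrap that also treats
# EINVAL as ignorable, so a non-symlink entry is skipped rather than
# crashing the worker (names here are hypothetical):
def safe_readlink(path, ignore=(errno.ENOENT, errno.ESTALE, errno.EINVAL)):
    """Return the symlink target, or None for ignorable errors."""
    try:
        return os.readlink(path)
    except OSError as e:
        if e.errno in ignore:
            return None
        raise
```

With such a wrapper, process_change() would receive None for the entry and could skip it, instead of the whole crawl aborting.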
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:2222