Bug 1590774 - [GSS] gsyncd worker crashed in syncdutils with "OSError: [Errno 22] Invalid argument"
Summary: [GSS] gsyncd worker crashed in syncdutils with "OSError: [Errno 22] Invalid argument"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.3.1 Async
Assignee: Kotresh HR
QA Contact: Rochelle
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-06-13 11:37 UTC by Sonal
Modified: 2021-12-10 16:21 UTC

Fixed In Version: glusterfs-3.8.4-54.15
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-19 06:00:07 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2222 0 None None None 2018-07-19 06:01:41 UTC

Description Sonal 2018-06-13 11:37:25 UTC
Description of problem:
gsyncd worker crashed in syncdutils with "OSError: [Errno 22] Invalid argument"

The following logs are observed:

[2018-05-22 11:59:52.196463] I [master(/rhgs/brick1/data):83:gmaster_builder] <top>: setting up xsync change detection mode
[2018-05-22 11:59:52.197062] I [master(/rhgs/brick1/data):369:__init__] _GMaster: using 'rsync' as the sync engine
[2018-05-22 11:59:52.197985] I [master(/rhgs/brick1/data):83:gmaster_builder] <top>: setting up changelog change detection mode
[2018-05-22 11:59:52.198193] I [master(/rhgs/brick1/data):369:__init__] _GMaster: using 'rsync' as the sync engine
[2018-05-22 11:59:52.198868] I [master(/rhgs/brick1/data):83:gmaster_builder] <top>: setting up changeloghistory change detection mode
[2018-05-22 11:59:52.199088] I [master(/rhgs/brick1/data):369:__init__] _GMaster: using 'rsync' as the sync engine
[2018-05-22 11:59:54.232836] I [monitor(monitor):274:monitor] Monitor: ------------------------------------------------------------
[2018-05-22 11:59:54.233115] I [monitor(monitor):275:monitor] Monitor: starting gsyncd worker
[2018-05-22 11:59:54.273042] I [master(/rhgs/brick1/data):1253:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/upprodvarum/ssh%3A%2F%2Fgeoaccount%4010.127.28.25%3Agluster%3A%2F%2F127.0.0.1%3Aupprodvarum_rep/19b6ede62a50cc554027d6a4416c4fef/xsync
[2018-05-22 11:59:54.273315] I [resource(/rhgs/brick1/data):1533:service_loop] GLUSTER: Register time: 1526990394
[2018-05-22 11:59:54.276341] I [master(/rhgs/brick1/data):512:crawlwrap] _GMaster: primary master with volume id 2fbee41f-9473-47c4-83f7-12b617ef5a4b ...
[2018-05-22 11:59:54.286515] I [master(/rhgs/brick1/data):521:crawlwrap] _GMaster: crawl interval: 1 seconds
[2018-05-22 11:59:54.289128] I [master(/rhgs/brick1/data):468:mgmt_lock] _GMaster: Got lock : /rhgs/brick1/data : Becoming ACTIVE
[2018-05-22 11:59:54.294075] I [master(/rhgs/brick1/data):1167:crawl] _GMaster: starting history crawl... turns: 1, stime: (1512658469, 0), etime: 1526990394
[2018-05-22 11:59:54.346426] I [gsyncd(/rhgs/brick2/data):747:main_i] <top>: syncing: gluster://localhost:upprodvarum -> ssh://geoaccount@gluster11:gluster://localhost:upprodvarum_rep
[2018-05-22 11:59:54.346404] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
[2018-05-22 11:59:55.316481] I [master(/rhgs/brick1/data):1196:crawl] _GMaster: slave's time: (1512658469, 0)
[2018-05-22 11:59:57.385174] E [syncdutils(/rhgs/brick1/data):296:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 204, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 757, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1539, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 573, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1205, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1111, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 994, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 887, in process_change
    rl = errno_wrap(os.readlink, [en], [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 495, in errno_wrap
    return call(*arg)
OSError: [Errno 22] Invalid argument: '.gfid/b2464acd-855c-42f0-8a4a-f8dffad6cae9/269
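
The traceback shows that `errno_wrap` tolerates only ENOENT (and retries on ESTALE), so the EINVAL raised by `os.readlink()` on a non-symlink `.gfid` entry propagates and kills the worker. The following is a minimal, hypothetical sketch of an `errno_wrap`-style helper (modeled on `syncdutils.errno_wrap`, not the actual shipped code or patch) illustrating the failure mode and how also tolerating EINVAL would turn the crash into a skipped entry:

```python
import errno
import os
import time

def errno_wrap(call, args=(), tolerate=(), retry_errnos=()):
    """Hypothetical sketch of an errno_wrap-style helper: run `call`,
    swallow errnos listed in `tolerate` (returning None), retry briefly
    on errnos in `retry_errnos`, and re-raise everything else -- which
    is how an unexpected EINVAL from os.readlink() can escape and put
    the geo-replication session into a Faulty state."""
    retries = 0
    while True:
        try:
            return call(*args)
        except OSError as e:
            if e.errno in tolerate:
                return None          # benign: caller treats the entry as gone
            if e.errno in retry_errnos and retries < 5:
                retries += 1
                time.sleep(0.1)      # transient error (e.g. ESTALE): retry
                continue
            raise                    # anything else propagates -> worker crash

# os.readlink() on a path that is not a symlink (here "/", a directory)
# fails with EINVAL on Linux; adding EINVAL to the tolerated list makes
# the helper return None instead of raising.
result = errno_wrap(os.readlink, ["/"],
                    [errno.ENOENT, errno.EINVAL], [errno.ESTALE])
```

With only `[errno.ENOENT]` tolerated, the same call would re-raise `OSError: [Errno 22] Invalid argument`, matching the crash in the log above.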

Version-Release number of selected component (if applicable):
RHGS 3.2 on RHEL-7 Async


Actual results:
The worker crashes in syncdutils, causing the geo-replication session to enter a Faulty state.

Expected results:
The worker should handle the error gracefully instead of crashing.

Additional info:
Package Version : glusterfs-3.8.4-18.el7rhgs.x86_64

Comment 34 Bipin Kunal 2018-07-12 10:15:51 UTC
Removing stale needinfo on me.

Comment 43 errata-xmlrpc 2018-07-19 06:00:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2222

