1046604 – geo-replication fails with OSError when setting remote xtime

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	geo-replication
Sub Component:
Version:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	RHGS 2.1.2
Assignee:	Aravinda VK
QA Contact:	M S Vishwanath Bhat
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1073844

Reported:	2013-12-26 09:52 UTC by Aravinda VK
Modified:	2018-12-09 17:23 UTC
CC List:	10 users
Fixed In Version:	glusterfs-3.4.0.53rhs
Doc Type:	Bug Fix
Doc Text:	Previously, a failure while setting the remote xtime raised a Python exception and produced a backtrace. This caused the Geo-replication worker process to restart with 'faulty' status. With this fix, a Python exception is no longer raised when setting the remote xtime fails, and the Geo-replication worker process works as expected.
Clone Of:
Clones:	1073844
Environment:
Last Closed:	2014-02-25 08:13:01 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2014:0208	0	normal	SHIPPED_LIVE	Red Hat Storage 2.1 enhancement and bug fix update #2	2014-02-25 12:20:30 UTC

Description Aravinda VK 2013-12-26 09:52:27 UTC

Description of problem:
I installed release 51 and started the geo-rep process and I am seeing:

[2013-12-20 15:22:52.211688] W [master(/data/master/dp-vol):253:regjob] _GMaster: Rsync: .gfid/90c5b4fc-b7a3-4d73-a2ac-1421878a8e3f [errcode: 23]
[2013-12-20 15:22:52.212217] W [master(/data/master/dp-vol):883:process] _GMaster: incomplete sync, retrying changelogs: XSYNC-CHANGELOG.1387578655
[2013-12-20 15:29:34.180546] I [master(/data/master/dp-vol):1138:crawl] _GMaster: processing xsync changelog /var/run/gluster/dp-vol/ssh%3A%2F%2Froot%4010.145.74.241%3Agluster%3A%2F%2F127.0.0.1%3Adp-vol1/c0e8e929978b0e1a0fa2511da0bdc805/xsync/XSYNC-CHANGELOG.1387578659
[2013-12-20 15:36:25.599109] E [repce(/data/master/dp-vol):188:__call__] RepceClient: call 8647:140093031302912:1387582585.55 (set_xtime_remote) failed on peer with OSError
[2013-12-20 15:36:25.599637] E [syncdutils(/data/master/dp-vol):207:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 540, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1156, in service_loop
    g1.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 473, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1142, in crawl
    self.upd_stime(item[1][1], item[1][0])
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 890, in upd_stime
    self.sendmark(path, stime)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 664, in sendmark
    self.set_slave_xtime(path, mark)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 149, in set_slave_xtime
    self.slave.server.set_xtime_remote(path, self.uuid, mark)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 204, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 189, in __call__
    raise res
OSError: [Errno 2] No such file or directory
[2013-12-20 15:36:25.630029] I [syncdutils(/data/master/dp-vol):159:finalize] <top>: exiting.
[2013-12-20 15:36:25.648134] I [monitor(monitor):81:set_state] Monitor: new state: faulty
[2013-12-20 15:36:35.659291] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------
[2013-12-20 15:36:35.659557] I [monitor(monitor):130:monitor] Monitor: starting gsyncd worker
[2013-12-20 15:36:35.835516] I [gsyncd(/data/master/dp-vol):530:main_i] <top>: syncing: gluster://localhost:dp-vol ->ssh://root.intuit.net:gluster://localhost:dp-vol1
[2013-12-20 15:36:39.495806] I [master(/data/master/dp-vol):58:gmaster_builder] <top>: setting up xsync change detection mode
[2013-12-20 15:36:39.496217] I [master(/data/master/dp-vol):363:__init__] _GMaster: using 'rsync' as the sync engine
[2013-12-20 15:36:39.497653] I [master(/data/master/dp-vol):58:gmaster_builder] <top>: setting up changelog change detection mode
[2013-12-20 15:36:39.497888] I [master(/data/master/dp-vol):363:__init__] _GMaster: using 'rsync' as the sync engine
[2013-12-20 15:36:39.499078] I [master(/data/master/dp-vol):1108:register] _GMaster: xsync temp directory: /var/run/gluster/dp-vol/ssh
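
The traceback above shows an OSError with ENOENT from set_xtime_remote propagating up to main() and ending the worker. The actual patch is not included in this report, so the following is only a minimal sketch of the behaviour described in the Doc Text: tolerate a missing path when setting the xtime instead of letting the exception escape. The names set_xtime_tolerant and setter are hypothetical and do not come from the gsyncd sources; the sketch is written for Python 3.

```python
import errno
import logging


def set_xtime_tolerant(path, setter, tolerated=(errno.ENOENT,)):
    """Call setter(path) and report whether it succeeded.

    Hypothetical helper, not part of gsyncd. `setter` stands in for
    whatever call actually sets the xtime on the remote side.
    Returns True on success and False when the call fails with one of
    the tolerated errno values; any other OSError is re-raised.
    """
    try:
        setter(path)
        return True
    except OSError as e:
        if e.errno in tolerated:
            # The entry may have disappeared during the sync; log the
            # failure and continue rather than crashing the worker.
            logging.warning("failed to set xtime on %s: %s", path, e)
            return False
        raise
```

With a wrapper of this kind, a path that no longer exists produces a warning in the log, while unexpected errors such as EACCES still surface as exceptions.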

Comment 2 Nagaprasad Sathyanarayana 2013-12-26 10:28:30 UTC

https://code.engineering.redhat.com/gerrit/#/c/17885/ -> U1
https://code.engineering.redhat.com/gerrit/#/c/17887/ -> U2

Comment 4 M S Vishwanath Bhat 2014-01-02 15:02:23 UTC

Aravinda: Do you have the steps to reproduce this? I tried with the 51geo version and couldn't reproduce it.

Comment 5 Venky Shankar 2014-01-07 12:29:42 UTC

MZ,

This was observed once in Neependra's setup and once at Intuit. It was not reproducible after that. It was observed in the midst of a normal sync operation.

Comment 6 M S Vishwanath Bhat 2014-01-13 06:33:49 UTC

I was not able to reproduce this even after 2-3 tries. The behaviour was the same with or without the patch. Since I'm not hitting the issue, I will move this to verified. Please re-open if it is seen again.

Tested in Version: glusterfs-3.4.0.53rhs-1.el6rhs.x86_64.rpm

Comment 7 Pavithra 2014-01-15 05:31:55 UTC

Aravinda,

Can you please verify if the edited doc text is technically correct?

Comment 8 Aravinda VK 2014-01-15 05:52:02 UTC

doc text looks good to me.

Comment 10 errata-xmlrpc 2014-02-25 08:13:01 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html
