Bug 1046604 - geo-replication fails with OSError when setting remote xtime
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: Aravinda VK
QA Contact: M S Vishwanath Bhat
URL:
Whiteboard:
Depends On:
Blocks: 1073844
Reported: 2013-12-26 09:52 UTC by Aravinda VK
Modified: 2018-12-09 17:23 UTC

Fixed In Version: glusterfs-3.4.0.53rhs
Doc Type: Bug Fix
Doc Text:
Previously, a failure to set the remote xtime raised a Python exception with a backtrace. This caused the Geo-replication worker process to restart with 'faulty' status. With this fix, a failure to set the remote xtime no longer raises a Python exception, and the Geo-replication worker process works as expected.
Clone Of:
Clones: 1073844
Environment:
Last Closed: 2014-02-25 08:13:01 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2014:0208 | 0 | normal | SHIPPED_LIVE | Red Hat Storage 2.1 enhancement and bug fix update #2 | 2014-02-25 12:20:30 UTC

Description Aravinda VK 2013-12-26 09:52:27 UTC
Description of problem:
I installed release 51 and started the geo-rep process and I am seeing:

[2013-12-20 15:22:52.211688] W [master(/data/master/dp-vol):253:regjob] _GMaster: Rsync: .gfid/90c5b4fc-b7a3-4d73-a2ac-1421878a8e3f [errcode: 23]
[2013-12-20 15:22:52.212217] W [master(/data/master/dp-vol):883:process] _GMaster: incomplete sync, retrying changelogs: XSYNC-CHANGELOG.1387578655
[2013-12-20 15:29:34.180546] I [master(/data/master/dp-vol):1138:crawl] _GMaster: processing xsync changelog /var/run/gluster/dp-vol/ssh%3A%2F%2Froot%4010.145.74.241%3Agluster%3A%2F%2F127.0.0.1%3Adp-vol1/c0e8e929978b0e1a0fa2511da0bdc805/xsync/XSYNC-CHANGELOG.1387578659
[2013-12-20 15:36:25.599109] E [repce(/data/master/dp-vol):188:__call__] RepceClient: call 8647:140093031302912:1387582585.55 (set_xtime_remote) failed on peer with OSError
[2013-12-20 15:36:25.599637] E [syncdutils(/data/master/dp-vol):207:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 540, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1156, in service_loop
    g1.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 473, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1142, in crawl
    self.upd_stime(item[1][1], item[1][0])
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 890, in upd_stime
    self.sendmark(path, stime)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 664, in sendmark
    self.set_slave_xtime(path, mark)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 149, in set_slave_xtime
    self.slave.server.set_xtime_remote(path, self.uuid, mark)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 204, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 189, in __call__
    raise res
OSError: [Errno 2] No such file or directory
[2013-12-20 15:36:25.630029] I [syncdutils(/data/master/dp-vol):159:finalize] <top>: exiting.
[2013-12-20 15:36:25.648134] I [monitor(monitor):81:set_state] Monitor: new state: faulty
[2013-12-20 15:36:35.659291] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------
[2013-12-20 15:36:35.659557] I [monitor(monitor):130:monitor] Monitor: starting gsyncd worker
[2013-12-20 15:36:35.835516] I [gsyncd(/data/master/dp-vol):530:main_i] <top>: syncing: gluster://localhost:dp-vol ->ssh://root.intuit.net:gluster://localhost:dp-vol1
[2013-12-20 15:36:39.495806] I [master(/data/master/dp-vol):58:gmaster_builder] <top>: setting up xsync change detection mode
[2013-12-20 15:36:39.496217] I [master(/data/master/dp-vol):363:__init__] _GMaster: using 'rsync' as the sync engine
[2013-12-20 15:36:39.497653] I [master(/data/master/dp-vol):58:gmaster_builder] <top>: setting up changelog change detection mode
[2013-12-20 15:36:39.497888] I [master(/data/master/dp-vol):363:__init__] _GMaster: using 'rsync' as the sync engine
[2013-12-20 15:36:39.499078] I [master(/data/master/dp-vol):1108:register] _GMaster: xsync temp directory: /var/run/gluster/dp-vol/ssh
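
For illustration only: the traceback above shows an OSError with ENOENT from set_xtime_remote propagating up to main and taking the worker down. The sketch below shows one way such a failure could be tolerated, by treating a missing path as a logged warning rather than a fatal error. This is not the actual patch shipped in glusterfs-3.4.0.53rhs. The function name set_xtime_tolerant and its setter parameter are hypothetical and do not come from the gsyncd sources. The default setter, os.setxattr, is only available on Linux.

```python
import errno
import logging
import os


def set_xtime_tolerant(path, key, value, setter=None):
    """Set an xtime-style marker on `path`.

    Hypothetical helper, not part of gsyncd. Returns True on success and
    False when the path no longer exists (ENOENT). Any other OSError is
    re-raised so that real problems still surface.
    """
    if setter is None:
        # Assumption: extended attributes hold the marker; Linux-only call.
        setter = os.setxattr
    try:
        setter(path, key, value)
        return True
    except OSError as e:
        if e.errno == errno.ENOENT:
            # The entry vanished between crawl and update: log and carry on
            # instead of letting the exception kill the worker.
            logging.warning("could not set %s on %s: %s", key, path, e)
            return False
        raise
```

A caller would check the return value and continue with the next entry. Errors other than ENOENT (for example EACCES) are deliberately left fatal in this sketch.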

Comment 4 M S Vishwanath Bhat 2014-01-02 15:02:23 UTC
Aravinda: Do you have the steps to reproduce this? I tried with the 51geo version and couldn't reproduce it.

Comment 5 Venky Shankar 2014-01-07 12:29:42 UTC
MZ,

This was once observed in Neependra's setup and at Intuit. It was not reproducible after that. It was observed in the midst of a normal sync operation.

Comment 6 M S Vishwanath Bhat 2014-01-13 06:33:49 UTC
I was not able to reproduce this even after 2-3 tries. The behaviour was the same with or without the patch. Since I'm not hitting the issue, I will move this to verified. Please re-open if seen again.

Tested in Version: glusterfs-3.4.0.53rhs-1.el6rhs.x86_64.rpm

Comment 7 Pavithra 2014-01-15 05:31:55 UTC
Aravinda,

Can you please verify if the edited doc text is technically correct?

Comment 8 Aravinda VK 2014-01-15 05:52:02 UTC
Doc text looks good to me.

Comment 10 errata-xmlrpc 2014-02-25 08:13:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html

