Red Hat Bugzilla – Bug 980117
Dist-geo-rep: Geo-rep status of replica pairs goes to faulty intermittently.
Last modified: 2014-08-24 20:50:02 EDT
Description of problem: When a geo-rep session is started with a dist-rep master volume, the status of the other replica pair intermittently goes to faulty. The affected pair is the one whose gsync is idle, i.e. the one from which no syncing is happening.
This is an excerpt from the idle gsync log file:
[2013-07-01 18:24:03.26717] D [master(/bricks/brick2):757:volinfo_state_machine] <top>: (None, f92305f7) << (None, f92305f7) -> (None, f92305f7)
[2013-07-01 18:24:06.538655] E [syncdutils(/bricks/brick2):189:log_raise_exception] <top>: connection to peer is broken
[2013-07-01 18:24:06.541389] E [syncdutils(/bricks/brick2):206:log_raise_exception] <top>: FULL EXCEPTION TRACE:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 232, in twrap
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 157, in listen
rid, exc, res = recv(self.inf)
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 48, in recv
[2013-07-01 18:24:06.543339] I [syncdutils(/bricks/brick2):158:finalize] <top>: exiting.
[2013-07-01 18:24:06.551038] I [monitor(monitor):81:set_state] Monitor: new state: faulty,
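Because the failure is intermittent, it can help to scan the gsync logs for the moment the monitor flips to faulty. The following is a minimal sketch, not part of gsyncd: the line layout (`[timestamp] LEVEL [module(context):line:function] source: message`) is inferred only from the excerpt above, and the names `parse_line` and `faulty_transitions` are hypothetical helpers introduced for illustration.

```python
import re

# Layout inferred from the excerpt above (an assumption, not a documented format):
#   [timestamp] LEVEL [module(context):lineno:function] source: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<module>\w+)\((?P<ctx>[^)]*)\):(?P<lineno>\d+):(?P<func>\w+)\]\s+"
    r"(?P<source>[^:]+):\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match
    (for example, a traceback continuation line)."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def faulty_transitions(lines):
    """Return the timestamps of every 'new state: faulty' message logged by set_state."""
    timestamps = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if record["func"] == "set_state" and "new state: faulty" in record["msg"]:
            timestamps.append(record["ts"])
    return timestamps


# Lines copied from the excerpt above.
SAMPLE = [
    "[2013-07-01 18:24:06.538655] E [syncdutils(/bricks/brick2):189:log_raise_exception] <top>: connection to peer is broken",
    "Traceback (most recent call last):",
    "[2013-07-01 18:24:06.543339] I [syncdutils(/bricks/brick2):158:finalize] <top>: exiting.",
    "[2013-07-01 18:24:06.551038] I [monitor(monitor):81:set_state] Monitor: new state: faulty,",
]

if __name__ == "__main__":
    for ts in faulty_transitions(SAMPLE):
        print("monitor went faulty at", ts)
```

Running the same helper over a whole log file (for example, `faulty_transitions(open(path))` with a log path of your choosing) gives a quick count of how often the idle replica drops to faulty during a test run.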
Version-Release number of selected component (if applicable): glusterfs-220.127.116.11rhs.beta1-1.el6rhs.x86_64
How reproducible: Intermittent
Steps to Reproduce:
1. Create and start a geo-rep session with a dist-rep master and slave.
2. Create a lot of data on the master, for example by untarring a kernel.
3. Check the status of the geo-rep session.
Actual results: Sometimes the geo-rep status of the replica pairs goes to faulty.
Expected results: Status should be stable.
I have a fix for this. Will send out the patch soon.
*** Bug 980734 has been marked as a duplicate of this bug. ***
Verified on glusterfs-18.104.22.168rhs-1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.