Description of problem:
Dist-geo-rep: errors in the log related to syncdutils.py and monitor.py (status is Stable though).

Version-Release number of selected component (if applicable):
3.4.0.24rhs-1.el6rhs.x86_64

How reproducible:
Not tried.

Steps to Reproduce:
1. Had a dist-rep volume and created data on it.
2. Created a geo-rep session between the master and slave clusters.

[root@4DVM5 ~]# gluster v info
Volume Name: 4_master1
Type: Distributed-Replicate
Volume ID: 6b520d9e-3370-4b57-9cf1-e6478e5bcfec
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.37.110:/rhs/brick1/1
Brick2: 10.70.37.81:/rhs/brick1/1
Brick3: 10.70.37.110:/rhs/brick2/1
Brick4: 10.70.37.81:/rhs/brick2/1
Brick5: 10.70.37.110:/rhs/brick3/1
Brick6: 10.70.37.81:/rhs/brick3/1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on

3. Status was stable, but the error below was seen in the log (seen only once).

Actual results:
Log snippet:

[root@4DVM5 ~]# ifconfig | grep inet
inet addr:10.70.37.81 Bcast:10.70.37.255 Mask:255.255.254.0
inet addr:127.0.0.1 Mask:255.0.0.0

[root@4DVM5 ~]# less /var/log/glusterfs/geo-replication/4_master1/ssh%3A%2F%2Froot%4010.70.37.1%3Agluster%3A%2F%2F127.0.0.1%3A4_slave1.log
[2013-08-30 06:37:13.278582] I [master(/rhs/brick1/1):345:crawlwrap] _GMaster: crawl interval: 60 seconds
[2013-08-30 06:37:13.445154] I [master(/rhs/brick3/1):335:crawlwrap] _GMaster: primary master with volume id 6b520d9e-3370-4b57-9cf1-e6478e5bcfec ...
[2013-08-30 06:37:13.449449] I [master(/rhs/brick3/1):345:crawlwrap] _GMaster: crawl interval: 60 seconds
[2013-08-30 06:38:10.259210] E [syncdutils(monitor):206:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 232, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 203, in wmon
    cpid, _ = self.monitor(w, argv, cpids)
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 161, in monitor
    self.terminate()
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 89, in terminate
    set_term_handler(lambda *a: set_term_handler())
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 298, in set_term_handler
    signal(SIGTERM, hook)
ValueError: signal only works in main thread
[2013-08-30 06:38:10.282051] I [syncdutils(monitor):158:finalize] <top>: exiting.
[2013-08-30 06:38:13.348395] I [master(/rhs/brick1/1):358:crawlwrap] _GMaster: 0 crawls, 0 turns
[2013-08-30 06:38:13.348732] I [master(/rhs/brick2/1):358:crawlwrap] _GMaster: 0 cr

Expected results:

Additional info:
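For context on the root cause: the ValueError in the traceback comes from CPython's restriction that signal handlers may only be installed from the main thread, and set_term_handler() here is being called from a monitor worker thread. A minimal standalone sketch (independent of the gluster code, names chosen for illustration) reproduces the same error:

```python
import signal
import threading

# CPython only allows signal.signal() to be called from the main thread.
# Calling it from a worker thread raises ValueError, which is what
# set_term_handler() hits when invoked via the monitor thread above.

result = {}

def install_handler():
    """Hypothetical worker that tries to install a SIGTERM handler."""
    try:
        signal.signal(signal.SIGTERM, lambda signum, frame: None)
        result["error"] = None
    except ValueError as e:
        # Expected: message mentioning "main thread"
        result["error"] = str(e)

t = threading.Thread(target=install_handler)
t.start()
t.join()

print(result["error"])
```

The fix for this class of bug is typically to install (or re-install) signal handlers only from the main thread, and have worker threads request termination through another mechanism (e.g. an event or pipe) instead of touching signal() directly.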
Similar to BZ 1044420
Carried out the steps mentioned in the description, and also ran the geo-rep automation with both rsync and tarssh, over FUSE and NFS mounts, on build glusterfs-3.7.1-9.el6rhs.x86_64. Did not observe this issue; something similar was verified under bz 1044420. Hence moving this bug to Verified as well. Will file a new bug or reopen this one if the traceback is seen again, with proper steps for reproduction.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html