Bug 1002991

Summary: Dist-geo-rep: errors in log related to syncdutils.py and monitor.py (status is Stable though)
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Rachana Patel <racpatel>
Component: geo-replication Assignee: Aravinda VK <avishwan>
Status: CLOSED ERRATA QA Contact: Rahul Hinduja <rhinduja>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1 CC: aavati, annair, avishwan, csaba, mzywusko, rhinduja, rhs-bugs, vagarwal
Target Milestone: ---   
Target Release: RHGS 3.1.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: upstream_merged
Fixed In Version: glusterfs-3.7.0-2.el6rhs Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-07-29 04:28:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1202842, 1223636    

Description Rachana Patel 2013-08-30 12:35:09 UTC
Description of problem:
Dist-geo-rep: errors in log related to syncdutils.py and monitor.py (status is Stable though)

Version-Release number of selected component (if applicable):
3.4.0.24rhs-1.el6rhs.x86_64

How reproducible:
not tried

Steps to Reproduce:
1. Had a distributed-replicate volume and created data on it.
- Created a geo-replication session between the master and slave clusters:
[root@4DVM5 ~]# gluster v info
 
Volume Name: 4_master1
Type: Distributed-Replicate
Volume ID: 6b520d9e-3370-4b57-9cf1-e6478e5bcfec
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.37.110:/rhs/brick1/1
Brick2: 10.70.37.81:/rhs/brick1/1
Brick3: 10.70.37.110:/rhs/brick2/1
Brick4: 10.70.37.81:/rhs/brick2/1
Brick5: 10.70.37.110:/rhs/brick3/1
Brick6: 10.70.37.81:/rhs/brick3/1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on

- Status was Stable, but the error below appeared in the log (observed only once).


Actual results:
log snippet:
[root@4DVM5 ~]# ifconfig | grep inet
          inet addr:10.70.37.81  Bcast:10.70.37.255  Mask:255.255.254.0
          inet addr:127.0.0.1  Mask:255.0.0.0
[root@4DVM5 ~]# less /var/log/glusterfs/geo-replication/4_master1/ssh%3A%2F%2Froot%4010.70.37.1%3Agluster%3A%2F%2F127.0.0.1%3A4_slave1.log


[2013-08-30 06:37:13.278582] I [master(/rhs/brick1/1):345:crawlwrap] _GMaster: crawl interval: 60 seconds
[2013-08-30 06:37:13.445154] I [master(/rhs/brick3/1):335:crawlwrap] _GMaster: primary master with volume id 6b520d9e-3370-4b57-9cf1-e6478e5bcfec ...
[2013-08-30 06:37:13.449449] I [master(/rhs/brick3/1):345:crawlwrap] _GMaster: crawl interval: 60 seconds
[2013-08-30 06:38:10.259210] E [syncdutils(monitor):206:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 232, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 203, in wmon
    cpid, _ = self.monitor(w, argv, cpids)
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 161, in monitor
    self.terminate()
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 89, in terminate
    set_term_handler(lambda *a: set_term_handler())
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 298, in set_term_handler
    signal(SIGTERM, hook)
ValueError: signal only works in main thread
[2013-08-30 06:38:10.282051] I [syncdutils(monitor):158:finalize] <top>: exiting.
[2013-08-30 06:38:13.348395] I [master(/rhs/brick1/1):358:crawlwrap] _GMaster: 0 crawls, 0 turns
[2013-08-30 06:38:13.348732] I [master(/rhs/brick2/1):358:crawlwrap] _GMaster: 0 cr

Expected results:
No errors or tracebacks related to syncdutils.py or monitor.py should appear in the geo-replication log while the session status is Stable.

Additional info:
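The traceback is CPython enforcing that signal handlers may only be installed from the main thread: syncdutils.set_term_handler() calls signal(SIGTERM, hook), but the twrap/wmon frames show monitor.terminate() invoking it from a monitor worker thread, hence the ValueError. A minimal standalone sketch (illustrative only, not the glusterfs code) that reproduces the same failure:

# Demonstrates why registering a signal handler from a worker thread fails:
# signal.signal() may only be called from the main thread.
import signal
import threading

def install_term_handler():
    # Roughly what a set_term_handler-style helper does: register SIGTERM.
    signal.signal(signal.SIGTERM, lambda signum, frame: None)

# Succeeds: we are on the main thread here.
install_term_handler()

def worker():
    try:
        # Fails: same call, but from a non-main thread.
        install_term_handler()
    except ValueError as err:
        print("worker thread:", err)  # e.g. "signal only works in main thread"

t = threading.Thread(target=worker)
t.start()
t.join()

On the Python 2 interpreter these builds shipped with, the message is exactly "signal only works in main thread"; recent Python 3 releases word it slightly differently, but the restriction is the same.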

Comment 10 Aravinda VK 2015-07-14 06:22:40 UTC
Similar to BZ 1044420
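For reference, a common pattern for avoiding this class of failure is to install the SIGTERM handler only when running on the main thread and give worker threads a separate, thread-safe way to request termination. The sketch below is hedged and illustrative only (Python 3 syntax, hypothetical names), not the actual change that landed in glusterfs-3.7.0-2.el6rhs:

import os
import signal
import threading

def set_term_handler_safe(hook=signal.SIG_DFL):
    # signal.signal() raises ValueError off the main thread, so guard the call.
    if threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGTERM, hook)

def request_termination():
    # Safe from any thread: deliver SIGTERM to our own process so the handler
    # installed by the main thread (or the default disposition) runs there.
    os.kill(os.getpid(), signal.SIGTERM)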

Comment 11 Rahul Hinduja 2015-07-16 12:20:31 UTC
Carried out the steps mentioned in the description and also ran the geo-rep automation, both with rsync and tarssh, over FUSE and NFS mounts with the build glusterfs-3.7.1-9.el6rhs.x86_64.

Did not observe this issue, and something similar was verified with bz 1044420. Hence moving this bug to Verified as well. Will file a new bug or reopen this one if the traceback is seen again with proper steps for reproduction.

Comment 13 errata-xmlrpc 2015-07-29 04:28:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html