Bug 1122502

Summary:	[Dist-geo-rep] : after changelog disable during geo-rep, if worker restarts it keeps crashing with "AttributeError: '_GMaster' object has no attribute 'changelog_register_time'"
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Vijaykumar Koppad <vkoppad>
Component:	geo-replication	Assignee:	Aravinda VK <avishwan>
Status:	CLOSED ERRATA	QA Contact:	Bhaskar Bandari <bbandari>
Severity:	high	Docs Contact:
Priority:	high
Version:	rhgs-3.0	CC:	aavati, ajha, avishwan, bbandari, csaba, david.macdonald, nlevinki, ssamanta
Target Milestone:	---
Target Release:	RHGS 3.0.0
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	glusterfs-3.6.0.27-1	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-09-22 19:44:55 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1111577
Bug Blocks:

Description Vijaykumar Koppad 2014-07-23 11:55:46 UTC

Description of problem: After changelog is disabled during geo-rep, a worker that restarts keeps crashing with "AttributeError: '_GMaster' object has no attribute 'changelog_register_time'". This happened when changelog was disabled while hardlinks were being created and synced in a geo-rep setup. The worker first crashed with the following traceback; that issue is being tracked in Bug 1098426.
===========================================================================
[2014-07-23 16:34:09.840315] E [syncdutils(/bricks/brick3/master_b9):270:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1323, in service_loop
    g1.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 502, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1272, in crawl
    self.process([item[1]], 0)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 899, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 863, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 22] Invalid argument
[2014-07-23 16:34:09.846282] I [syncdutils(/bricks/brick3/master_b9):214:finalize] <top>: exiting.
===========================================================================

This caused the worker to restart, after which it got stuck, crashing with an AttributeError every time it restarted.
===========================================================================
[2014-07-23 16:34:34.740226] E [syncdutils(/bricks/brick3/master_b9):270:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1302, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 502, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1175, in crawl
    (purge_time[0], self.changelog_register_time))
AttributeError: '_GMaster' object has no attribute 'changelog_register_time'
[2014-07-23 16:34:34.747419] I [syncdutils(/bricks/brick3/master_b9):214:finalize] <top>: exiting.
===========================================================================
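
The second traceback has the shape of a common Python failure: an attribute is assigned on only one code path and read unconditionally on another. The following is a minimal sketch of that pattern and of one defensive way to avoid it. It is not the gsyncd code and does not describe the actual patch; the class names Master and SaferMaster, the register() method, and the None guard are all hypothetical names chosen for illustration.

```python
class Master:
    """Hypothetical stand-in for a geo-rep worker; NOT the real _GMaster."""

    def __init__(self):
        # Deliberately does not define changelog_register_time.
        self.registered = False

    def register(self, now):
        # Only this code path ever assigns the attribute.
        self.changelog_register_time = now
        self.registered = True

    def crawl(self, purge_time):
        # Reads the attribute unconditionally. If register() never ran,
        # this raises AttributeError, as in the traceback above.
        return (purge_time, self.changelog_register_time)


class SaferMaster(Master):
    """Hypothetical variant: the attribute always exists and is checked."""

    def __init__(self):
        super().__init__()
        # Give the attribute a default so every code path can read it.
        self.changelog_register_time = None

    def crawl(self, purge_time):
        # Skip the comparison data when no register time was recorded.
        if self.changelog_register_time is None:
            return None
        return super().crawl(purge_time)


if __name__ == "__main__":
    try:
        Master().crawl(0)
    except AttributeError as exc:
        print("unguarded crawl failed:", exc)

    safe = SaferMaster()
    print("guarded crawl before register:", safe.crawl(0))
    safe.register(5)
    print("guarded crawl after register:", safe.crawl(3))
```

Because the unguarded read fails the same way on every start, a worker with this pattern would crash again after each restart until the missing state is set, which matches the behaviour reported here.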


Version-Release number of selected component (if applicable): glusterfs-3.6.0.25-1.el6rhs


How reproducible: Every time, if the conditions are met.


Steps to Reproduce:
1. Create and start a geo-rep relationship between master and slave.
2. Create data on the master and let it sync to the slave.
3. Start creating hardlinks to the created data and, while the hardlinks are being created, disable changelog.
4. If the crash is not hit with the above steps, try killing the worker after changelog is disabled and before the worker goes into hybrid crawl.


Actual results: The worker keeps crashing every time it restarts.


Expected results: The worker should not crash.


Additional info:

Comment 2 Aravinda VK 2014-07-25 07:16:47 UTC

Downstream patch merged as part of BZ 1111577
https://code.engineering.redhat.com/gerrit/#/c/29711/

Comment 3 Vijaykumar Koppad 2014-08-06 09:08:53 UTC

Tried with the build glusterfs-3.6.0.27-1.el6rhs. Although there is another issue, this problem is fixed. Tracking the other issue in Bug 1127102 and verifying this bug.

Comment 7 errata-xmlrpc 2014-09-22 19:44:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html