1122502 – [Dist-geo-rep] : after changelog disable during geo-rep, if worker restarts it keeps crashing with "AttributeError: '_GMaster' object has no attribute 'changelog_register_time'"

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	geo-replication
Sub Component:
Version:	rhgs-3.0
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	RHGS 3.0.0
Assignee:	Aravinda VK
QA Contact:	Bhaskar Bandari
Docs Contact:
URL:
Whiteboard:
Depends On:	1111577
Blocks:

Reported:	2014-07-23 11:55 UTC by Vijaykumar Koppad
Modified:	2015-05-13 16:52 UTC
CC List:	8 users
Fixed In Version:	glusterfs-3.6.0.27-1
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2014-09-22 19:44:55 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2014:1278	0	normal	SHIPPED_LIVE	Red Hat Storage Server 3.0 bug fix and enhancement update	2014-09-22 23:26:55 UTC

Description Vijaykumar Koppad 2014-07-23 11:55:46 UTC

Description of problem: After changelog is disabled during geo-rep, a worker that restarts keeps crashing with "AttributeError: '_GMaster' object has no attribute 'changelog_register_time'". This happened when changelog was disabled while hardlinks were being created and synced in a geo-rep setup. The worker first crashed with the following traceback; that issue is being tracked in Bug 1098426.
===========================================================================
[2014-07-23 16:34:09.840315] E [syncdutils(/bricks/brick3/master_b9):270:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1323, in service_loop
    g1.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 502, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1272, in crawl
    self.process([item[1]], 0)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 899, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 863, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 22] Invalid argument
[2014-07-23 16:34:09.846282] I [syncdutils(/bricks/brick3/master_b9):214:finalize] <top>: exiting.
===========================================================================

This caused a restart of the worker, which then got stuck, crashing with the AttributeError every time it restarted.
===========================================================================
[2014-07-23 16:34:34.740226] E [syncdutils(/bricks/brick3/master_b9):270:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1302, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 502, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1175, in crawl
    (purge_time[0], self.changelog_register_time))
AttributeError: '_GMaster' object has no attribute 'changelog_register_time'
[2014-07-23 16:34:34.747419] I [syncdutils(/bricks/brick3/master_b9):214:finalize] <top>: exiting.
===========================================================================
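
For readers unfamiliar with this failure mode, here is a minimal sketch of how such an AttributeError can arise in Python and one defensive way to avoid it. It is not the gsyncd code and not the merged patch. The class `GMasterSketch`, its `register` and `crawl` methods, and the assumption that the timestamp is only set when changelog registration succeeds are all hypothetical, chosen only to mirror the attribute name in the traceback above.

```python
class GMasterSketch(object):
    """Hypothetical stand-in for a crawler object; NOT the real _GMaster."""

    def __init__(self, initialize_in_constructor):
        if initialize_in_constructor:
            # Defensive default: the attribute exists even if
            # register() is never called.
            self.changelog_register_time = None

    def register(self, now):
        # Assumption for this sketch: only a successful changelog
        # registration records the timestamp.
        self.changelog_register_time = now

    def crawl(self, purge_time):
        # Raises AttributeError if the attribute was never assigned.
        register_time = self.changelog_register_time
        if register_time is None:
            return "skip: changelog not registered"
        return "compare %d with %d" % (purge_time, register_time)


# Without the default, crawl() fails on every call until register() runs.
broken = GMasterSketch(initialize_in_constructor=False)
try:
    broken.crawl(10)
except AttributeError as exc:
    print("crash: %s" % exc)

# With the default, crawl() can detect the unregistered state and carry on.
safe = GMasterSketch(initialize_in_constructor=True)
print(safe.crawl(10))
safe.register(20)
print(safe.crawl(10))
```

Under these assumptions, a worker that never reaches the registration step would fail the same way on every restart, which matches the repeated crash reported here; whether that is the actual cause is not stated in this report.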


Version-Release number of selected component (if applicable): glusterfs-3.6.0.25-1.el6rhs


How reproducible: Every time, if the conditions are met.


Steps to Reproduce:
1. Create and start a geo-rep relationship between master and slave.
2. Create data on the master and let it sync to the slave.
3. Start creating hardlinks to the created data and, while the hardlinks are being created, disable changelog.
4. If the issue is not hit with the above steps, kill the worker after changelog is disabled and before the worker goes to hybrid crawl.


Actual results: The worker keeps crashing every time it restarts.


Expected results: The worker should not crash.


Additional info:

Comment 2 Aravinda VK 2014-07-25 07:16:47 UTC

Downstream patch merged as part of BZ 1111577
https://code.engineering.redhat.com/gerrit/#/c/29711/

Comment 3 Vijaykumar Koppad 2014-08-06 09:08:53 UTC

Tried with the build glusterfs-3.6.0.27-1.el6rhs. Although there is another issue, this problem is fixed. Tracking the other issue in Bug 1127102 and verifying this bug.

Comment 7 errata-xmlrpc 2014-09-22 19:44:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html
