1094168 – dist-geo-rep: Geo-rep is broken in new build "glusterfs-3.5qa2-0.425.git9360107"

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	geo-replication
Sub Component:
Version:	rhgs-3.0
Hardware:	x86_64
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	RHGS 3.0.0
Assignee:	Aravinda VK
QA Contact:	Bhaskar Bandari
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1096022

Reported:	2014-05-05 08:31 UTC by Vijaykumar Koppad
Modified:	2015-05-13 16:55 UTC
CC List:	8 users
Fixed In Version:	glusterfs-3.6.0.2-1.el5
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1096022
Environment:
Last Closed:	2014-09-22 19:36:24 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2014:1278	0	normal	SHIPPED_LIVE	Red Hat Storage Server 3.0 bug fix and enhancement update	2014-09-22 23:26:55 UTC

Description Vijaykumar Koppad 2014-05-05 08:31:47 UTC

Description of problem: The geo-rep worker dies in the startup phase with the following Python traceback:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-05-05 13:39:46.172435] E [syncdutils(/bricks/master_brick1):267:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 163, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 620, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1326, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 501, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1169, in crawl
    self.changelog_register_time)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 689, in history_changelog
    return Changes.cl_history_changelog(changelog_path, start, end)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 92, in cl_history_changelog
    ret = cls._get_api('gf_history_changelog')(changelog_path, start, end)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 34, in _get_api
    return getattr(cls.libgfc, call)
  File "/usr/lib64/python2.6/ctypes/__init__.py", line 366, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib64/python2.6/ctypes/__init__.py", line 371, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib64/libgfchangelog.so.0: undefined symbol: gf_history_changelog
 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
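The traceback above ends with ctypes failing to resolve gf_history_changelog in /usr/lib64/libgfchangelog.so.0, which suggests the installed library and the Python syncdaemon code come from mismatched builds. A minimal sketch of how one might check for the symbol before starting a session follows. The helper names has_symbol and missing_symbols are hypothetical and are not part of gsyncd; the sketch is written for Python 3, whereas the traceback comes from Python 2.6.

```python
import ctypes


def has_symbol(lib, name):
    """Return True if the loaded shared library exports `name`.

    ctypes resolves symbols lazily on attribute access and raises
    AttributeError when the lookup fails, which is exactly the failure
    shown in the traceback.
    """
    try:
        getattr(lib, name)
    except AttributeError:
        return False
    return True


def missing_symbols(lib, names):
    """Return the subset of `names` that `lib` does not export."""
    return [name for name in names if not has_symbol(lib, name)]


if __name__ == "__main__":
    # Library path and symbol name are taken from the traceback.
    path = "/usr/lib64/libgfchangelog.so.0"
    try:
        lib = ctypes.CDLL(path)
    except OSError as exc:
        print("could not load %s: %s" % (path, exc))
    else:
        print("missing:", missing_symbols(lib, ["gf_history_changelog"]))
```

If gf_history_changelog is reported as missing, the library build predates (or otherwise lacks) the history API that the Python code expects.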

Version-Release number of selected component (if applicable):  glusterfs-3.5qa2-0.425.git9360107


How reproducible: always


Steps to Reproduce:
1. Create and start a geo-rep relationship between master and slave.


Actual results: The worker dies in the startup phase.


Expected results: The worker should not die in the startup phase.


Additional info:

Comment 1 Aravinda VK 2014-05-09 06:22:44 UTC

Upstream patches are available to fix this issue:

1. http://review.gluster.org/#/c/6930/
2. http://review.gluster.org/#/c/7660/

Comment 2 Nagaprasad Sathyanarayana 2014-05-15 04:57:02 UTC

All necessary patches have been merged in both upstream master and RHS 3.0 (downstream). Hence moving this to MODIFIED.

Comment 3 Vijaykumar Koppad 2014-05-16 09:03:33 UTC

With the new build glusterfs-3.6.0.2-1.el5, I still got two similar tracebacks.
However, the worker did not get stuck there. It recovered and started working properly.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-05-16 12:06:28.549020] I [master(/bricks/master_brick1):463:crawlwrap] _GMaster: crawl interval: 1 seconds
[2014-05-16 12:06:28.566447] E [repce(agent):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 51, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 89, in cl_history_changelog
    ret = cls._get_api('gf_history_changelog')(changelog_path, start, end,
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 31, in _get_api
    return getattr(cls.libgfc, call)
  File "/usr/lib64/python2.6/ctypes/__init__.py", line 366, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib64/python2.6/ctypes/__init__.py", line 371, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib64/libgfchangelog.so.0: undefined symbol: gf_history_changelog
[2014-05-16 12:06:28.568813] E [repce(/bricks/master_brick1):207:__call__] RepceClient: call 23586:140193964734208:1400222188.57 (history) failed on peer with AttributeError
[2014-05-16 12:06:28.569288] E [syncdutils(/bricks/master_brick1):270:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 633, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1298, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 501, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1165, in crawl
    int(gconf.sync_jobs))
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
AttributeError: /usr/lib64/libgfchangelog.so.0: undefined symbol: gf_history_changelog
[2014-05-16 12:06:28.572402] I [syncdutils(/bricks/master_brick1):214:finalize] <top>: exiting.
[2014-05-16 12:06:28.577449] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2014-05-16 12:06:28.577883] I [syncdutils(agent):214:finalize] <top>: exiting.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
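To confirm how often this failure shows up in a geo-rep log, the "undefined symbol" lines can be counted per library and symbol. This is a rough sketch that assumes the AttributeError line format shown in the log excerpt above; the function name count_undefined_symbols is hypothetical.

```python
import re
from collections import Counter

# Matches the final line of the ctypes failure as it appears in the log, e.g.
# "AttributeError: /usr/lib64/libgfchangelog.so.0: undefined symbol: gf_history_changelog"
UNDEFINED_SYMBOL = re.compile(
    r"^AttributeError: (?P<lib>\S+): undefined symbol: (?P<symbol>\S+)\s*$"
)


def count_undefined_symbols(lines):
    """Count undefined-symbol errors, keyed by (library, symbol)."""
    counts = Counter()
    for line in lines:
        match = UNDEFINED_SYMBOL.match(line)
        if match:
            counts[(match.group("lib"), match.group("symbol"))] += 1
    return counts
```

The function accepts any iterable of lines, so an open log file object can be passed directly.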

Comment 4 Nagaprasad Sathyanarayana 2014-05-19 10:56:35 UTC

Setting the flags required to add BZs to the RHS 3.0 Errata.

Comment 6 Vijaykumar Koppad 2014-06-03 12:52:48 UTC

Verified on build glusterfs-3.6.0.10-1.el6rhs.

Comment 10 errata-xmlrpc 2014-09-22 19:36:24 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html
