| Summary: | Dist-geo-rep : worker process dies and started again frequently | |||
|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Rachana Patel <racpatel> | |
| Component: | geo-replication | Assignee: | Venky Shankar <vshankar> | |
| Status: | CLOSED ERRATA | QA Contact: | amainkar | |
| Severity: | medium | Docs Contact: | ||
| Priority: | high | |||
| Version: | 2.1 | CC: | aavati, amarts, csaba, rhs-bugs, surs | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.4.0.24rhs-1 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1003803 (view as bug list) | Environment: | ||
| Last Closed: | 2013-09-23 22:38:51 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1003803 | |||
|
Description
Rachana Patel
2013-08-22 08:37:54 UTC
Description of problem:
Dist-geo-rep: the worker process dies and is restarted frequently.
Version-Release number of selected component (if applicable):
3.4.0.20rhs-2.el6_4.x86_64
How reproducible:
always
Steps to Reproduce:
1. Master cluster: 5 nodes; volume master1 (3x2 distributed-replicate),
mounted via FUSE on a client; created data on the mount.
[root@rhs-client22 nufa]# df -h /mnt/master1
Filesystem Size Used Avail Use% Mounted on
10.70.37.128:master1 150G 126G 25G 84% /mnt/master1
2. Created a geo-replication session between the master and slave clusters.
3. Checked the status after some time; the gsyncd worker keeps getting restarted.
[root@DVM1 nufa]# gluster volume geo master1 10.70.37.219::slave1 status
NODE MASTER SLAVE HEALTH UPTIME
---------------------------------------------------------------------------------------
DVM1.lab.eng.blr.redhat.com master1 10.70.37.219::slave1 Stable 00:03:45
DVM2.lab.eng.blr.redhat.com master1 10.70.37.219::slave1 Stable 01:48:11
DVM5.lab.eng.blr.redhat.com master1 10.70.37.219::slave1 Stable 00:20:17
DVM4.lab.eng.blr.redhat.com master1 10.70.37.219::slave1 faulty N/A
DVM6.lab.eng.blr.redhat.com master1 10.70.37.219::slave1 Stable 01:48:11
Log snippet:
[2013-08-22 06:01:17.362781] I [monitor(monitor):81:set_state] Monitor: new state: Stable
[2013-08-22 06:04:34.478760] I [master(/rhs/brick1):878:crawl] _GMaster: processing xsync changelog /var/run/gluster/master1/ssh%3A%2F%
2Froot%4010.70.37.219%3Agluster%3A%2F%2F127.0.0.1%3Aslave1/85acebcd7c65ee7c4550f76de44279a9/xsync/XSYNC-CHANGELOG.1377131419
[2013-08-22 06:04:49.218578] E [syncdutils(/rhs/brick1):206:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 133, in main
main_i()
File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 513, in main_i
local.service_loop(*[r for r in [remote] if r])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1059, in service_loop
g1.crawlwrap(oneshot=True)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 369, in crawlwrap
self.crawl()
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 880, in crawl
self.process([self.fname()], done)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 734, in process
if self.process_change(change, done, retry):
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 696, in process_change
entries.append(edct(ty, stat=st, entry=en, gfid=gfid, link=os.readlink(en)))
OSError: [Errno 2] No such file or directory: '.gfid/572fabcb-e34f-4d09-889e-c2e99b0765ac/sbin-ip6tables-save.x86_64'
[2013-08-22 06:04:49.221683] I [syncdutils(/rhs/brick1):158:finalize] <top>: exiting.
[2013-08-22 06:04:49.236047] I [monitor(monitor):81:set_state] Monitor: new state: faulty
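The traceback shows `os.readlink()` raising `OSError` ENOENT on an entry that vanished between the changelog crawl and processing, which kills the worker and puts the session in the faulty state. A minimal sketch of the defensive pattern such a crash calls for (the helper name is hypothetical; this is not the actual glusterfs patch):

```python
import errno
import os

def safe_readlink(path):
    """Return the symlink target, or None if the entry vanished.

    An entry can be deleted between the time the xsync changelog
    records it and the time the worker calls os.readlink(), so
    ENOENT has to be tolerated rather than crashing the process.
    """
    try:
        return os.readlink(path)
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None  # entry disappeared mid-crawl; caller skips it
        raise  # any other error is still fatal

# A path that no longer exists yields None instead of an exception:
print(safe_readlink("/nonexistent/sbin-ip6tables-save.x86_64"))
```

With this pattern the crawl can skip the missing entry and continue, instead of the monitor cycling the worker through faulty/Stable.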
Actual results:
The gsyncd worker on one node dies with an unhandled OSError (ENOENT), the monitor marks the session faulty and restarts the worker, and the cycle repeats.
Expected results:
The worker should tolerate entries that disappear during the crawl and keep the session Stable.
Additional info:
Not able to reproduce with 3.4.0.32rhs-1.el6_4.x86_64, hence marking as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-1262.html