Description of problem:
In a cascaded setup, rm -rf on the master did not fully propagate to the level-2 slave. The imaster (intermediate master) geo-rep logs contained tracebacks:
===============================================================================
[2014-08-11 17:02:42.58315] I [master(/bricks/brick2/b7):1147:crawl] _GMaster: slave's time: (1407756535, 0)
[2014-08-11 17:05:38.262269] E [repce(/bricks/brick2/b7):207:__call__] RepceClient: call 17308:140607044015872:1407756932.83 (entry_ops) failed on peer with OSError
[2014-08-11 17:05:38.262967] E [syncdutils(/bricks/brick2/b7):270:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1335, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 513, in crawlwrap
    self.crawl(no_stime_update=no_stime_update)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1159, in crawl
    self.process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 910, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 874, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 5] Input/output error
[2014-08-11 17:05:38.266292] I [syncdutils(/bricks/brick2/b7):214:finalize] <top>: exiting.
[2014-08-11 17:05:38.269656] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2014-08-11 17:05:38.270167] I [syncdutils(agent):214:finalize] <top>: exiting.
[2014-08-11 17:05:38.286237] I [monitor(monitor):109:set_state] Monitor: new state: faulty
[2014-08-11 17:05:38.371052] E [repce(/bricks/brick3/b11):207:__call__] RepceClient: call 17310:139915850999552:1407756928.11 (entry_ops) failed on peer with OSError
[2014-08-11 17:05:38.371644] E [syncdutils(/bricks/brick3/b11):270:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 643, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1335, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 513, in crawlwrap
    self.crawl(no_stime_update=no_stime_update)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1159, in crawl
    self.process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 910, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 874, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 61] No data available
[2014-08-11 17:05:38.374772] I [syncdutils(/bricks/brick3/b11):214:finalize] <top>: exiting.
[2014-08-11 17:05:38.378520] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2014-08-11 17:05:38.379053] I [syncdutils(agent):214:finalize] <top>: exiting.
[2014-08-11 17:05:48.300900] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2014-08-11 17:05:48.301461] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2014-08-11 17:05:48.399454] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2014-08-11 17:05:48.400115] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2014-08-11 17:05:48.720443] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
===============================================================================

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.27-1.el6rhs

How reproducible:
Did not try to reproduce the issue.

Steps to Reproduce:
1. Create and start a cascaded geo-rep setup between the master, imaster and slave.
2. Create some data on the master and let it sync to the imaster and the slave.
3. Run rm -rf on the master mount point.
4. Check whether the removal propagates to the level-2 slave (see the command sketch under Additional info).

Actual results:
rm -rf on the master fails to propagate to the level-2 slave.

Expected results:
rm -rf on the master should remove all the files from all the slaves.

Additional info:
sosreports of the master, imaster and slave nodes @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1129208/
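
A minimal sketch of the reproduction steps, for reference. The volume names (master-vol, imaster-vol, slave-vol) and hostnames (master-node, imaster-node, slave-node) are hypothetical placeholders; the actual names are in the sosreports. Passwordless SSH and pem distribution between the clusters are assumed to be already in place.

# Leg 1 (run on a master-cluster node): master-vol -> imaster-vol
gluster volume geo-replication master-vol imaster-node::imaster-vol create push-pem
gluster volume geo-replication master-vol imaster-node::imaster-vol start

# Leg 2 (run on an imaster-cluster node): imaster-vol -> slave-vol
gluster volume geo-replication imaster-vol slave-node::slave-vol create push-pem
gluster volume geo-replication imaster-vol slave-node::slave-vol start

# Create data on the master mount and wait until the sessions report
# Changelog Crawl and the data is visible on the slave mount.
mount -t glusterfs master-node:/master-vol /mnt/master
mkdir -p /mnt/master/dir{1..10}
touch /mnt/master/dir{1..10}/file{1..100}
gluster volume geo-replication master-vol imaster-node::imaster-vol status

# Remove everything from the master mount point.
rm -rf /mnt/master/*

# Verify the removal reached the level-2 slave; any surviving entries
# on the slave mount reproduce the reported behaviour.
mount -t glusterfs slave-node:/slave-vol /mnt/slave
ls -lR /mnt/slave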