1001980 – Dist-geo-rep : geo-rep created entry for 2 files with same name in same directory on slave after creating the hardlinks on master.

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	geo-replication
Sub Component:
Version:	2.1
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Bug Updates Notification Mailing List
QA Contact:	Sudhir D
Docs Contact:
URL:
Whiteboard:
Depends On:	984603
Blocks:

Reported:	2013-08-28 08:58 UTC by Vijaykumar Koppad
Modified:	2014-08-25 00:50 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-09-25 08:45:56 UTC
Embargoed:
Dependent Products:

Description Vijaykumar Koppad 2013-08-28 08:58:07 UTC

Description of problem: After hardlinks were created on the master for all the files that had already been synced to the slave, geo-rep created two files with the same name in the same directory on the slave.

On the master, one of the gsyncd processes logged a traceback like this:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-08-28 12:32:32.902442] E [repce(/bricks/brick3):188:__call__] RepceClient: call 28210:140139662915328:1377673350.78 (entry_ops) failed on peer with OSError
[2013-08-28 12:32:32.903069] E [syncdutils(/bricks/brick3):206:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 133, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 513, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1062, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 369, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 783, in crawl
    self.process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 744, in process
    if self.process_change(change, done, retry):
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 724, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 204, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 189, in __call__
    raise res
OSError: [Errno 22] Invalid argument
[2013-08-28 12:32:32.905826] I [syncdutils(/bricks/brick3):158:finalize] <top>: exiting.
[2013-08-28 12:32:32.915518] I [monitor(monitor):81:set_state] Monitor: new state: faulty

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>.

On the slave, the log had a traceback like this:


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-08-28 12:05:42.456854] I [resource(slave):630:service_loop] GLUSTER: slave listening
[2013-08-28 12:32:31.633302] E [repce(slave):103:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 99, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 522, in entry_ops
    errno_wrap(os.link, [slink, entry], [ENOENT, EEXIST])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 381, in errno_wrap
    return call(*arg)
OSError: [Errno 22] Invalid argument
[2013-08-28 12:32:31.645050] I [repce(slave):78:service_loop] RepceServer: terminating on reaching EOF.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
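
The slave traceback shows os.link being called through errno_wrap with ENOENT and EEXIST listed, yet EINVAL (errno 22) still reached the caller. A minimal sketch of how such a wrapper could behave, assuming the list names the errnos to tolerate. errno_wrap_sketch, failing_link and demo are hypothetical names for illustration, not the gsyncd implementation:

```python
import errno
import os
import tempfile


def errno_wrap_sketch(call, args, tolerated):
    """Run call(*args); swallow OSError only if its errno is tolerated.

    Hypothetical simplification for illustration, not the gsyncd code.
    """
    try:
        return call(*args)
    except OSError as exc:
        if exc.errno in tolerated:
            return None
        raise


def failing_link(src, dst):
    """Stand-in for an os.link call that fails with EINVAL (errno 22)."""
    raise OSError(errno.EINVAL, os.strerror(errno.EINVAL))


def demo():
    tolerated = [errno.ENOENT, errno.EEXIST]
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "file")
        dst = os.path.join(d, "file.hl")
        open(src, "w").close()
        errno_wrap_sketch(os.link, [src, dst], tolerated)  # link created
        # A second attempt hits EEXIST, which is tolerated and ignored.
        repeat = errno_wrap_sketch(os.link, [src, dst], tolerated)
        try:
            errno_wrap_sketch(failing_link, [src, dst], tolerated)
            propagated = None
        except OSError as exc:
            # EINVAL is not in the tolerated list, so it propagates.
            propagated = exc.errno
    return repeat, propagated
```

Under that assumption, an EINVAL from the link call is not covered by the tolerated list, which would be consistent with the exception travelling back to the master and the worker going faulty.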

Version-Release number of selected component (if applicable): glusterfs-3.4.0.24rhs-1.el6rhs.x86_64


How reproducible: Did not try to reproduce again.


Steps to Reproduce:
1. Create and start a geo-rep relationship between master and slave.
2. Create files on the master using the command: ./crefi.py -n 10 --multi  -b 10 -d 10 --random --max=500K --min=10 /mnt/master/
3. Let the files sync to the slave.
4. Create hardlinks to all those files on the master: ./crefi.py -n 10 --multi  -b 10 -d 10 --random --max=500K --min=10 --fop=hardlink /mnt/master/
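
The hardlink step and the check for the reported symptom can be sketched in a few lines of Python. This is only an illustration on a local temporary directory, not a substitute for crefi.py or a geo-rep setup. The ".hl" suffix and the helper names create_hardlinks, duplicate_names and demo are assumptions made for the example:

```python
import os
import tempfile
from collections import Counter


def create_hardlinks(directory):
    """Create one hardlink '<name>.hl' for each regular file in directory."""
    links = []
    for name in sorted(os.listdir(directory)):
        src = os.path.join(directory, name)
        if os.path.isfile(src):
            dst = src + ".hl"  # hypothetical naming scheme
            os.link(src, dst)
            links.append((src, dst))
    return links


def duplicate_names(directory):
    """Return names that appear more than once in the directory listing."""
    counts = Counter(os.listdir(directory))
    return sorted(name for name, count in counts.items() if count > 1)


def demo():
    with tempfile.TemporaryDirectory() as d:
        for i in range(3):
            with open(os.path.join(d, f"file{i}"), "w") as fh:
                fh.write("data")
        links = create_hardlinks(d)
        # A hardlink shares its inode with the original file.
        same_inode = all(
            os.stat(src).st_ino == os.stat(dst).st_ino for src, dst in links
        )
        return len(links), same_inode, duplicate_names(d)
```

On a healthy filesystem duplicate_names returns an empty list. Running a check like this against the slave mount is one possible way to look for the duplicate entries described in this report.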

Actual results: Geo-rep created two files with the same name in the same directory on the slave.


Expected results: All files, including the hardlinks, should sync to the slave properly.


Additional info:

Comment 2 Vijaykumar Koppad 2013-08-31 13:30:22 UTC

This happened again in build glusterfs-3.4.0.30rhs-2.el6rhs.x86_64, in a cascaded-fanout setup.

Comment 3 Amar Tumballi 2013-09-11 13:29:52 UTC

Now that bug 1001498 is fixed, can we check whether this is still an issue?

Comment 4 Vijaykumar Koppad 2013-09-12 06:42:35 UTC

I haven't seen this issue again since build glusterfs-3.4.0.31rhs-1. I had raised this as a separate bug because the symptoms were different.

Comment 5 Vivek Agarwal 2013-09-25 08:45:56 UTC

As per comment 4, closing this.