Bug 1001980

Summary: Dist-geo-rep: geo-rep created two entries with the same name in the same directory on the slave after hardlinks were created on the master.
Product: Red Hat Gluster Storage [Red Hat Storage]
Reporter: Vijaykumar Koppad <vkoppad>
Component: geo-replication
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED CURRENTRELEASE
QA Contact: Sudhir D <sdharane>
Severity: high
Priority: high
Version: 2.1
CC: aavati, amarts, bbandari, csaba, rhs-bugs, shaines, surs, vagarwal, vkoppad
Keywords: ZStream
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-09-25 08:45:56 UTC
Bug Depends On: 984603

Description Vijaykumar Koppad 2013-08-28 08:58:07 UTC
Description of problem: After creating hardlinks to all the files on the master that had already been synced to the slave, geo-rep created two files with the same name in the same directory on the slave.
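
A duplicate entry like this can be confirmed from a client mount of the slave volume by counting repeated names in a directory listing; a quick sketch (the directory path is only an example) is:

import os
import collections

SLAVE_DIR = "/mnt/slave/level00"  # example directory on the slave client mount

names = os.listdir(SLAVE_DIR)
dups = [n for n, c in collections.Counter(names).items() if c > 1]
print(dups)  # a non-empty list means readdir returned the same name more than once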

On the master, one of the gsyncd workers logged a traceback like:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-08-28 12:32:32.902442] E [repce(/bricks/brick3):188:__call__] RepceClient: call 28210:140139662915328:1377673350.78 (entry_ops) failed on peer with OSError
[2013-08-28 12:32:32.903069] E [syncdutils(/bricks/brick3):206:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 133, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 513, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1062, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 369, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 783, in crawl
    self.process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 744, in process
    if self.process_change(change, done, retry):
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 724, in process_change
    self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 204, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 189, in __call__
    raise res
OSError: [Errno 22] Invalid argument
[2013-08-28 12:32:32.905826] I [syncdutils(/bricks/brick3):158:finalize] <top>: exiting.
[2013-08-28 12:32:32.915518] I [monitor(monitor):81:set_state] Monitor: new state: faulty

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

On the slave, the log had a traceback like:


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-08-28 12:05:42.456854] I [resource(slave):630:service_loop] GLUSTER: slave listening
[2013-08-28 12:32:31.633302] E [repce(slave):103:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 99, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 522, in entry_ops
    errno_wrap(os.link, [slink, entry], [ENOENT, EEXIST])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 381, in errno_wrap
    return call(*arg)
OSError: [Errno 22] Invalid argument
[2013-08-28 12:32:31.645050] I [repce(slave):78:service_loop] RepceServer: terminating on reaching EOF.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
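
The frame that fails on the slave is the os.link() call made through errno_wrap(), which tolerates ENOENT and EEXIST but re-raises any other errno, so the EINVAL from os.link() travels back over repce and shows up on the master as the OSError in the first traceback. A minimal sketch of that behaviour (simplified, not the actual syncdutils.errno_wrap code) is:

import os
from errno import ENOENT, EEXIST

def errno_wrap_sketch(call, args, tolerated):
    # Run call(*args); swallow OSErrors whose errno is explicitly tolerated
    # (benign races such as "already exists"), and re-raise everything else
    # (e.g. EINVAL), which repce then ships back to the calling side.
    try:
        return call(*args)
    except OSError as e:
        if e.errno in tolerated:
            return
        raise

# Mirrors the failing frame above (paths are placeholders):
# errno_wrap_sketch(os.link, ["<slink>", "<entry>"], [ENOENT, EEXIST])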

Version-Release number of selected component (if applicable): glusterfs-3.4.0.24rhs-1.el6rhs.x86_64


How reproducible: Did not try to reproduce it again.


Steps to Reproduce:
1. Create and start a geo-rep session between the master and the slave.
2. Create files on the master, for example: ./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 /mnt/master/
3. Let the files sync to the slave.
4. Create hardlinks to all of those files on the master, for example: ./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=hardlink /mnt/master/ (a plain-Python sketch of this step follows below).
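
For reference, the hardlink step can also be driven without crefi.py; the sketch below (the ".hl" suffix and paths are illustrative, not what crefi.py actually names its links) simply hardlinks every regular file under the master mount:

import os

MASTER_MOUNT = "/mnt/master"  # master client mount from the steps above

for root, dirs, files in os.walk(MASTER_MOUNT):
    for name in files:
        src = os.path.join(root, name)
        dst = src + ".hl"      # illustrative hardlink name
        try:
            os.link(src, dst)  # same fop crefi.py issues with --fop=hardlink
        except OSError as e:
            print("link failed for %s: %s" % (src, e))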

Actual results: It created two files with the same name in the same directory on the slave.


Expected results: It should sync all files and their hardlinks to the slave correctly.


Additional info:

Comment 2 Vijaykumar Koppad 2013-08-31 13:30:22 UTC
This happened again with the build glusterfs-3.4.0.30rhs-2.el6rhs.x86_64, in a cascaded fan-out setup.

Comment 3 Amar Tumballi 2013-09-11 13:29:52 UTC
Considering that bug 1001498 is fixed, can we check whether this is still an issue?

Comment 4 Vijaykumar Koppad 2013-09-12 06:42:35 UTC
I haven't seen this issue again since the build glusterfs-3.4.0.31rhs-1. I had raised this bug because the symptoms were different.

Comment 5 Vivek Agarwal 2013-09-25 08:45:56 UTC
As per comment 4, closing this.