Bug 996624

Summary: Dist-geo-rep: geo-rep failed to sync a few files to 2 slaves in a fanout setup when geo-rep was stopped and started during file creation
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Vijaykumar Koppad <vkoppad>
Component: geo-replication Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED EOL QA Contact: storage-qa-internal <storage-qa-internal>
Severity: high Docs Contact:
Priority: medium    
Version: 2.1 CC: avishwan, chrisw, csaba, david.macdonald, rhs-bugs, vagarwal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: fanout, consistency
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug

Description Vijaykumar Koppad 2013-08-13 14:38:49 UTC
Description of problem: Geo-rep failed to sync a few files to 2 of the slaves in a 1-to-4 fanout setup when geo-rep was stopped and started while files were being created on the master. The files on the master that were missing on one of the slaves had no entry in any of the files in the .processed directory, and the gfids of the missing files had no entries in the xsync changelogs either.
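The check described above (looking for a gfid in the .processed files and in the xsync changelogs) can be sketched as a plain byte search over a directory. This is only an illustration: the function name `files_mentioning` is hypothetical, the directory to pass in is whatever working directory the session uses on the master node, and no changelog record format is assumed beyond the gfid appearing as text.

```python
import os

def files_mentioning(gfid, directory):
    """Return sorted relative paths of files under `directory` containing `gfid`.

    Hypothetical helper: it does not parse changelog records, it only
    checks whether the gfid string occurs anywhere in each file.
    """
    needle = gfid.encode("ascii")
    hits = []
    for root, _dirs, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            # Read as bytes so binary or mixed-encoding files do not raise.
            with open(path, "rb") as fh:
                if needle in fh.read():
                    hits.append(os.path.relpath(path, directory))
    return sorted(hits)
```

An empty result for a gfid that exists on the master would match the symptom reported here: the file was never recorded for sync.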

The geo-rep logs contained entries like the following:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-08-13 19:07:41.158914] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/8586c8db-d73f-44fe-8071-cbeeaceccae7 [errcode: 23]
[2013-08-13 19:07:41.159651] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/7201d1d8-7db2-4e53-b4a8-2921dfcd2a68 [errcode: 23]
[2013-08-13 19:07:41.160497] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/3802bb86-8ef2-4219-8b67-95401e05b18d [errcode: 23]
[2013-08-13 19:07:41.161576] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/ab5e2c00-97e0-441c-9527-1b349e0ee2d1 [errcode: 23]
[2013-08-13 19:07:41.162667] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/c3b0cfbf-c047-4b4b-bcc9-af5444543f01 [errcode: 23]
[2013-08-13 19:07:41.163490] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/88e5ccd9-a4f0-4494-b508-647750ac1e38 [errcode: 23]
[2013-08-13 19:07:41.164549] W [master(/bricks/brick1):809:process] _GMaster: incomplete sync, retrying changelog: /var/run/gluster/master/ssh%3A%2F%2Froot%4010.70.43.25%3Agluster%3A%2F%2F127.0.0.1%3Aslave_red/bd42ad17ef8864d51407b1c6478f5dc6/xsync/XSYNC-CHANGELOG.1376401016
[2013-08-13 19:07:53.364371] I [master(/bricks/brick1):343:crawlwrap] _GMaster: crawl interval: 3 seconds
[2013-08-13 19:07:53.412727] I [master(/bricks/brick1):366:crawlwrap] _GMaster: new master is 83083efa-4172-4da8-9c27-26900b1aaf6b
[2013-08-13 19:07:53.413120] I [master(/bricks/brick1):370:crawlwrap] _GMaster: primary master with volume id 83083efa-4172-4da8-9c27-26900b1aaf6b ...
[2013-08-13 19:07:53.836635] I [monitor(monitor):81:set_state] Monitor: new state: Stable
[2013-08-13 19:08:54.558949] I [master(/bricks/brick1):356:crawlwrap] _GMaster: 11 crawls, 5 turns
[2013-08-13 19:09:54.797264] I [master(/bricks/brick1):356:crawlwrap] _GMaster: 20 crawls, 0 turns

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
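For triage, the excerpt above can be reduced to the list of gfids whose rsync failed and the changelogs that were retried. The sketch below is an assumption-laden helper, not part of geo-rep: the regular expressions are derived only from the lines quoted in this report, and `summarize` and `SAMPLE` are hypothetical names. Rsync exit code 23 means "partial transfer due to error". The changelog path is percent-encoded, so it is decoded to make the slave URL readable.

```python
import re
from urllib.parse import unquote

# Patterns derived from the log lines quoted in this report (assumption:
# other geo-rep versions may word these messages differently).
RSYNC_FAIL = re.compile(r"Rsync: \.gfid/([0-9a-f-]{36}) \[errcode: (\d+)\]")
RETRY = re.compile(r"incomplete sync, retrying changelog: (\S+)")

def summarize(log_text):
    """Return (failed, changelogs).

    failed maps gfid -> rsync exit code; changelogs lists the retried
    changelog paths with percent-encoding decoded.
    """
    failed = {}
    changelogs = []
    for line in log_text.splitlines():
        match = RSYNC_FAIL.search(line)
        if match:
            failed[match.group(1)] = int(match.group(2))
            continue
        match = RETRY.search(line)
        if match:
            changelogs.append(unquote(match.group(1)))
    return failed, changelogs

# Three lines copied from the excerpt above.
SAMPLE = """\
[2013-08-13 19:07:41.158914] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/8586c8db-d73f-44fe-8071-cbeeaceccae7 [errcode: 23]
[2013-08-13 19:07:41.159651] W [master(/bricks/brick1):682:regjob] <top>: Rsync: .gfid/7201d1d8-7db2-4e53-b4a8-2921dfcd2a68 [errcode: 23]
[2013-08-13 19:07:41.164549] W [master(/bricks/brick1):809:process] _GMaster: incomplete sync, retrying changelog: /var/run/gluster/master/ssh%3A%2F%2Froot%4010.70.43.25%3Agluster%3A%2F%2F127.0.0.1%3Aslave_red/bd42ad17ef8864d51407b1c6478f5dc6/xsync/XSYNC-CHANGELOG.1376401016
"""
```

Calling `summarize(SAMPLE)` yields the two failed gfids and one decoded changelog path beginning with `/var/run/gluster/master/ssh://root@10.70.43.25:gluster://127.0.0.1:slave_red/`.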

Version-Release number of selected component (if applicable): glusterfs-3.4.0.18rhs-1.el6rhs.x86_64


How reproducible: Happens every time


Steps to Reproduce:
1. Create and start the master volume (dist-rep) and 4 slave volumes (2 dist-rep and 2 dist).
2. Create and start geo-rep sessions between the master and all 4 slaves.
3. Start creating files on the master; while the files are being created, stop and start all the geo-rep sessions.
4. Let it sync to the slaves.

Actual results: geo-rep fails to sync a few files to some of the slaves


Expected results: No files should be missed on any slave
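One way to confirm which files were missed is to compare the file listing of the master mount with that of each slave mount. The sketch below is an assumption: it expects the master and slave volumes to be mounted locally at paths supplied by the tester, and the names `relative_files` and `missing_on_slave` are hypothetical. It compares names only, not contents or checksums.

```python
import os

def relative_files(root):
    """Return the set of file paths under `root`, relative to `root`."""
    found = set()
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found

def missing_on_slave(master_root, slave_root):
    """Return sorted relative paths present on the master but not the slave."""
    return sorted(relative_files(master_root) - relative_files(slave_root))
```

Running `missing_on_slave` once per slave mount gives the per-slave list of files that geo-rep did not sync.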


Additional info:

Comment 3 Aravinda VK 2015-11-25 08:50:57 UTC
Closing this bug since RHGS 2.1 release reached EOL. Required bugs are cloned to RHGS 3.1. Please re-open this issue if found again.

Comment 4 Aravinda VK 2015-11-25 08:52:10 UTC
Closing this bug since RHGS 2.1 release reached EOL. Required bugs are cloned to RHGS 3.1. Please re-open this issue if found again.