Description of problem:
If a slave node goes down, the geo-rep session connecting to that particular node from the master doesn't go to faulty immediately; it takes 15 to 20 minutes to go faulty, and in the meantime it doesn't sync any files to the slave from that brick or subvolume on the master. Once the node comes back, the connection is re-established, but it performs a full xsync crawl first.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.30rhs-2.el6rhs.x86_64

How reproducible:
Happens every time

Steps to Reproduce:
1. Create and start a geo-rep session between master and slave.
2. Bring down one of the slave nodes.
3. Create some files on the master.
4. Check whether those files are synced to the slave.

Actual results:
If a slave machine goes down, that particular session doesn't go faulty immediately.

Expected results:
The session should go faulty immediately, and it should pick another healthy slave machine to sync the data.

Additional info:
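For reference, a minimal sketch of the reproduction steps using the gluster CLI, assuming the distributed geo-rep commands available in this build; the names mastervol, slavehost, slavevol, and /mnt/master are hypothetical placeholders:

# 1. Create and start the geo-rep session
gluster volume geo-replication mastervol slavehost::slavevol create push-pem
gluster volume geo-replication mastervol slavehost::slavevol start

# 2. Bring down one of the slave nodes (e.g. power it off)

# 3. Create some files on a master mount point
mount -t glusterfs localhost:/mastervol /mnt/master
for i in $(seq 1 100); do echo data > /mnt/master/file$i; done

# 4. Watch the session status; the session for the downed node
#    should go faulty, but in practice takes 15 to 20 minutes
gluster volume geo-replication mastervol slavehost::slavevol status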
Earlier, the aux-mount was not distributed across the slave cluster; BUG 980049 tracked that. Distribution of the aux-mount on the slave side was introduced in glusterfs-3.4.0.30rhs-2.el6rhs.x86_64.
Need a minor (but semantically significant) change in the summary of the bug: it doesn't fail to sync to the slave forever; it fails to sync only for the period the slave machine is down. Once the machine comes back up, syncing works fine.
Targeting the 3.0.0 (Denali) release.
Dev ack to 3.0 RHS BZs
Closing this bug since the RHGS 2.1 release has reached EOL. The required bugs have been cloned to RHGS 3.1. Please re-open this issue if it is found again.