Bug 1005478 - Dist-geo-rep: 'geo-rep stop' should fail when there is a node down
Summary: Dist-geo-rep: 'geo-rep stop' should fail when there is a node down
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Avra Sengupta
QA Contact: M S Vishwanath Bhat
URL:
Whiteboard:
Duplicates: 1005477
Depends On:
Blocks: 1006177
 
Reported: 2013-09-07 12:18 UTC by M S Vishwanath Bhat
Modified: 2016-06-01 01:56 UTC
CC List: 9 users

Fixed In Version: glusterfs-3.4.0.34rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1006177
Environment:
Last Closed: 2013-11-27 15:37:06 UTC
Target Upstream Version:


Attachments: none


Links:
  System ID: Red Hat Product Errata RHBA-2013:1769
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: Red Hat Storage 2.1 enhancement and bug fix update #1
  Last Updated: 2013-11-27 20:17:39 UTC

Description M S Vishwanath Bhat 2013-09-07 12:18:38 UTC
Description of problem:
Currently, geo-rep stop succeeds even when a node (or glusterd on a node) is down. Running geo-rep stop stops the gsyncd processes on the nodes that are up, but when the node that was down comes back online, gsyncd on that node is still running. geo-rep stop force then has to be run again to stop it.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.32rhs-1.el6rhs.x86_64

How reproducible:
Always


Steps to Reproduce:
1. Create and start a geo-rep session between 2 clusters
2. Now bring down a node (or kill glusterd in that node)
3. Run geo-rep stop on the master node.
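
For reference, the steps above map roughly to the following commands. This is a sketch using the volume and host names taken from the output below (master volume "master", slave "falcon::slave", master peers spitfire and harrier); it assumes the usual geo-rep prerequisites (such as passwordless SSH between the clusters) are already in place.

# Create and start the geo-rep session from one master node
[root@spitfire ]# gluster volume geo-replication master falcon::slave create push-pem
[root@spitfire ]# gluster volume geo-replication master falcon::slave start

# Simulate a node going down by stopping glusterd on one of the master peers
[root@harrier ]# service glusterd stop

# Attempt to stop the session from a node that is still up
[root@spitfire ]# gluster v geo master falcon::slave stop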

Actual results:
[root@spitfire ]# gluster v geo master falcon::slave stop
Stopping geo-replication session between master & falcon::slave has been successful

But when the node which was down comes back online,

[root@spitfire ]# gluster v geo master falcon::slave status
NODE                       MASTER    SLAVE            HEALTH     UPTIME         
----------------------------------------------------------------------------
spitfire.blr.redhat.com    master    falcon::slave    Stopped    N/A            
mustang.blr.redhat.com     master    falcon::slave    Stopped    N/A            
harrier.blr.redhat.com     master    falcon::slave    Stable     01:52:26       
typhoon.blr.redhat.com     master    falcon::slave    Stopped    N/A            



Expected results:
geo-rep stop should fail, or at least warn, when a node in the master volume is down.

Additional info:

Comment 1 M S Vishwanath Bhat 2013-09-07 12:22:39 UTC
*** Bug 1005477 has been marked as a duplicate of this bug. ***

Comment 3 M S Vishwanath Bhat 2013-09-07 14:00:23 UTC
The workaround is to run geo-rep stop force once the node comes back online.
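
In command form (a sketch using the session names from this report):

[root@spitfire ]# gluster volume geo-replication master falcon::slave stop force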

Comment 5 Gowrishankar Rajaiyan 2013-10-08 08:42:03 UTC
Please update the 'Fixed In Version' field.

Comment 6 M S Vishwanath Bhat 2013-10-18 06:31:30 UTC
Fixed now.

Tested in version: glusterfs-3.4.0.35rhs-1.el6rhs.x86_64

When the node is down, 

[root@spitfire ]# gluster v geo master falcon::slave stop
Peer harrier, which is a part of master volume, is down. Please bring up the peer and retry.
geo-replication command failed


And when the peer is back online,

[root@spitfire ]# gluster v geo master falcon::slave stop
Stopping geo-replication session between master & falcon::slave has been successful

[root@spitfire ]# gluster v geo master falcon::slave status
NODE                       MASTER    SLAVE            HEALTH     UPTIME       
--------------------------------------------------------------------------
spitfire.blr.redhat.com    master    falcon::slave    Stopped    N/A          
typhoon.blr.redhat.com     master    falcon::slave    Stopped    N/A          
mustang.blr.redhat.com     master    falcon::slave    Stopped    N/A          
harrier.blr.redhat.com     master    falcon::slave    Stopped    N/A

Moving it to Verified.

Comment 8 errata-xmlrpc 2013-11-27 15:37:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html

