Bug 1005478 - Dist-geo-rep: 'geo-rep stop' should fail when there is a node down
Summary: Dist-geo-rep: 'geo-rep stop' should fail when there is a node down
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Avra Sengupta
QA Contact: M S Vishwanath Bhat
URL:
Whiteboard:
Duplicates: 1005477
Depends On:
Blocks: 1006177
 
Reported: 2013-09-07 12:18 UTC by M S Vishwanath Bhat
Modified: 2016-06-01 01:56 UTC
CC List: 9 users

Fixed In Version: glusterfs-3.4.0.34rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1006177
Environment:
Last Closed: 2013-11-27 15:37:06 UTC
Target Upstream Version:


Attachments: none


Links:
  System ID: Red Hat Product Errata RHBA-2013:1769
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: Red Hat Storage 2.1 enhancement and bug fix update #1
  Last Updated: 2013-11-27 20:17:39 UTC

Description M S Vishwanath Bhat 2013-09-07 12:18:38 UTC
Description of problem:
Currently, geo-rep stop succeeds even when a node (or glusterd on a node) is down. Running geo-rep stop stops the gsyncd processes on the nodes that are up, but when the node that was down comes back online, gsyncd on that node is still running. geo-rep stop force then has to be run again to stop it.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.32rhs-1.el6rhs.x86_64

How reproducible:
Always


Steps to Reproduce:
1. Create and start a geo-rep session between 2 clusters
2. Now bring down a node (or kill glusterd in that node)
3. Run geo-rep stop on the master node.
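
For reference, the steps above map roughly to the following commands. This is a sketch using the volume and host names taken from the output below (master volume "master", slave "falcon::slave", master peers spitfire and harrier); it assumes the usual geo-rep prerequisites (such as passwordless SSH between the clusters) are already in place.

# Create and start the geo-rep session from one master node
[root@spitfire ]# gluster volume geo-replication master falcon::slave create push-pem
[root@spitfire ]# gluster volume geo-replication master falcon::slave start

# Simulate a node going down by stopping glusterd on one of the master peers
[root@harrier ]# service glusterd stop

# Attempt to stop the session from a node that is still up
[root@spitfire ]# gluster v geo master falcon::slave stop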

Actual results:
[root@spitfire ]# gluster v geo master falcon::slave stop
Stopping geo-replication session between master & falcon::slave has been successful

But when the node which was down comes back online,

[root@spitfire ]# gluster v geo master falcon::slave status
NODE                       MASTER    SLAVE            HEALTH     UPTIME         
----------------------------------------------------------------------------
spitfire.blr.redhat.com    master    falcon::slave    Stopped    N/A            
mustang.blr.redhat.com     master    falcon::slave    Stopped    N/A            
harrier.blr.redhat.com     master    falcon::slave    Stable     01:52:26       
typhoon.blr.redhat.com     master    falcon::slave    Stopped    N/A            



Expected results:
geo-rep stop should fail, or at least warn, when a node in the master volume is down.

Additional info:

Comment 1 M S Vishwanath Bhat 2013-09-07 12:22:39 UTC
*** Bug 1005477 has been marked as a duplicate of this bug. ***

Comment 3 M S Vishwanath Bhat 2013-09-07 14:00:23 UTC
The workaround is to run geo-rep stop force once the node comes back online.
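
In command form (a sketch using the session names from this report):

[root@spitfire ]# gluster volume geo-replication master falcon::slave stop force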

Comment 5 Gowrishankar Rajaiyan 2013-10-08 08:42:03 UTC
Please update the 'Fixed In Version' field.

Comment 6 M S Vishwanath Bhat 2013-10-18 06:31:30 UTC
Fixed now.

Tested in version: glusterfs-3.4.0.35rhs-1.el6rhs.x86_64

When the node is down, 

[root@spitfire ]# gluster v geo master falcon::slave stop
Peer harrier, which is a part of master volume, is down. Please bring up the peer and retry.
geo-replication command failed


And when the peer is back online,

[root@spitfire ]# gluster v geo master falcon::slave stop
Stopping geo-replication session between master & falcon::slave has been successful

[root@spitfire ]# gluster v geo master falcon::slave status
NODE                       MASTER    SLAVE            HEALTH     UPTIME       
--------------------------------------------------------------------------
spitfire.blr.redhat.com    master    falcon::slave    Stopped    N/A          
typhoon.blr.redhat.com     master    falcon::slave    Stopped    N/A          
mustang.blr.redhat.com     master    falcon::slave    Stopped    N/A          
harrier.blr.redhat.com     master    falcon::slave    Stopped    N/A

Moving it to Verified.

Comment 8 errata-xmlrpc 2013-11-27 15:37:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html

