Bug 1499159

Summary: [geo-rep]: Improve the output message to reflect the real failure with schedule_georep script
Product: [Community] GlusterFS
Reporter: Kotresh HR <khiremat>
Component: geo-replication
Assignee: Kotresh HR <khiremat>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Version: mainline
CC: avishwan, bugs, csaba, rhinduja, rhs-bugs, storage-qa-internal
Hardware: x86_64
OS: Linux
Fixed In Version: glusterfs-3.13.0
Clone Of: 1475475
Last Closed: 2017-12-08 17:42:08 UTC
Type: Bug
Bug Depends On: 1475475
Bug Blocks: 1499392

Comment 1 Kotresh HR 2017-10-06 09:39:42 UTC
Description of problem:
=======================

Currently, if we manually check the geo-rep status or stop the session with an invalid slave host or slave volume, it throws the right warning:

[root@dhcp42-79 MASTER]# gluster volume geo-replication MASTER 10.70.41.209::SLAV status 
No active geo-replication sessions between MASTER and 10.70.41.209::SLAV
[root@dhcp42-79 MASTER]# gluster volume geo-replication MASTER 10.70.41.209::SLAV stop
Geo-replication session between MASTER and 10.70.41.209::SLAV does not exist.
geo-replication command failed
[root@dhcp42-79 MASTER]#

But if the schedule_georep script is passed invalid slave host and volume information, it fails with "Commit failed on localhost":

[root@dhcp42-79 MASTER]# time python /usr/share/glusterfs/scripts/schedule_georep.py MASTER 10.70.41.29 SLAVE
[NOT OK] 
Commit failed on localhost. Please check the log file for more details.


The problem with the above output is that it gives no indication of whether something is down on the slave (gsyncd, slave volume) or whether wrong slave information was provided. Also, which log file should the user look into?

If the geo-replication stop/status command fails, the script should print the same messages that are printed when the command is executed manually.
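
For illustration, a minimal sketch of such a pre-check (hypothetical: the helper name session_exists and the argument handling are not from the actual patch, and it assumes Python 3.7+ for subprocess.run with capture_output). It simply reuses the gluster CLI's own error message:

import subprocess
import sys

def session_exists(mastervol, slavehost, slavevol):
    # Ask the gluster CLI for the session status; a non-existent session
    # makes the command fail with a descriptive message on stderr/stdout.
    slave = "%s::%s" % (slavehost, slavevol)
    cmd = ["gluster", "volume", "geo-replication", mastervol, slave, "status"]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # e.g. "No active geo-replication sessions between MASTER and ..."
        return False, (proc.stderr or proc.stdout).strip()
    return True, ""

if __name__ == "__main__":
    ok, msg = session_exists(sys.argv[1], sys.argv[2], sys.argv[3])
    if not ok:
        # Surface the CLI's message instead of "Commit failed on localhost"
        print("[NOT OK] %s" % msg)
        sys.exit(1)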

Version-Release number of selected component (if applicable):
=============================================================

mainline

Comment 2 Worker Ant 2017-10-06 09:42:17 UTC
REVIEW: https://review.gluster.org/18442 (geo-rep/scheduler: Add validation for session existence) posted (#1) for review on master by Kotresh HR (khiremat)

Comment 3 Worker Ant 2017-10-07 03:16:29 UTC
COMMIT: https://review.gluster.org/18442 committed in master by Kotresh HR (khiremat) 
------
commit 938addeb7ec634e431c2c8c0a768a2a9ed056c0d
Author: Kotresh HR <khiremat>
Date:   Fri Oct 6 05:33:31 2017 -0400

    geo-rep/scheduler: Add validation for session existence
    
    Added validation to check for session existence
    so that a proper error message is given out.
    
    Change-Id: I13c5f6ef29c1395cff092a14e1bd2c197a39f058
    BUG: 1499159
    Signed-off-by: Kotresh HR <khiremat>
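
With a validation like the one sketched above, a run against a non-existent session would be expected to fail immediately with the CLI's own message rather than the generic commit error, along these lines (illustrative output, not captured from the patched script):

[root@dhcp42-79 MASTER]# python /usr/share/glusterfs/scripts/schedule_georep.py MASTER 10.70.41.29 SLAVE
[NOT OK] Geo-replication session between MASTER and 10.70.41.29::SLAVE does not exist.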

Comment 4 Shyamsundar 2017-12-08 17:42:08 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/