Bug 1499392 - [geo-rep]: Improve the output message to reflect the real failure with schedule_georep script
Summary: [geo-rep]: Improve the output message to reflect the real failure with schedule_georep script
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.12
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kotresh HR
QA Contact:
URL:
Whiteboard:
Depends On: 1475475 1499159
Blocks:
 
Reported: 2017-10-07 03:22 UTC by Kotresh HR
Modified: 2017-10-13 12:47 UTC (History)
6 users (show)

Fixed In Version: glusterfs-3.12.2
Clone Of: 1499159
Environment:
Last Closed: 2017-10-13 12:47:15 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Kotresh HR 2017-10-07 03:22:47 UTC
+++ This bug was initially created as a clone of Bug #1499159 +++

+++ This bug was initially created as a clone of Bug #1475475 +++

Description of problem:
=======================

Currently, if we manually check the geo-rep status or stop it with an invalid slave host or slave volume, it throws the right warning:

[root@dhcp42-79 MASTER]# gluster volume geo-replication MASTER 10.70.41.209::SLAV status 
No active geo-replication sessions between MASTER and 10.70.41.209::SLAV
[root@dhcp42-79 MASTER]# gluster volume geo-replication MASTER 10.70.41.209::SLAV stop
Geo-replication session between MASTER and 10.70.41.209::SLAV does not exist.
geo-replication command failed
[root@dhcp42-79 MASTER]#

But if the schedule_georep script is passed invalid slave host and volume information, it fails with "Commit failed on localhost":

[root@dhcp42-79 MASTER]# time python /usr/share/glusterfs/scripts/schedule_georep.py MASTER 10.70.41.29 SLAVE
[NOT OK] 
Commit failed on localhost. Please check the log file for more details.


The problem with the above output is that it does not indicate whether something is down on the slave (gsyncd, slave volume) or whether wrong slave information was provided. Also, which logs should the user look into?

If geo-replication stop/status has failed, the script should print the same messages that the commands print when executed manually.
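
For illustration, here is a minimal sketch of the kind of session-existence check the script could perform before issuing stop/status, assuming it simply re-runs the gluster CLI and surfaces that command's own message. The helper name, exit-code handling, and argument parsing here are assumptions for the sketch, not the actual patch:

# Illustrative sketch only; the real schedule_georep.py helpers differ.
import subprocess
import sys


def session_exists(mastervol, slavehost, slavevol):
    """Return (True, "") if the geo-rep session exists, else (False, reason)."""
    slave = "%s::%s" % (slavehost, slavevol)
    cmd = ["gluster", "volume", "geo-replication", mastervol, slave, "status"]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    out, err = proc.communicate()
    msg = (out + err).strip()
    if proc.returncode != 0 or "No active geo-replication sessions" in msg:
        # Hand back gluster's own message (e.g. "Geo-replication session
        # between MASTER and HOST::VOL does not exist.") instead of the
        # generic "Commit failed on localhost".
        return False, msg
    return True, ""


if __name__ == "__main__":
    master, shost, svol = sys.argv[1:4]
    ok, reason = session_exists(master, shost, svol)
    if not ok:
        sys.stderr.write("[NOT OK] %s\n" % reason)
        sys.exit(1)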

Version-Release number of selected component (if applicable):
=============================================================

glusterfs-geo-replication-3.8.4-35.el7rhgs.x86_64

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-07-26 14:31:23 EDT ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.3.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-07-28 05:02:12 EDT ---

This BZ, having been considered and subsequently not approved to be fixed in the RHGS 3.3.0 release, is being proposed for the next minor release of RHGS.

--- Additional comment from Kotresh HR on 2017-10-06 05:39:42 EDT ---


Description of problem: identical to the description above.

Version-Release number of selected component (if applicable):
=============================================================

mainline

--- Additional comment from Worker Ant on 2017-10-06 05:42:17 EDT ---

REVIEW: https://review.gluster.org/18442 (geo-rep/scheduler: Add validation for session existence) posted (#1) for review on master by Kotresh HR (khiremat)

--- Additional comment from Worker Ant on 2017-10-06 23:16:29 EDT ---

COMMIT: https://review.gluster.org/18442 committed in master by Kotresh HR (khiremat) 
------
commit 938addeb7ec634e431c2c8c0a768a2a9ed056c0d
Author: Kotresh HR <khiremat>
Date:   Fri Oct 6 05:33:31 2017 -0400

    geo-rep/scheduler: Add validation for session existence
    
    Added validation to check for session existence
    so that a proper error message is given out.
    
    Change-Id: I13c5f6ef29c1395cff092a14e1bd2c197a39f058
    BUG: 1499159
    Signed-off-by: Kotresh HR <khiremat>

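For context, a hypothetical sketch of how such a validation might be wired into the scheduler's startup path. session_exists refers to the illustrative helper sketched in the description above, and output_notok is an assumed stand-in for the script's "[NOT OK]" reporting; neither is taken from the actual patch:

# Hypothetical wiring only; the real change under review 18442 may differ.
import sys


def output_notok(msg):
    """Assumed stand-in for the script's "[NOT OK]" reporting helper."""
    sys.stderr.write("[NOT OK] %s\n" % msg)


def main(args):
    # session_exists is the illustrative helper sketched in the description.
    ok, reason = session_exists(args.mastervol, args.slavehost, args.slavevol)
    if not ok:
        # Fail fast with gluster's own message instead of attempting
        # stop/start and then reporting "Commit failed on localhost".
        output_notok(reason)
        return 1
    # ... continue with the existing schedule/checkpoint loop ...
    return 0
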
Comment 1 Worker Ant 2017-10-07 03:24:56 UTC
REVIEW: https://review.gluster.org/18446 (geo-rep/scheduler: Add validation for session existence) posted (#1) for review on release-3.12 by Kotresh HR (khiremat)

Comment 2 Worker Ant 2017-10-12 18:44:44 UTC
COMMIT: https://review.gluster.org/18446 committed in release-3.12 by jiffin tony Thottan (jthottan) 
------
commit 1e425a3aea627f902434ca5b8252ee64cfa32c3d
Author: Kotresh HR <khiremat>
Date:   Fri Oct 6 05:33:31 2017 -0400

    geo-rep/scheduler: Add validation for session existence
    
    Added validation to check for session existence
    so that a proper error message is given out.
    
    > Change-Id: I13c5f6ef29c1395cff092a14e1bd2c197a39f058
    > BUG: 1499159
    > Signed-off-by: Kotresh HR <khiremat>
    (cherry picked from commit 938addeb7ec634e431c2c8c0a768a2a9ed056c0d)
    
    
    Change-Id: I13c5f6ef29c1395cff092a14e1bd2c197a39f058
    BUG: 1499392
    Signed-off-by: Kotresh HR <khiremat>

Comment 3 Jiffin 2017-10-13 12:47:15 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.2, please open a new bug report.

glusterfs-3.12.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-October/032684.html
[2] https://www.gluster.org/pipermail/gluster-users/

