1464352 – volume section is not ignoring errors by default while triggering rebalance

Summary: volume section is not ignoring errors by default while triggering rebalance

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	gdeploy
Sub Component:
Version:	rhgs-3.3
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	RHGS 3.3.0
Assignee:	Sachidananda Urs
QA Contact:	SATHEESARAN
Docs Contact:
URL:
Whiteboard:	3.3.0-devel-freeze-exception
Depends On:
Blocks:	1417151

Reported:	2017-06-23 08:08 UTC by SATHEESARAN
Modified:	2017-09-21 04:49 UTC
CC List:	6 users
Fixed In Version:	gdeploy-2.0.2-12
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-09-21 04:49:50 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2017:2777	0	normal	SHIPPED_LIVE	gdeploy bug fix and enhancement update for RHEL7	2017-09-21 08:23:08 UTC

Description SATHEESARAN 2017-06-23 08:08:10 UTC

Description of problem:
-----------------------
By default, every section is expected to ignore errors. If the user does **not** wish to proceed with gdeploy execution after a failure, they have to add the line ignore_*_errors=no

However, the [volume] section now stops execution when it encounters a failure, which is contrary to that expectation.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
gdeploy-2.0.2-11.el7rhgs

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Trigger rebalance on a volume

Actual results:
---------------
The script tries to start the volume. Because the volume is already started, the start fails, gdeploy does not proceed further, and rebalance is not triggered.

Expected results:
-----------------
Unless ignore_volume_errors=no is specified, the volume start failure should be ignored and rebalance should be triggered.


Additional info:
----------------

Here is the exact config file:
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start

This was working well with gdeploy-2.0.1-13.el7rhgs

Comment 2 Sachidananda Urs 2017-06-23 09:58:03 UTC

(In reply to SATHEESARAN from comment #0)
> Description of problem:
> -----------------------
> By default, every section is expected to ignore errors. If the user does
> **not** wish to proceed with gdeploy execution after a failure, they have
> to add the line ignore_*_errors=no


sas, the default is not to ignore errors: gdeploy stops as soon as it
encounters an error. We changed to this behavior after a debate.

Ref: https://github.com/gluster/gdeploy/blob/master/gdeploylib/helpers.py#L467

            # Exit gdeploy in case of errors and user has explicitly set
            # not to ignore errors
            if retcode != 0 and Global.ignore_errors != 'yes':
                self.cleanup_and_quit(1)


However, the scenario in this bug is a special case, and we will handle it.
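
To illustrate how the quoted check produces the behaviour reported here, below is a minimal standalone sketch. It is not gdeploy code: `should_abort`, `run_steps`, and the `(name, retcode)` step list are hypothetical names introduced only for this illustration, and only the condition itself is taken from the helpers.py snippet above.

```python
def should_abort(retcode, ignore_errors):
    """Same condition as the quoted check: abort on a non-zero return
    code unless ignore errors is set to 'yes'."""
    return retcode != 0 and ignore_errors != 'yes'


def run_steps(steps, ignore_errors='no'):
    """Walk (name, retcode) pairs in order and stop at the first fatal failure.

    `steps` is a hypothetical stand-in for the commands gdeploy would run.
    Returns the names of the steps that were attempted.
    """
    attempted = []
    for name, retcode in steps:
        attempted.append(name)
        if should_abort(retcode, ignore_errors):
            break
    return attempted


# Scenario from this bug: the volume is already started, so the start
# step returns non-zero before the rebalance step is reached.
steps = [('volume start', 1), ('rebalance start', 0)]

print(run_steps(steps, ignore_errors='no'))   # ['volume start']
print(run_steps(steps, ignore_errors='yes'))  # ['volume start', 'rebalance start']
```

With errors not ignored, the sketch stops after the failed start, which matches the "Actual results" in the description.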

Comment 3 Sachidananda Urs 2017-06-23 10:18:17 UTC

sas, for now we have two ways to bypass this:

1. set force=yes in the config file.

[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
force=yes

2. Set ignore_volume_errors to yes.

[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
ignore_volume_errors=yes
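
As a quick way to check whether a given config file already carries one of these two workarounds, here is a small sketch using Python's standard configparser. This is an assumption-laden illustration, not gdeploy's own parser: the function name `tolerates_start_failure` is hypothetical, and `allow_no_value=True` is used only so that the bare host line under [hosts] parses.

```python
import configparser

# Config from workaround 2 above.
CONFIG = """\
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
ignore_volume_errors=yes
"""


def tolerates_start_failure(text):
    """Return True if the [volume] section sets force=yes or
    ignore_volume_errors=yes (hypothetical helper, not part of gdeploy)."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(text)
    volume = parser['volume']
    return (volume.get('force', fallback='no') == 'yes'
            or volume.get('ignore_volume_errors', fallback='no') == 'yes')


print(tolerates_start_failure(CONFIG))  # True
```

The original config from the description, which sets neither key, would make this helper return False.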

Comment 4 SATHEESARAN 2017-06-23 10:20:39 UTC

(In reply to Sachidananda Urs from comment #3)
> sas, for now we have two ways to bypass this:
> 
> 1. set force=yes in the config file.
> 
> [hosts]
> host1.example.com
> 
> [volume]
> action=rebalance
> volname=vol1
> state=start
> force=yes
> 
> 2. Set ignore_volume_errors to yes.
> 
> [hosts]
> host1.example.com
> 
> [volume]
> action=rebalance
> volname=vol1
> state=start
> ignore_volume_errors=yes

Thanks Sac, I have tested both and both worked well.

Comment 5 Sachidananda Urs 2017-06-23 10:37:28 UTC

Commit: https://github.com/gluster/gdeploy/commit/79dd754358 fixes the issue

Comment 8 SATHEESARAN 2017-07-10 13:52:22 UTC

Tested with gdeploy-2.0.2-12.el7rhgs.

Performed rebalance after an add-brick operation using the following conf file:


[hosts]
host1.example.com
host2.example.com
host3.example.com

[volume1]
action=add-brick
volname=vmstore
bricks=/gluster/brick2/b2

[volume2]
action=rebalance
state=start
volname=vmstore

Rebalance was triggered on the volume successfully.

Comment 10 errata-xmlrpc 2017-09-21 04:49:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2777
