Bug 1464352 - volume section is not ignoring errors by default while triggering rebalance
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gdeploy
Version: 3.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.3.0
Assigned To: Sachidananda Urs
SATHEESARAN
3.3.0-devel-freeze-exception
Depends On:
Blocks: 1417151
Reported: 2017-06-23 04:08 EDT by SATHEESARAN
Modified: 2017-09-21 00:49 EDT
Fixed In Version: gdeploy-2.0.2-12
Last Closed: 2017-09-21 00:49:50 EDT
Type: Bug


Attachments: None
Description SATHEESARAN 2017-06-23 04:08:10 EDT
Description of problem:
-----------------------
By default, every section ignores errors. If the user does **not** want gdeploy to continue after a failure, they have to add the line ignore_*_errors=no

But the [volume] section now stops execution when it encounters a failure, which goes against this expectation.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
gdeploy-2.0.2-11.el7rhgs

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Trigger rebalance on a volume

Actual results:
---------------
The script tries to start the volume. As the volume is already started, this step fails, so gdeploy does not proceed further and does not trigger the rebalance.

Expected results:
-----------------
As long as ignore_volume_errors=no is not set, the volume start failure should be ignored and the rebalance should be triggered.


Additional info:
----------------

Here is the exact config file:
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start

This was working well with gdeploy-2.0.1-13.el7rhgs
Comment 2 Sachidananda Urs 2017-06-23 05:58:03 EDT
(In reply to SATHEESARAN from comment #0)
> Description of problem:
> -----------------------
> By default, every section ignores errors. If the user does **not** want
> gdeploy to continue after a failure, they have to add the line
> ignore_*_errors=no


sas, the default is not to ignore errors: gdeploy stops as soon as it
encounters an error. We changed to this behavior after a debate.

Ref: https://github.com/gluster/gdeploy/blob/master/gdeploylib/helpers.py#L467

            # Exit gdeploy in case of errors and user has explicitly set
            # not to ignore errors
            if retcode != 0 and Global.ignore_errors != 'yes':
                self.cleanup_and_quit(1)


However, the scenario in this bug is a special case, and we will handle it.
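
For anyone who wants to try the check quoted above outside gdeploy, here is a minimal standalone sketch. should_abort is a hypothetical helper, not a gdeploy function; it only mirrors the quoted condition and assumes the flag stays unset unless the user sets it.

```python
def should_abort(retcode, ignore_errors=None):
    """Mirror of the condition quoted from helpers.py.

    Hypothetical helper for illustration only; not part of gdeploy.
    Returns True when a step failed (non-zero return code) and the
    ignore flag was not explicitly set to 'yes'.
    """
    return retcode != 0 and ignore_errors != 'yes'


# Flag left unset: a failing step stops the run.
print(should_abort(1))          # True
# Flag explicitly set to 'yes': the failure is ignored.
print(should_abort(1, 'yes'))   # False
# A successful step never stops the run.
print(should_abort(0))          # False
```

Only the exact string 'yes' suppresses the exit in this sketch, which matches the quoted comparison: an unset flag or 'no' both lead to an exit on failure.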
Comment 3 Sachidananda Urs 2017-06-23 06:18:17 EDT
sas, for now we have two ways to bypass this:

1. set force=yes in the config file.

[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
force=yes

2. Set the ignore errors to yes.

[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
ignore_volume_errors=yes
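
To double-check which workaround keys a given config actually carries, a rough sketch with Python's standard configparser could look like the following. This is not gdeploy's own parsing code: read_volume_flags is a hypothetical helper, and treating an absent key as "no" is an assumption made only for illustration.

```python
import configparser

# Second workaround from this comment, embedded as a string.
WORKAROUND_2 = """\
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
ignore_volume_errors=yes
"""


def read_volume_flags(text):
    """Return the two workaround keys from the [volume] section.

    Hypothetical helper; absent keys fall back to 'no' (an assumption).
    """
    # allow_no_value lets the bare host names under [hosts] parse as keys.
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(text)
    volume = parser["volume"]
    return {
        "force": volume.get("force", fallback="no"),
        "ignore_volume_errors": volume.get("ignore_volume_errors",
                                           fallback="no"),
    }


flags = read_volume_flags(WORKAROUND_2)
print(flags)
```

With the config above, only ignore_volume_errors comes back as "yes"; with the first workaround, force would be "yes" instead.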
Comment 4 SATHEESARAN 2017-06-23 06:20:39 EDT
(In reply to Sachidananda Urs from comment #3)
> sas, for now we have two ways to bypass this:
> 
> 1. set force=yes in the config file.
> 
> [hosts]
> host1.example.com
> 
> [volume]
> action=rebalance
> volname=vol1
> state=start
> force=yes
> 
> 2. Set the ignore errors to yes.
> 
> [hosts]
> host1.example.com
> 
> [volume]
> action=rebalance
> volname=vol1
> state=start
> ignore_volume_errors=yes

Thanks Sac, I have tested both and both worked well.
Comment 5 Sachidananda Urs 2017-06-23 06:37:28 EDT
Commit: https://github.com/gluster/gdeploy/commit/79dd754358 fixes the issue
Comment 8 SATHEESARAN 2017-07-10 09:52:22 EDT
Tested with gdeploy-2.0.2-12.el7rhgs.

Performed a rebalance after the add-brick operation using the following conf file:


[hosts]
host1.example.com
host2.example.com
host3.example.com

[volume1]
action=add-brick
volname=vmstore
bricks=/gluster/brick2/b2

[volume2]
action=rebalance
state=start
volname=vmstore

Rebalance was triggered on the volume successfully
Comment 10 errata-xmlrpc 2017-09-21 00:49:50 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2777
