Bug 1464352 - volume section is not ignoring errors by default while triggering rebalance
Summary: volume section is not ignoring errors by default while triggering rebalance
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gdeploy
Version: rhgs-3.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Sachidananda Urs
QA Contact: SATHEESARAN
URL:
Whiteboard: 3.3.0-devel-freeze-exception
Depends On:
Blocks: 1417151
Reported: 2017-06-23 08:08 UTC by SATHEESARAN
Modified: 2017-09-21 04:49 UTC (History)

Fixed In Version: gdeploy-2.0.2-12
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-21 04:49:50 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:2777 | 0 | normal | SHIPPED_LIVE | gdeploy bug fix and enhancement update for RHEL7 | 2017-09-21 08:23:08 UTC

Description SATHEESARAN 2017-06-23 08:08:10 UTC
Description of problem:
-----------------------
The default behaviour of any section is to ignore errors. If the user wishes **not** to proceed with gdeploy execution after a failure, he has to add the line ignore_*_errors=no

But the [volume] section now stops execution when it encounters a failure, which is contrary to expectation.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
gdeploy-2.0.2-11.el7rhgs

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Trigger rebalance on a volume

Actual results:
---------------
The script tries to start the volume. Because the volume is already started, that step fails, gdeploy does not proceed further, and rebalance is not triggered.

Expected results:
-----------------
As long as ignore_volume_errors=no is not set, the volume start failure should be ignored and rebalance should be triggered.


Additional info:
----------------

Here is the exact config file:
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start

This was working well with gdeploy-2.0.1-13.el7rhgs
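
To check whether a config like the one above sets the override key before running gdeploy, a minimal sketch follows. It uses Python's standard configparser as a stand-in and makes no claim about how gdeploy itself parses the file. The function name has_ignore_override is hypothetical, and the key pattern ignore_<section>_errors is taken from this report.

```python
import configparser

CONFIG = """\
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
"""

def has_ignore_override(config_text, section="volume"):
    """Return True if the section explicitly sets ignore_<section>_errors."""
    # allow_no_value=True lets bare host lines under [hosts] parse as keys
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(config_text)
    return parser.has_option(section, "ignore_%s_errors" % section)

# The config from this report does not set the key at all.
print(has_ignore_override(CONFIG))
```

With no explicit key, the behaviour depends entirely on the tool's default, which is the point in dispute in this bug.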

Comment 2 Sachidananda Urs 2017-06-23 09:58:03 UTC
(In reply to SATHEESARAN from comment #0)
> Description of problem:
> -----------------------
> The default behaviour of any section is to ignore errors. If the user
> wishes **not** to proceed with gdeploy execution after a failure, he has to
> add the line ignore_*_errors=no


sas, the default is not to ignore errors; gdeploy stops as soon as it
encounters an error. We changed to this behavior after a debate.

Ref: https://github.com/gluster/gdeploy/blob/master/gdeploylib/helpers.py#L467

            # Exit gdeploy in case of errors and user has explicitly set
            # not to ignore errors
            if retcode != 0 and Global.ignore_errors != 'yes':
                self.cleanup_and_quit(1)


However, the scenario in this bug is a special case, and we will handle it.
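
The quoted condition can be exercised in isolation. The sketch below copies only the boolean test from the snippet above; should_abort is a hypothetical name, and how the per-section ignore_volume_errors value reaches Global.ignore_errors is not shown in this report, so it is passed in as a plain argument here.

```python
def should_abort(retcode, ignore_errors=None):
    """Abort on a non-zero return code unless ignore_errors is exactly 'yes'."""
    return retcode != 0 and ignore_errors != 'yes'

# Walk through the cases: success, failure with no setting,
# failure with 'no', and failure with 'yes'.
for retcode, setting in [(0, None), (1, None), (1, 'no'), (1, 'yes')]:
    print(retcode, setting, should_abort(retcode, setting))
```

Only the literal value 'yes' suppresses the exit, so an unset key behaves the same as 'no'.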

Comment 3 Sachidananda Urs 2017-06-23 10:18:17 UTC
sas, for now we have two ways to bypass this:

1. set force=yes in the config file.

[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
force=yes

2. Set the ignore errors to yes.

[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
ignore_volume_errors=yes
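
For anyone generating these files from a script, either workaround amounts to adding one key to the [volume] section. A minimal sketch, again using Python's standard configparser rather than anything from gdeploy; with_volume_option is a hypothetical helper name.

```python
import configparser
import io

BASE_CONFIG = """\
[hosts]
host1.example.com

[volume]
action=rebalance
volname=vol1
state=start
"""

def with_volume_option(config_text, key, value):
    """Return config_text with key=value set in the [volume] section."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(config_text)
    parser.set("volume", key, value)
    out = io.StringIO()
    # space_around_delimiters=False keeps the key=value style used in this report
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

# Workaround 1: force=yes
print(with_volume_option(BASE_CONFIG, "force", "yes"))
# Workaround 2: ignore_volume_errors=yes
print(with_volume_option(BASE_CONFIG, "ignore_volume_errors", "yes"))
```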

Comment 4 SATHEESARAN 2017-06-23 10:20:39 UTC
(In reply to Sachidananda Urs from comment #3)
> sas, for now we have two ways to bypass this:
> 
> 1. set force=yes in the config file.
> 
> [hosts]
> host1.example.com
> 
> [volume]
> action=rebalance
> volname=vol1
> state=start
> force=yes
> 
> 2. Set the ignore errors to yes.
> 
> [hosts]
> host1.example.com
> 
> [volume]
> action=rebalance
> volname=vol1
> state=start
> ignore_volume_errors=yes

Thanks Sac, I have tested both and both worked well.

Comment 5 Sachidananda Urs 2017-06-23 10:37:28 UTC
Commit: https://github.com/gluster/gdeploy/commit/79dd754358 fixes the issue

Comment 8 SATHEESARAN 2017-07-10 13:52:22 UTC
Tested with gdeploy-2.0.2-12.el7rhgs.

Performed rebalance after an add-brick operation using the following conf file:


[hosts]
host1.example.com
host2.example.com
host3.example.com

[volume1]
action=add-brick
volname=vmstore
bricks=/gluster/brick2/b2

[volume2]
action=rebalance
state=start
volname=vmstore

Rebalance was triggered on the volume successfully
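
To confirm that a conf file lists add-brick before rebalance, the numbered volume sections can be read back in file order. This is only a sketch with Python's standard configparser; volume_actions is a hypothetical name, and the sketch assumes nothing about gdeploy's own section handling beyond the [volume1]/[volume2] naming shown above.

```python
import configparser
import re

CONFIG = """\
[hosts]
host1.example.com
host2.example.com
host3.example.com

[volume1]
action=add-brick
volname=vmstore
bricks=/gluster/brick2/b2

[volume2]
action=rebalance
state=start
volname=vmstore
"""

def volume_actions(config_text):
    """List (section, action) pairs for volume sections, in file order."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(config_text)
    return [(name, parser.get(name, "action"))
            for name in parser.sections()
            if re.fullmatch(r"volume\d*", name)]

print(volume_actions(CONFIG))
```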

Comment 10 errata-xmlrpc 2017-09-21 04:49:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2777

