Bug 1042808

Summary: [RHSC] After failure of a remove-brick task on a volume, attempts to start subsequent tasks on the volume fail.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Shruti Sampat <ssampat>
Component: rhsc
Assignee: Alok <asrivast>
Status: CLOSED CANTFIX
QA Contact: Anoop <annair>
Severity: high
Docs Contact:
Priority: medium
Version: 2.1
CC: anbabu, asriram, knarra, ltrilety, mmahoney, mmccune, rhs-bugs, sabose, sankarshan, sdharane
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rhsc-3.1.0-59
Doc Type: Known Issue
Doc Text:
When a remove-brick operation fails on a volume, the Red Hat Storage node does not allow any other operation on that volume. Workaround: Perform "commit" or "stop" on the failed remove-brick task before starting another task on the volume.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-01-29 15:14:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1042777, 1294033, 1351102
Bug Blocks: 1035040, 1202842

Description Shruti Sampat 2013-12-13 12:29:26 UTC
Description of problem:
------------------------

After remove-brick fails on a volume because brick processes were killed, all subsequent attempts to start further tasks on the volume, such as rebalance, fail.

Version-Release number of selected component (if applicable):
Red Hat Storage Console 2.1.2-0.27.beta.el6_5

How reproducible:
Always

Steps to Reproduce:
1. Start remove-brick on a volume. 
2. While remove-brick is in progress, kill a brick process.
The remove-brick icon gets updated as failed in the UI and the drop-down menu beside the icon has the button 'Status' enabled.
3. Now try to start rebalance on the same volume.

Actual results:
Starting rebalance fails, and the events log shows a message that rebalance could not be started.

Expected results:
Rebalance start should have been successful as there are no running tasks on the volume.

Additional info:

Comment 1 Dusmant 2013-12-13 12:40:36 UTC
This behaviour should be documented in the Administration Guide of RHSC.

Comment 2 Dusmant 2013-12-16 09:20:52 UTC
It should be documented along with the workaround of going to the gluster CLI and running the stop command, as given in BZ 1042777.
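
For illustration, the CLI workaround could be sketched as below. This is a minimal sketch, not taken from the bug report: it only builds the `gluster volume remove-brick ... stop|commit` and `gluster volume rebalance ... start` command lines and prints them, without running anything. The helper names (`remove_brick_cmd`, `rebalance_start_cmd`) and the volume and brick names are hypothetical placeholders.

```python
# Sketch only: builds (does not run) the gluster CLI commands for the workaround.
# The volume and brick names used below are hypothetical placeholders.
import shlex

ALLOWED_ACTIONS = ("stop", "commit", "status")


def remove_brick_cmd(volume, bricks, action):
    """Return the argv list for `gluster volume remove-brick <vol> <brick>... <action>`."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not bricks:
        raise ValueError("at least one brick is required")
    return ["gluster", "volume", "remove-brick", volume, *bricks, action]


def rebalance_start_cmd(volume):
    """Return the argv list for `gluster volume rebalance <vol> start`."""
    return ["gluster", "volume", "rebalance", volume, "start"]


if __name__ == "__main__":
    vol, brick = "vol1", "server1:/bricks/b1"  # hypothetical names
    # Clear the failed remove-brick task first, then start the next task.
    for cmd in (remove_brick_cmd(vol, [brick], "stop"), rebalance_start_cmd(vol)):
        print(shlex.join(cmd))
```

The printed commands would be run on a storage node. Whether to use "stop" or "commit" depends on whether the removal should be abandoned or finalized, which the sketch leaves to the administrator.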

Comment 3 Shalaka 2014-02-11 10:01:08 UTC
Please review the edited doc text and sign off.

Comment 4 Dusmant 2014-02-13 08:46:47 UTC
Made a slight correction in the doc-text. It looks fine now.

Comment 5 anmol babu 2015-04-24 06:59:07 UTC
So, the error returned by gluster is "A remove-brick task on volume remove-brick is not yet committed. Either commit or stop the remove-brick task."
But the frontend is not showing this error. So I think it's better to show the same error in the UI rather than us deciding whether to commit or stop the remove-brick operation.
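
As a rough illustration of the proposal above, a UI layer could prefer the backend's error text and fall back to a generic message only when none is available. This is a sketch under assumptions, not RHSC's actual code (which is not shown in this report): `message_for_user` and the wording of `GENERIC_MESSAGE` are hypothetical.

```python
# Sketch only: prefer the backend's error text over a generic UI message.
GENERIC_MESSAGE = "Could not start rebalance on the volume."  # hypothetical wording


def message_for_user(backend_error, generic=GENERIC_MESSAGE):
    """Return the backend error if it carries text, else the generic fallback."""
    text = (backend_error or "").strip()
    return text if text else generic


if __name__ == "__main__":
    # Error text as quoted from gluster in this comment.
    gluster_error = (
        "A remove-brick task on volume remove-brick is not yet committed. "
        "Either commit or stop the remove-brick task."
    )
    print(message_for_user(gluster_error))  # shows the gluster error verbatim
    print(message_for_user(None))           # falls back to the generic message
```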

Comment 6 Shruti Sampat 2015-04-24 07:11:52 UTC
(In reply to anmol babu from comment #5)
> So, the error returned by gluster is "A remove-brick task on volume
> remove-brick is not yet committed. Either commit or stop the remove-brick
> task."
> But the frontend is not showing this error. So I think it's better to show
> the same error in the UI rather than us deciding whether to commit or stop
> the remove-brick operation

Sounds good to me.

Comment 8 RamaKasturi 2015-05-22 07:18:13 UTC
Please provide the fixed in version.

Comment 9 anmol babu 2015-06-03 05:40:26 UTC
This was wrongly moved to ON_QA, so I have moved it back to Assigned.
The fix is in master but needs to be backported.

Comment 10 Sahina Bose 2015-06-03 05:42:33 UTC
Removing FailedQA as QE didn't verify this.

Comment 11 RamaKasturi 2015-06-23 05:51:43 UTC
I do not see the fix mentioned in comment 5. Has this been changed?

Comment 12 RamaKasturi 2015-06-23 06:33:50 UTC
Moving this back since the actual results do not match the expected ones.

Comment 14 Lubos Trilety 2015-06-30 13:36:20 UTC
BTW, a failed remove-brick task cannot be stopped, committed, or retained from the GUI, as all those actions are disabled. So just telling the user that rebalance cannot be run until a stop or commit is done doesn't help much.

Comment 17 Sahina Bose 2018-01-29 15:14:02 UTC
Thank you for your report. This bug is filed against a component for which no further new development is being undertaken.