Bug 1061725 - [RHSC] - Stopping remove-brick from the CLI after data migration is completed causes the remove-brick icon to change from ? (unknown) to commit pending when the status dialog is opened.
Summary: [RHSC] - stopping remove brick from CLI after data migration is completed cau...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhsc
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: Sahina Bose
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks: 1035040
Reported: 2014-02-05 13:51 UTC by RamaKasturi
Modified: 2015-05-13 17:01 UTC (History)

Fixed In Version: rhsc-3.0.0-0.7.master.el6_5
Doc Type: Bug Fix
Doc Text:
Previously if the status dialog box was open and simultaneously a remove-brick was stopped from CLI, the task was displayed as Commit Pending because the status dialog box would return the status as Completed. This resulted in an incorrect status message on the Console. With this fix, the Status Dialog box displays the correct status for a stop remove-brick operation.
Clone Of:
Environment:
Last Closed: 2014-09-22 19:07:24 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2014:1277 | 0 | normal | SHIPPED_LIVE | Red Hat Storage Console 3.0 enhancement and bug fix update | 2014-09-22 23:06:30 UTC
oVirt gerrit | 24151 | 0 | master | MERGED | engine: Update gluster task's info only if job in progress | Never
oVirt gerrit | 28094 | 0 | master | MERGED | engine: Fixed query to return task for volume | Never

Description RamaKasturi 2014-02-05 13:51:09 UTC
Description of problem:
Stopping a remove-brick from the CLI causes the "?" (unknown) icon to change to commit pending when the status dialog is opened.

Version-Release number of selected component (if applicable):
rhsc-2.1.2-0.35.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a distribute volume.
2. Start remove-brick on the volume.
3. Wait until the data migration is completed.
4. Go to any of the RHS nodes and stop remove-brick using the command "gluster volume remove-brick <volName> <brickname> stop".
5. The remove-brick icon in the activities column now changes to "?".
6. Click on the status option from the drop-down.
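
The CLI side of the steps above can be sketched in Python as follows. This is only a minimal sketch that builds the gluster command lines without running them. The helper name `remove_brick_cmd` is hypothetical, and the volume and brick values are placeholders, not taken from this report.

```python
from typing import List

# Sub-commands covered by this sketch.
VALID_ACTIONS = ("start", "status", "stop", "commit")


def remove_brick_cmd(volume: str, brick: str, action: str) -> List[str]:
    """Build the argv for `gluster volume remove-brick <vol> <brick> <action>`."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported remove-brick action: {action}")
    return ["gluster", "volume", "remove-brick", volume, brick, action]


# Placeholder volume and brick names (assumptions for illustration only).
for action in ("start", "status", "stop"):
    print(" ".join(remove_brick_cmd("vol1", "server1:/bricks/b1", action)))
```

To actually run a command, the returned list could be passed to `subprocess.run` on an RHS node.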


Actual results:
1. The remove-brick icon in the activities column changes back to the "commit pending" state.

2. Task in the tasks pane shows "Removing Bricks from Gluster Volume vol1 in Cluster test_1.( MIGRATION COMPLETE Files [scanned: 891, moved: 291, failed: 0, Total size moved: 26.49 GB)" with an "x" mark.

Expected results:
The remove-brick icon should remain as "?" (unknown) in the activities column, and the task in the tasks pane should show a "?" symbol with UNKNOWN.

Additional info:

Comment 1 RamaKasturi 2014-02-05 13:55:50 UTC
Attaching the sos reports for the same.

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1061725/

Comment 2 Sahina Bose 2014-02-06 09:27:51 UTC
Cause: Whenever the status is fetched from the UI, the task status is also updated to stay in sync with that of the dialog. In this case, the task status returns "Completed" and not "Aborted". However, since the job has been marked aborted, the message should not have been updated in the engine.
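
The guard described here (update the task's info only while the job is in progress) can be illustrated with a minimal Python sketch. The `Job` and `JobStatus` types and the `sync_job_from_dialog` function are hypothetical names for illustration, not the engine's actual classes.

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(Enum):
    STARTED = "STARTED"
    FINISHED = "FINISHED"
    ABORTED = "ABORTED"
    UNKNOWN = "UNKNOWN"


@dataclass
class Job:
    status: JobStatus
    message: str


def sync_job_from_dialog(job: Job, dialog_message: str) -> bool:
    """Copy the dialog's status text to the job only while it is still running.

    Returns True if the job was updated, False if it was left untouched.
    """
    if job.status is not JobStatus.STARTED:
        # The job was already aborted or finished: keep its recorded message.
        return False
    job.message = dialog_message
    return True
```

With this check, a job already marked aborted keeps its message even if the status dialog reports "Completed".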

Comment 3 Dusmant 2014-02-10 09:30:53 UTC
Added this bug to the tracker bug of Corbett -- as known issue.

Comment 4 Shalaka 2014-02-11 10:18:07 UTC
Please review the edited Doc Text and sign off.

Comment 5 Sahina Bose 2014-02-18 08:05:57 UTC
Doc text looks OK. But please note that "Stop remove brick" is a tech preview feature. You may want to add a note in the doc text or remove this from known issues.

Comment 6 Shalaka 2014-02-19 09:24:02 UTC
Updated the doc text as suggested in comment 5.

Comment 7 RamaKasturi 2014-05-24 08:47:12 UTC
I could see that the '?' does not change back to the commit pending icon.

But when I click on the status button from the drop-down, it says "Could not fetch remove brick status of volume : <vol_name>".

I could see two tasks getting listed in the tasks pane for the same operation.

Is this the expected behavior? Can you please confirm.

Comment 8 RamaKasturi 2014-05-24 09:45:01 UTC
Stopping remove-brick from the CLI on one volume and opening the remove-brick status dialog causes the remove-brick icon to appear in the activities column of all the other volumes present in the cluster.

Comment 9 RamaKasturi 2014-05-24 10:16:03 UTC
Apologies, the needinfo flag was unintentionally changed to '-' while modifying another flag. Turning it back on.

I could see that the '?' does not change back to the commit pending icon.

But when I click on the status button from the drop-down, it says "Could not fetch remove brick status of volume : <vol_name>".

Is this the expected behavior? Can you please confirm.

Comment 10 Sahina Bose 2014-05-26 06:45:36 UTC
Regarding comment 7 - seeing the task icon for all volumes - this is a bug and a regression introduced while fixing bug 1064295. The query that was used now associates the same task ID with all volumes.
This is fixed with patch - http://gerrit.ovirt.org/28094
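
The actual query is not shown in this report, so the following Python sketch only illustrates the class of defect described: a lookup that does not filter by volume and therefore returns the same task for every volume. The data and function names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical rows: each volume optionally references one async task.
VOLUMES: List[Dict[str, Optional[str]]] = [
    {"name": "vol1", "task_id": "task-a"},
    {"name": "vol2", "task_id": None},
]


def task_for_volume_unfiltered(volume_name: str) -> Optional[str]:
    """Faulty variant: ignores volume_name and returns the first task found."""
    for row in VOLUMES:
        if row["task_id"] is not None:
            return row["task_id"]
    return None


def task_for_volume(volume_name: str) -> Optional[str]:
    """Corrected variant: returns only the task attached to the given volume."""
    for row in VOLUMES:
        if row["name"] == volume_name:
            return row["task_id"]
    return None
```

In the unfiltered variant, `vol2` appears to own `task-a`, which would match the symptom of the icon showing up on every volume.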

Regarding comment 9 - "Could not fetch remove brick status" after stopping remove brick.
After stopping remove brick from CLI, fetching status from gluster CLI gives a message like this:
# gluster volume remove-brick dist-vol 10.70.43.53:/bricks/b3 status
volume remove-brick status: failed: remove-brick not started.

Is this what you observed as well?
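
As a rough illustration of how a caller might recognize this CLI response, here is a minimal Python sketch. It assumes the failure text looks exactly like the output quoted above; the function name is hypothetical and this is not how the engine or VDSM necessarily handles it.

```python
# Marker taken from the CLI output quoted in this comment.
NOT_STARTED_MARKER = "remove-brick not started"


def remove_brick_status_available(cli_output: str) -> bool:
    """Return False when the CLI reports that no remove-brick is running."""
    text = cli_output.strip().lower()
    return not ("failed" in text and NOT_STARTED_MARKER in text)


sample = "volume remove-brick status: failed: remove-brick not started."
print(remove_brick_status_available(sample))
```

A caller that receives False could show the task as unknown instead of treating the failure as a generic fetch error.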

Comment 11 RamaKasturi 2014-05-26 08:33:16 UTC
Yes Sahina, I did observe this.

This is because of the bug fix, https://bugzilla.redhat.com/show_bug.cgi?id=1089668

Comment 12 Sahina Bose 2014-05-26 09:31:10 UTC
Please open a separate BZ for the change in behaviour of status once rebalance/remove-brick is stopped. Will need to triage and see how to handle that.

Have fixed the regression issue reported in comment 7, so moving this to POST

Comment 14 RamaKasturi 2014-06-09 09:15:41 UTC
Verified and works fine with "rhsc-3.0.0-0.7.master.el6_5.noarch".

Once remove-brick is in the migration complete state, stopping it from the CLI causes the icon to change to "UNKNOWN" after 10 minutes.

The task in the tasks pane is updated as "Removing Bricks from Gluster Volume vol_dis_rep in Cluster cluster_test" with an 'x' mark, and when you drill down it says "Removing Bricks from Gluster Volume vol_dis_rep in Cluster cluster_test.( UNKNOWN  )" with a '?' symbol.

As per discussion with Sahina: "Introducing a minimum duration of 10 minutes before any task is cleared from engine database due to gluster CLI not returning task information".
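
The 10-minute minimum can be sketched in Python as below. This is a sketch under stated assumptions: the report does not say what the 10 minutes are measured from, so this example assumes they count from the last time the gluster CLI reported the task. All names are hypothetical.

```python
from datetime import datetime, timedelta

# Minimum time a task is kept after the CLI stops reporting it.
MIN_AGE_BEFORE_CLEANUP = timedelta(minutes=10)


def may_clear_task(last_seen_at: datetime, now: datetime, reported_by_cli: bool) -> bool:
    """Decide whether a task may be cleared from the engine database.

    Assumption: the grace period starts at `last_seen_at`, the last time the
    gluster CLI returned information for this task.
    """
    if reported_by_cli:
        # The CLI still knows about the task, so it must not be cleared.
        return False
    return now - last_seen_at >= MIN_AGE_BEFORE_CLEANUP
```

Under this assumption, a task that disappeared from the CLI output 5 minutes ago is kept, while one that disappeared 10 or more minutes ago may be cleared.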

Comment 15 Pavithra 2014-07-10 08:51:40 UTC
Hi Sahina,

Please review the edited doc text for technical accuracy and sign off.

Comment 16 Sahina Bose 2014-07-11 14:32:28 UTC
+1, Looks good.

Comment 18 errata-xmlrpc 2014-09-22 19:07:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1277.html

Comment 19 Shruti Sampat 2014-09-23 06:43:30 UTC
To add to what is already said in comment #14: stopping remove-brick from the CLI, regardless of whether data migration is complete or in progress, causes the UI icon to change to unknown after 10 minutes.

