Description of problem:
Stopping remove-brick from the CLI causes the "?" (unknown) icon to change to commit pending when the status dialog is opened.

Version-Release number of selected component (if applicable):
rhsc-2.1.2-0.35.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a distribute volume.
2. Start remove brick on the volume.
3. Wait until the data migration is completed.
4. Go to any of the RHS nodes and stop remove brick with the command "gluster volume remove-brick <volName> <brickname> stop".
5. The remove brick icon in the activities column changes to "?".
6. Click on the status option from the drop-down.

Actual results:
1. The remove brick icon in the activities column changes back to the "commit pending" state.
2. The task in the tasks pane shows "Removing Bricks from Gluster Volume vol1 in Cluster test_1.( MIGRATION COMPLETE Files [scanned: 891, moved: 291, failed: 0, Total size moved: 26.49 GB)" with an "x" mark.

Expected results:
The remove brick icon should remain as "?" (unknown) in the activities column, and the task in the tasks pane should show a "?" symbol with UNKNOWN.

Additional info:
Attaching the sos reports for the same. http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1061725/
Cause: Whenever we get the status from the UI, the task status is also updated to stay in sync with the status shown in the dialog. In this case, the task status returned is "Completed", not "Aborted". However, since the job has already been marked aborted, the message should not have been updated in the engine.
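A minimal sketch of the guard implied by the cause above, written in Python for illustration only. The names Job, ABORTED, FINISHED and sync_job_from_dialog are hypothetical and are not taken from the engine code; the sketch only shows the idea that a job already marked aborted should not be overwritten by the status fetched for the dialog.

```python
from dataclasses import dataclass

# Hypothetical status names; the real engine's values may differ.
ABORTED = "ABORTED"
FINISHED = "FINISHED"


@dataclass
class Job:
    status: str
    message: str


def sync_job_from_dialog(job: Job, reported_status: str, reported_message: str) -> Job:
    """Return the job state after a status-dialog refresh.

    A job that is already marked aborted is returned unchanged, so a
    "Completed" status fetched for the dialog cannot overwrite it.
    """
    if job.status == ABORTED:
        # Guard: keep the aborted job's status and message as they are.
        return job
    # Otherwise keep the job in sync with what the dialog reported.
    return Job(status=reported_status, message=reported_message)
```

With this guard, refreshing the dialog for a stopped remove-brick would leave the job's message untouched, while a job still in progress would continue to follow the reported status.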
Added this bug to the Corbett tracker bug as a known issue.
Please review the edited Doc Text and sign off.
Doc text looks OK. But please note that "Stop remove brick" is a tech preview feature. You may want to add a note in the doc text or remove this from the known issues.
Updated the doc text as suggested in comment 5.
I can see that the '?' does not change back to the commit pending icon. But when I click on the status button from the drop-down, it says "Could not fetch remove brick status of volume : <vol_name>". I can also see two tasks listed in the tasks pane for the same operation. Is this the expected behavior? Can you please confirm?
Stopping remove brick from the CLI on one volume and then opening the remove-brick status dialog causes the remove-brick icon to appear in the activities column of all the other volumes present in the cluster.
Apologies: the needinfo flag was unintentionally changed to '-' while modifying another flag. Turning it back on. I can see that the '?' does not change back to the commit pending icon. But when I click on the status button from the drop-down, it says "Could not fetch remove brick status of volume : <vol_name>". Is this the expected behavior? Can you please confirm?
Regarding comment 7 (seeing the task icon for all volumes): this is a bug, and a regression introduced when fixing bug 1064295. The query that was used now associates the same task ID with all volumes. This is fixed with patch http://gerrit.ovirt.org/28094

Regarding comment 9 ("Could not fetch remove brick status" after stopping remove brick): after stopping remove brick from the CLI, fetching the status from the gluster CLI gives a message like this:

# gluster volume remove-brick dist-vol 10.70.43.53:/bricks/b3 status
volume remove-brick status: failed: remove-brick not started.

Is this what you observed as well?
Yes Sahina, I did observe this. This is because of the bug fix https://bugzilla.redhat.com/show_bug.cgi?id=1089668
Please open a separate BZ for the change in behaviour of status once rebalance/remove-brick is stopped. Will need to triage and see how to handle that. Have fixed the regression issue reported in comment 7, so moving this to POST
Verified; works fine with "rhsc-3.0.0-0.7.master.el6_5.noarch". Once remove-brick is in the migration complete state, stopping it from the CLI causes the icon to change to "UNKNOWN" after 10 minutes. The task in the tasks pane is updated to "Removing Bricks from Gluster Volume vol_dis_rep in Cluster cluster_test" with an 'x' mark, and when you drill down it shows "Removing Bricks from Gluster Volume vol_dis_rep in Cluster cluster_test.( UNKNOWN )" with a '?' symbol. As per the discussion with Sahina: "Introducing a minimum duration of 10 minutes before any task is cleared from engine database due to gluster CLI not returning task information".
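The 10-minute rule quoted above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the engine's implementation: GRACE_PERIOD, should_clear_task and the first_missing_at timestamp are hypothetical names, and only the 10-minute value comes from this report.

```python
from datetime import datetime, timedelta
from typing import Optional

# The 10-minute value is from this bug; the constant name is hypothetical.
GRACE_PERIOD = timedelta(minutes=10)


def should_clear_task(first_missing_at: Optional[datetime], now: datetime) -> bool:
    """Decide whether a task may be cleared from the engine database.

    first_missing_at is the time the gluster CLI first stopped returning
    information for the task, or None if the CLI still reports it.
    """
    if first_missing_at is None:
        # The CLI still knows about the task, so it must not be cleared.
        return False
    # Clear only once the task has been missing for the full grace period.
    return now - first_missing_at >= GRACE_PERIOD
```

Under this sketch, a task that disappears from the CLI output is kept for the grace period and only then cleared, which matches the observed delay before the icon changes to UNKNOWN.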
Hi Sahina, Please review the edited doc text for technical accuracy and sign off.
+1, Looks good.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1277.html
To add to what is already said in comment #14: stopping remove-brick from the CLI, regardless of whether data migration is complete or still in progress, causes the UI icon to change to unknown after 10 minutes.