Bug 1022996 - [RHSC] - Monitoring stop rebalance from CLI does not work.
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: rhsc
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: RHGS 2.1.2
Assigned To: Sahina Bose
QA Contact: RamaKasturi
Keywords: ZStream
Depends On: 1040303
Blocks:
Reported: 2013-10-24 08:31 EDT by RamaKasturi
Modified: 2015-05-13 12:27 EDT

See Also:
Fixed In Version: cb10
Doc Type: Bug Fix
Last Closed: 2014-02-25 02:44:34 EST
Type: Bug

Attachments
Attaching the screenshot (188.89 KB, image/png), 2013-12-03 07:35 EST, RamaKasturi
Attaching the screenshot (180.83 KB, image/png), 2013-12-19 06:26 EST, RamaKasturi


External Trackers
Tracker       ID     Priority  Status  Summary  Last Updated
oVirt gerrit  20954  None      None    None     Never
oVirt gerrit  21497  None      None    None     Never

Description RamaKasturi 2013-10-24 08:31:40 EDT
Description of problem:
Monitoring stop rebalance from CLI does not work.

Version-Release number of selected component (if applicable):
rhsc-2.1.2-0.21.beta1.el6_4.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a distribute volume and start it.
2. Start rebalance on the volume.
3. Once rebalance has started, go to the CLI and run the command "gluster vol rebalance <volName> stop".

Actual results:
When rebalance is stopped from the CLI, the following happens:

1) The Status dialog shows the status as aborted.
2) The Stop Rebalance button does not get disabled.
3) The Rebalance icon does not change to rebalance stopped.
4) The Tasks pane hangs.
5) The Stop button stays enabled in the drop-down menu of the activities column.
6) Stopping rebalance from the UI succeeds.

Expected results:
When rebalance is stopped from the CLI, the following should happen:

1) The Stop Rebalance button should get disabled.
2) The Rebalance icon should change to rebalance stopped.
3) The Tasks pane should execute the task properly.
4) The Stop button should get disabled in the drop-down menu of the activities column.
5) Stopping rebalance from the UI should not succeed.

Additional info:
Comment 2 RamaKasturi 2013-10-24 09:09:34 EDT
Please find the sosreports in the following link:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1022996/
Comment 3 RamaKasturi 2013-10-24 09:10:05 EDT
The following is seen from the gluster CLI.

[root@localhost vdsm]# gluster volume rebalance vol_dis stop
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes            50             0             0              stopped              13.00
                            10.70.37.140                0        0Bytes             1             0             0              stopped              13.00
                             10.70.37.43                2       253.0KB            24             0             0              stopped              14.00
                            10.70.37.108                0        0Bytes            73             0             0              stopped              13.00
volume rebalance: vol_dis: success: rebalance process may be in the middle of a file migration.
The process will be fully stopped once the migration of the file is complete.
Please check rebalance process for completion before doing any further brick related tasks on the volume.
[root@localhost vdsm]# gluster volume status vol_dis tasks
Task Status of Volume vol_dis
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@localhost vdsm]# gluster volume status vol_dis tasks
Task Status of Volume vol_dis
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@localhost vdsm]#
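
The "no active volume tasks" result above is the condition a monitoring process has to cope with once rebalance is stopped from the CLI. A minimal sketch of detecting it, assuming the CLI prints exactly the sentence shown in the output above; the function name has_active_tasks is hypothetical, and the sketch works on captured text rather than invoking gluster itself:

```python
NO_TASKS_MARKER = "There are no active volume tasks"


def has_active_tasks(status_output: str) -> bool:
    """Return False when the captured output carries the 'no active tasks' line."""
    return not any(
        line.strip() == NO_TASKS_MARKER for line in status_output.splitlines()
    )


# Sample text copied from the CLI session in this comment.
sample = """Task Status of Volume vol_dis
------------------------------------------------------------------------------
There are no active volume tasks
"""
print(has_active_tasks(sample))
```

For the sample above the function returns False, which is the point where a task started from the UI no longer has a counterpart in gluster.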
Comment 4 Dusmant 2013-10-25 02:41:16 EDT
We might require a fix from GlusterFS also, along with our fix...
Comment 5 Sahina Bose 2013-10-30 05:21:12 EDT
Will add fix in engine to end Job with status "UNKNOWN" when gluster does not return the task information.
Comment 6 Sahina Bose 2013-11-06 04:55:33 EST
Added code to end orphan tasks (tasks that gluster is no longer aware of) with status UNKNOWN.
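
The orphan-task idea described here can be sketched as follows. This is only an illustration under stated assumptions, not the engine code: the engine itself is Java, and the names JobStatus, end_orphan_tasks and the dictionary layout are hypothetical.

```python
from enum import Enum


class JobStatus(Enum):
    STARTED = "STARTED"
    UNKNOWN = "UNKNOWN"


def end_orphan_tasks(engine_tasks, gluster_task_ids):
    """Mark tasks the engine tracks but gluster no longer reports as UNKNOWN.

    engine_tasks: dict mapping task id -> JobStatus (hypothetical engine-side view)
    gluster_task_ids: set of task ids currently returned by gluster
    Returns the list of task ids that were ended as orphans.
    """
    orphans = [
        task_id
        for task_id, status in engine_tasks.items()
        if status is JobStatus.STARTED and task_id not in gluster_task_ids
    ]
    for task_id in orphans:
        engine_tasks[task_id] = JobStatus.UNKNOWN
    return orphans


# Hypothetical task ids: gluster still knows one task and has dropped the other.
tasks = {
    "rebalance-vol_dis": JobStatus.STARTED,
    "rebalance-vol2": JobStatus.STARTED,
}
ended = end_orphan_tasks(tasks, gluster_task_ids={"rebalance-vol2"})
print(ended)
```

Comparing against the set gluster returns, rather than trusting the engine's last known state, is what lets a stop issued outside the UI be noticed at all.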
Comment 7 Sahina Bose 2013-11-18 03:56:40 EST
Added code as per Comment 5
Comment 8 RamaKasturi 2013-11-19 01:31:19 EST
Able to reproduce the issue in cb8. Steps to reproduce:

1. Create a distribute volume and start it.
2. Start rebalance on the volume.
3. Once rebalance has started, go to the CLI and run the command "gluster vol rebalance <volName> stop".

Actual results:
When rebalance is stopped from the CLI, the following happens:

1) The Status dialog shows the status as aborted.
2) The Stop Rebalance button does not get disabled.
3) The Rebalance icon does not change to rebalance stopped.
4) The Tasks pane hangs.
5) The Stop button stays enabled in the drop-down menu of the activities column.
6) Stopping rebalance from the UI succeeds.

Please find the sosreports at the link below:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1022996/
Comment 9 Sahina Bose 2013-11-21 04:23:07 EST
The issue was that task cleanup is not called when some other cluster returns an error on getting the task list from gluster. In this case, the Default cluster failed because it had no UP servers.

Posted a patch to make sure the task clean up is done for operational clusters.
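
The failure mode described above, where one cluster's error prevents cleanup for all the others, can be avoided by isolating errors per cluster. A minimal sketch, assuming hypothetical callables fetch_task_ids and end_orphans stand in for the real engine calls (this is not the posted patch, and "cluster1" is an invented name):

```python
def clean_up_tasks(clusters, fetch_task_ids, end_orphans):
    """Run orphan cleanup per cluster; one cluster's failure must not block the rest.

    clusters: iterable of cluster names
    fetch_task_ids: callable(cluster) -> set of task ids; may raise on error
    end_orphans: callable(cluster, task_ids), invoked for each cluster that answered
    Returns the clusters skipped because their task list could not be fetched.
    """
    skipped = []
    for cluster in clusters:
        try:
            task_ids = fetch_task_ids(cluster)
        except Exception:
            # Broad catch for the sketch only: record the cluster and move on.
            skipped.append(cluster)
            continue
        end_orphans(cluster, task_ids)
    return skipped


def fake_fetch(cluster):
    # Mimics the situation in this comment: Default has no UP servers.
    if cluster == "Default":
        raise RuntimeError("no UP servers in cluster")
    return set()


cleaned = []
skipped = clean_up_tasks(
    ["Default", "cluster1"], fake_fetch, lambda c, ids: cleaned.append(c)
)
print(skipped, cleaned)
```

Here cleanup still runs for cluster1 even though Default fails first.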
Comment 10 RamaKasturi 2013-12-03 07:34:45 EST
As part of the fix provided, the following were the expected results:

1) The icon in the activities column should be the unknown symbol/icon, with the drop-down enabled.

2) The Tasks pane should have the task marked as "UNKNOWN".

The following results were seen while verifying the bug:

1) The icon in the activities column gets updated to the Rebalance Stopped icon and the task gets updated to aborted.

2) The icon in the activities column disappears, and only the drop-down is present.
The task gets updated as INPROGRESS and the volume name appears as <UNKNOWN>. Attaching the screenshot for the same.

The two results above alternate.
Comment 11 RamaKasturi 2013-12-03 07:35:21 EST
Created attachment 832037
Attaching the screenshot
Comment 12 Sahina Bose 2013-12-05 05:17:46 EST
The issue seems to be that when you stop rebalance and start it again, the second start fails. The same behaviour is observed from the gluster CLI.

From engine log:
2013-12-05 21:15:22,928 INFO  [org.ovirt.engine.core.vdsbroker.gluster.StartRebalanceGlusterVolumeVDSCommand] (pool-4-thread-48) [528dd7f3] FINISH, StartRebalanceGlusterVolumeVDSCommand, return: org.ovirt.engine.core.common.asynctasks.gluster.GlusterAsyncTask@67e12cdf, log id: 5e125684
2013-12-05 21:15:22,964 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-4-thread-48) [528dd7f3] Correlation ID: 528dd7f3, Job ID: 55fb6cdb-8ce3-43f6-b769-56bab47e2e37, Call Stack: null, Custom Event ID: -1, Message: Could not start Gluster Volume vol_dis rebalance.

From vdsm log:
Thread-89862::ERROR::2013-12-05 21:15:25,524::BindingXMLRPC::1000::vds::(wrapper) vdsm exception occured
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 989, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/gluster/api.py", line 53, in wrapper
    rv = func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/api.py", line 125, in volumeRebalanceStart
    force)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
    **kwargs)
  File "<string>", line 2, in glusterVolumeRebalanceStart
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)
GlusterVolumeRebalanceStartFailedException: Volume rebalance start failed
error: Rebalance on vol_dis is already started
return code: -1
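
One way a caller might cope with the "already started" error in the traceback above is to retry the start after a short wait, giving the earlier stop time to complete. This is a sketch only, not what the engine or vdsm does: start_rebalance_with_retry is a hypothetical helper, RuntimeError stands in for GlusterVolumeRebalanceStartFailedException, and the attempt count and delay are arbitrary.

```python
import time

ALREADY_STARTED = "is already started"


def start_rebalance_with_retry(start, attempts=3, delay_secs=5.0, sleep=time.sleep):
    """Call start() again while it fails with the 'already started' error.

    start: callable that raises RuntimeError carrying the gluster error text
    Any other error, or the last failed attempt, is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return start()
        except RuntimeError as err:
            if ALREADY_STARTED not in str(err) or attempt == attempts:
                raise
            sleep(delay_secs)


calls = {"count": 0}


def flaky_start():
    # Fails twice with the message from the vdsm log, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Rebalance on vol_dis is already started")
    return "started"


result = start_rebalance_with_retry(flaky_start, sleep=lambda secs: None)
print(result, calls["count"])
```

The sleep function is injectable so the example runs without actually waiting.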
Comment 13 Dusmant 2013-12-09 10:48:10 EST
We have to release note this bug.
Comment 14 Sahina Bose 2013-12-10 00:58:56 EST
The issue reported in Comment 10, where the Rebalance activity shows a Stopped icon, is caused by the error where rebalance could not be started (because the earlier stop rebalance had not completed).

This is not related to monitoring, so please log a separate bug for it so that it can be release noted.

Moving this bug to ON_QA for verification.
Comment 15 RamaKasturi 2013-12-11 02:37:39 EST
Verified in cb10. The following happens when rebalance is stopped from the gluster CLI.

1) In the volume activities column the icon disappears; hovering over the icon shows the text "unknown", and the drop-down with status is enabled. The icon should be updated to "?". Logged a bug for the same.

https://bugzilla.redhat.com/show_bug.cgi?id=1035601

2) Tasks pane does not get updated with the correct status. Logged a bug for that.
https://bugzilla.redhat.com/show_bug.cgi?id=1040303

3) If the status dialog is opened, before or after the stop command is issued, the activities column gets updated with the rebalance stopped icon. Logged a bug for this.

https://bugzilla.redhat.com/show_bug.cgi?id=1040310
Comment 16 RamaKasturi 2013-12-11 02:38:52 EST
Will mark this bug verified only after the following bugs are fixed.

https://bugzilla.redhat.com/show_bug.cgi?id=1040303

https://bugzilla.redhat.com/show_bug.cgi?id=1040303
Comment 17 RamaKasturi 2013-12-11 03:20:25 EST
Will mark this bug verified only after the following bugs are fixed.

https://bugzilla.redhat.com/show_bug.cgi?id=1040303

https://bugzilla.redhat.com/show_bug.cgi?id=1035601
Comment 18 RamaKasturi 2013-12-19 06:21:33 EST
Verified; works fine with cb12 build rhsc-2.1.2-0.28.beta.el6_5.noarch.

When rebalance is stopped from the CLI, a '?' icon appears in the volume activities column and the Tasks pane gets updated with a task with an 'x' mark. Expanding the task pane gives the message "Rebalancing gluster volume <volName> in cluster <clusterName> (UNKNOWN)".

An event message gets displayed saying "Could not find information for rebalance on volume <volName> of Cluster <clusterName> from CLI. Marking it as unknown."

Attaching the screenshot for the same.
Comment 19 RamaKasturi 2013-12-19 06:26:32 EST
Created attachment 838935
Attaching the screenshot
Comment 22 errata-xmlrpc 2014-02-25 02:44:34 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html
