1065227 – [RHSC] - Rebalance icon in the activities column changes to unknown (?)

Summary: [RHSC] - Rebalance icon in the activities column changes to unknown (?)

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	rhsc
Sub Component:
Version:	2.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	RHGS 3.0.0
Assignee:	Sahina Bose
QA Contact:	RamaKasturi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-02-14 07:27 UTC by RamaKasturi
Modified:	2015-05-13 17:01 UTC
CC List:	11 users
Fixed In Version:	rhsc-3.0.0-0.7.master.el6_5
Doc Type:	Bug Fix
Doc Text:	Previously, the GlusterFS task list information took a considerable amount of time to synchronize with other nodes and provide consistent information about newly created tasks. If the GlusterFS task list did not return information about a task, the task was marked as Unknown. Although the task was still active, the Console failed to monitor it. With this fix, a minimum wait time of 10 minutes is introduced before a task is cleared. As a result, the task information is displayed correctly on the Red Hat Storage Console.
Clone Of:
Environment:
Last Closed:	2014-09-22 19:07:35 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2014:1277	0	normal	SHIPPED_LIVE	Red Hat Storage Console 3.0 enhancement and bug fix update	2014-09-22 23:06:30 UTC
oVirt gerrit	24712	0	master	MERGED	engine: Wait before clearing gluster tasks	Never

Description RamaKasturi 2014-02-14 07:27:54 UTC

Description of problem:
The rebalance icon in the activities column changes to unknown (?) under a race condition.

Version-Release number of selected component (if applicable):
rhsc-2.1.2-0.36.el6rhs.noarch

How reproducible:
Intermittently

Steps to Reproduce:
1. Log in to the RHS Console.
2. Create 2 distribute and 2 distribute-replicate volumes.
3. Start rebalance on the volumes.



Actual results:
After a few seconds, the rebalance icon in the activities column changes to unknown (?), with an event message saying "Could not find information for rebalance on volume <volName> of Cluster cluster_test_setup from CLI. Marking it as unknown"

Expected results:
The rebalance icon should not change to "?", and the rebalance should run successfully.

Additional info:

Comment 1 RamaKasturi 2014-02-14 07:35:01 UTC

Attaching the sos reports for the same

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1065227/

Comment 2 RamaKasturi 2014-02-14 07:39:08 UTC

Some more additional info from the logs:

2014-02-13 18:38:22,157 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-89) [75a8cdae] START, GlusterTasksListVDSCommand(HostName = 10.70.37.44, HostId = 74f68cf1-8053-457b-b1ee-0aac2f6e50ab), log id: 3fa47293
2014-02-13 18:38:25,303 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler_Worker-65) START, GlusterServersListVDSCommand(HostName = 10.70.37.185, HostId = 3d8aa03b-3d0f-45ff-a588-611713ccbaf6), log id: 58a81ae0
2014-02-13 18:38:25,339 INFO  [org.ovirt.engine.core.vdsbroker.gluster.StartRebalanceGlusterVolumeVDSCommand] (pool-5-thread-39) [227c84b6] FINISH, StartRebalanceGlusterVolumeVDSCommand, return: GlusterAsyncTask[673fdc6b-ef0a-4791-9dfe-d0da90adb623-null-null], log id: 74669829
2014-02-13 18:38:25,375 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-39) [227c84b6] Correlation ID: 227c84b6, Job ID: 8f655b94-9395-4970-810b-159827706e8d, Call Stack: null, Custom Event ID: -1, Message: Gluster Volume vol_dis_one rebalance started.
2014-02-13 18:38:25,408 INFO  [org.ovirt.engine.core.bll.gluster.StartRebalanceGlusterVolumeCommand] (pool-5-thread-39) [1cecda6a] Running command: StartRebalanceGlusterVolumeCommand internal: false. Entities affected :  ID: f215adce-1799-4c38-8543-46b8ceb4da97 Type: GlusterVolume
2014-02-13 18:38:25,417 INFO  [org.ovirt.engine.core.vdsbroker.gluster.StartRebalanceGlusterVolumeVDSCommand] (pool-5-thread-39) [1cecda6a] START, StartRebalanceGlusterVolumeVDSCommand(HostName = 10.70.37.185, HostId = 3d8aa03b-3d0f-45ff-a588-611713ccbaf6), log id: 3f526e09
2014-02-13 18:38:25,808 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-89) [75a8cdae] FINISH, GlusterTasksListVDSCommand, return: [GlusterAsyncTask[673fdc6b-ef0a-4791-9dfe-d0da90adb623-REBALANCE-STARTED]], log id: 3fa47293
2014-02-13 18:38:25,883 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirect

Comment 4 Sahina Bose 2014-02-19 06:11:47 UTC

This happens when:
1. GlusterTasksList is called on host1.
2. StartRebalance creates a task ID on host2.
3. GlusterTasksList returns from host1 before host1 is aware of the new task ID.

Fix: introduce a minimum duration of 10 minutes before any task is cleared from the engine database because the gluster CLI did not return its task information.
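A minimal sketch of the wait-before-clearing idea, for illustration only. This is not the code merged in the gerrit change; the class name TaskClearPolicy, the method mayClear, and its parameters are hypothetical. The only value taken from this report is the 10-minute minimum.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, NOT the actual oVirt engine implementation.
final class TaskClearPolicy {
    // Minimum age of a task before it may be cleared when the gluster CLI
    // does not report it (10 minutes, as stated in this bug).
    static final Duration MIN_WAIT = Duration.ofMinutes(10);

    private TaskClearPolicy() {
    }

    // Returns true only if the task is absent from the CLI task list AND
    // it was started at least MIN_WAIT before 'now'.
    static boolean mayClear(Instant taskStartedAt, boolean reportedByCli, Instant now) {
        if (reportedByCli) {
            return false; // gluster still knows the task; nothing to clear
        }
        Duration age = Duration.between(taskStartedAt, now);
        return age.compareTo(MIN_WAIT) >= 0;
    }
}
```

Under this sketch, a task list fetched from host1 a few seconds after the rebalance started on host2 would not cause the task to be cleared, because the task is younger than the minimum wait.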

Comment 5 RamaKasturi 2014-06-09 12:48:55 UTC

Verified and works fine with build

Created two distribute and two distribute-replicate volumes. Triggered rebalance on all the volumes at once from the UI.

Rebalance completed successfully on all the volumes.

Comment 6 Pavithra 2014-07-10 07:02:26 UTC

Hi Sahina, 

Please review the edited text for technical accuracy and sign off.

Comment 7 Sahina Bose 2014-07-11 14:31:26 UTC

+1

Comment 9 errata-xmlrpc 2014-09-22 19:07:35 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1277.html
