Bug 1015045

Summary: [RHSC] - Rebalance icon in the activities column always shows rebalance is in progress even if rebalance fails.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: RamaKasturi <knarra>
Component: rhsc Assignee: Ramesh N <rnachimu>
Status: CLOSED NOTABUG QA Contact: RamaKasturi <knarra>
Severity: high Docs Contact:
Priority: high    
Version: 2.1 CC: dpati, dtsang, knarra, mmahoney, pprakash, rhs-bugs, sabose, sdharane, ssampat
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 2.1.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: cb11 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-12-18 11:39:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 979376, 1023921, 1028325, 1036464, 1036564    
Bug Blocks:    
Attachments:
Description Flags
Attaching engine log none
Attaching vdsm log none
Attaching vdsm node2 log none
Attaching vdsm node3 log none
Attaching vdsm node4 logs none

Description RamaKasturi 2013-10-03 11:20:58 UTC
Created attachment 807010 [details]
Attaching engine log

Description of problem:
Rebalance icon in the activities column always shows rebalance is in progress when rebalance fails

Version-Release number of selected component (if applicable):
rhsc-2.1.1-0.0.2.master.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed volume.
2. Go to any one of the nodes and stop glusterd.
3. Return to the UI and start rebalance on the volume.
4. Click the rebalance status button; it shows the rebalance status as failed.

Actual results:
Rebalance icon still shows that rebalance is in progress, and the task is still running in the tasks pane.

Expected results:
Rebalance icon should show rebalance failed and the corresponding icon. Task in the task pane should be aborted or failed and an event message should be generated.

Additional info:

Comment 1 RamaKasturi 2013-10-03 11:21:35 UTC
Created attachment 807011 [details]
Attaching vdsm log

Comment 2 RamaKasturi 2013-10-03 11:22:32 UTC
Created attachment 807012 [details]
Attaching vdsm node2 log

Comment 3 RamaKasturi 2013-10-03 11:23:06 UTC
Created attachment 807013 [details]
Attaching vdsm node3 log

Comment 4 RamaKasturi 2013-10-03 11:24:03 UTC
Created attachment 807014 [details]
Attaching vdsm node4 logs

Comment 5 RamaKasturi 2013-10-03 11:27:13 UTC
When glusterd is brought back up on the node, the activities column is updated with the rebalance failed icon, and in the status dialog the node on which glusterd was brought up shows the status as UNKNOWN.

Comment 7 Ramesh N 2013-10-28 09:00:34 UTC
I am not able to reproduce this issue with the latest RPMs. If I stop the glusterd service on any host and start the rebalance, the activity column icon gets updated properly. It needs at least a few minutes to update the task entry in the task pane and update the activity column.

Comment 8 Dustin Tsang 2013-10-28 15:07:19 UTC
same issue as comment#0 in rhsc-cb5.

Comment 9 Dustin Tsang 2013-10-28 15:15:30 UTC
the activity icon takes at least 20 seconds to change after clicking 'refresh' multiple times even though the status dialog shows failed immediately.

Comment 10 Ramesh N 2013-10-30 04:33:49 UTC
It's the expected behaviour. It's a design decision that async task status is synced from the host once a minute. Hence, it can take up to 1 minute for the engine to learn the task status. The activity icon changes only on a UI refresh that happens after the task sync. Simply refreshing the UI before then will not change anything.
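
The behaviour described in this comment can be illustrated with a minimal sketch. This is not the engine's actual code: the class names, method names, and status strings below are hypothetical, and the 60-second interval is taken only from the "once in a minute" statement above. It also assumes, based on Comment 9, that the status dialog asks the host directly while the activity icon reads the engine's cached copy.

```python
from dataclasses import dataclass

# Assumed from the comment above: task status is synced once a minute.
SYNC_INTERVAL_S = 60


@dataclass
class Host:
    """Hypothetical stand-in for a storage node holding the real status."""
    rebalance_status: str = "IN_PROGRESS"


@dataclass
class Engine:
    """Hypothetical stand-in for the console engine with a cached status."""
    host: Host
    cached_status: str = "IN_PROGRESS"
    last_sync_s: float = 0.0

    def tick(self, now_s: float) -> None:
        # Periodic job: copy the host status only once the interval has elapsed.
        if now_s - self.last_sync_s >= SYNC_INTERVAL_S:
            self.cached_status = self.host.rebalance_status
            self.last_sync_s = now_s

    def ui_refresh(self) -> str:
        # The activity icon only re-reads the cache; it never contacts the host.
        return self.cached_status

    def status_dialog(self) -> str:
        # Assumption: the status dialog queries the host directly.
        return self.host.rebalance_status


host = Host()
engine = Engine(host)

host.rebalance_status = "FAILED"   # rebalance fails shortly after starting
engine.tick(5)                     # 5 s in: sync interval not yet reached
print(engine.ui_refresh())         # icon still reflects the stale cache
print(engine.status_dialog())      # dialog already reports the failure

engine.tick(60)                    # sync interval reached: cache is updated
print(engine.ui_refresh())         # icon now reflects the failure
```

Under these assumptions, refreshing the UI any number of times before the next sync returns the same cached value, which matches the delay reported in Comment 9.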

Comment 11 RamaKasturi 2013-11-07 08:48:13 UTC
I am still able to reproduce the issue. The following are the steps.

1) Add 4 servers to the console.
2) Now create a volume with bricks from all the servers and start it.
3) Mount the volume and create some data in it.
4) Now go to server1 and stop glusterd.
5) In the console, make sure that the host goes to non operational, then start rebalance on the volume.
6) The icon in the activities column gets updated to failed after 5 mins.
7) Now bring back glusterd in server1.

8) Now go to server2 and stop glusterd.
9) Wait till the host becomes non responsive in the console.
10) Now start rebalance on the volume.
11) The rebalance icon in the activities column always shows the rebalance running icon, and clicking on the status always gives "could not fetch data".
12) Rebalance cannot even be stopped on that volume; it says could not stop rebalance.

Attaching sos reports here

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1015045/

Comment 12 RamaKasturi 2013-12-16 11:56:36 UTC
This bug cannot be verified because of the following bug fix 

https://bugzilla.redhat.com/show_bug.cgi?id=1021441

Comment 13 Sahina Bose 2013-12-18 11:39:12 UTC
Since this bug is not reproducible due to the fix mentioned in Comment 12, I'm closing this as not a bug.