Bug 1027675 - [RHSC] Remove-brick status dialog hangs when glusterd goes down on the storage node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhsc
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: anmol babu
QA Contact: Shruti Sampat
URL:
Whiteboard:
Depends On: 1028325 1036564
Blocks:
Reported: 2013-11-07 09:32 UTC by Shruti Sampat
Modified: 2015-05-13 16:28 UTC (History)

Fixed In Version: cb10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-25 08:01:40 UTC
Embargoed:


Attachments
engine logs (3.73 MB, text/x-log)
2013-11-07 09:34 UTC, Shruti Sampat


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2014:0208 0 normal SHIPPED_LIVE Red Hat Storage 2.1 enhancement and bug fix update #2 2014-02-25 12:20:30 UTC
oVirt gerrit 21086 0 None None None Never

Description Shruti Sampat 2013-11-07 09:32:31 UTC
Description of problem:
-----------------------

On a cluster with a single node, when remove-brick is in progress, and glusterd is brought down on the storage node, the remove-brick status dialog hangs. The engine logs show an exception that says there is no UP server in the cluster.

Version-Release number of selected component (if applicable):
Red Hat Storage Console Version: 2.1.2-0.22.master.el6_4 

How reproducible:
Always

Steps to Reproduce:
1. Create a cluster with a single node and create a volume, start it, mount it and create some data on the mount point.
2. Start remove-brick on the volume.
3. Kill glusterd on the storage node.
4. Click on status in the remove-brick drop-down menu in the Activities column.

Actual results:
Remove-brick status dialog hangs.

Expected results:
If there is no UP server in the cluster, on clicking the remove-brick status, the dialog should give an appropriate message that status cannot be fetched as there are no UP servers. The dialog should not hang.
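
The expected behaviour above can be sketched roughly as follows. This is only an illustration of the fail-fast check, not the actual engine code: `Server`, `StatusResult`, `fetch_remove_brick_status` and the `query` callback are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Server:
    name: str
    status: str  # "UP" or any other state (hypothetical model)


@dataclass
class StatusResult:
    ok: bool
    message: str


def fetch_remove_brick_status(
    servers: Iterable[Server],
    query: Callable[[Server], str],
) -> StatusResult:
    """Return the status from the first UP server, or an error message.

    The point is that the caller always gets a result back, so a dialog
    waiting on it has something to display instead of hanging.
    """
    up_server = next((s for s in servers if s.status == "UP"), None)
    if up_server is None:
        return StatusResult(
            ok=False,
            message="Remove-brick status cannot be fetched: "
                    "no UP server in the cluster.",
        )
    return StatusResult(ok=True, message=query(up_server))
```

With a single-node cluster whose only server is not UP, the function returns the error message immediately rather than attempting the query.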

Additional info:

Comment 1 Shruti Sampat 2013-11-07 09:34:44 UTC
Created attachment 820952
engine logs

Comment 3 Shruti Sampat 2013-11-26 13:03:38 UTC
The remove-brick status dialog was seen to hang after following the steps listed below:

1. On a cluster of 4 nodes, kill glusterd on one of the servers (say server1) and check the status. The dialog does not hang and renders correctly.

2. Kill glusterd on another server (say server2) and bring it up on server1.
Check the status dialog. It sometimes hangs; at other times, it shows a message that data could not be fetched for the remove-brick operation.

Re-assigning the bug.

Comment 4 anmol babu 2013-11-26 14:22:17 UTC
We saw a similar issue some time back in Bz 1015394.

If you can reproduce this issue, can you please run the remove-brick status command with "--xml" in the Gluster CLI and check whether it displays any output? We had seen the Gluster CLI return no XML output for rebalance status (Bz 1015394, which is blocked by Bz 1028325).
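
A minimal sketch of that check, to be run on a storage node, could look like the following. The helper names (`build_status_command`, `has_xml_output`, `check_remove_brick_xml`) are made up for this example, and the volume and brick names in the usage note are placeholders. It assumes the `gluster` binary is on the PATH.

```python
import subprocess
import xml.etree.ElementTree as ET
from typing import List


def build_status_command(volume: str, bricks: List[str]) -> List[str]:
    """Build the remove-brick status command with XML output requested."""
    return ["gluster", "volume", "remove-brick", volume, *bricks,
            "status", "--xml"]


def has_xml_output(stdout: str) -> bool:
    """Return True only if stdout is non-empty and parses as XML."""
    text = stdout.strip()
    if not text:
        return False
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def check_remove_brick_xml(volume: str, bricks: List[str],
                           timeout: int = 30) -> bool:
    """Run the command and report whether any XML came back.

    The timeout is an assumption added here so the check itself
    cannot block indefinitely.
    """
    proc = subprocess.run(
        build_status_command(volume, bricks),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return has_xml_output(proc.stdout)
```

For example, `check_remove_brick_xml("vol1", ["server1:/bricks/b1"])` would return False if the CLI prints nothing for the XML request, which is the symptom described above.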

Comment 5 Shruti Sampat 2013-11-27 07:26:52 UTC
Saw that the remove-brick status command with --xml returns null, even though the plain remove-brick status command returns the status. So this bug also depends on BZ #1028325.

Comment 6 Shruti Sampat 2013-12-13 12:37:32 UTC
Verified as fixed in Red Hat Storage Console Version: 2.1.2-0.27.beta.el6_5 and glusterfs 3.4.0.49rhs. The status dialog no longer hangs, and the proper status is displayed.

Comment 8 errata-xmlrpc 2014-02-25 08:01:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html

