Red Hat Bugzilla – Bug 1027675
[RHSC] Remove-brick status dialog hangs when glusterd goes down on the storage node
Last modified: 2015-05-13 12:28:31 EDT
Description of problem:
On a cluster with a single node, when remove-brick is in progress, and glusterd is brought down on the storage node, the remove-brick status dialog hangs. The engine logs show an exception that says there is no UP server in the cluster.
Version-Release number of selected component (if applicable):
Red Hat Storage Console Version: 2.1.2-0.22.master.el6_4
Steps to Reproduce:
1. Create a cluster with a single node and create a volume, start it, mount it and create some data on the mount point.
2. Start remove-brick on the volume.
3. Kill glusterd on the storage node.
4. Click on status in the remove-brick drop-down menu in the Activities column.
Actual results:
The remove-brick status dialog hangs.
Expected results:
If there is no UP server in the cluster, clicking remove-brick status should show a message saying that the status cannot be fetched because no server is UP. The dialog should not hang.
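The expected behaviour amounts to a guard that runs before the status query is issued. The following is a minimal sketch in Python, not the engine's actual code; `Server`, `pick_up_server`, and `remove_brick_status_message` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    # Hypothetical stand-in for a host record in the cluster.
    name: str
    status: str  # for example "UP" or "DOWN"


def pick_up_server(servers: List[Server]) -> Optional[Server]:
    """Return the first server whose status is UP, or None if there is none."""
    for server in servers:
        if server.status == "UP":
            return server
    return None


def remove_brick_status_message(servers: List[Server]) -> str:
    """Return text for the dialog instead of waiting on a query that cannot run."""
    server = pick_up_server(servers)
    if server is None:
        # No host can answer the query, so report that right away.
        return "Remove-brick status cannot be fetched: no UP server in the cluster."
    return f"Fetching remove-brick status from {server.name}"
```

With a single-node cluster whose only host is DOWN, `remove_brick_status_message` returns the error text immediately, which is the behaviour the dialog is expected to show.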
Created attachment 820952
The remove-brick status dialog was also seen to hang after the following steps:
1. On a cluster of 4 nodes, kill glusterd on one of the servers (say server1) and check the status. The dialog does not hang and renders correctly.
2. Kill glusterd on another server (say server2) and bring it back up on server1.
Check the status dialog. It hangs sometimes. At other times, it shows a message that data could not be fetched for the remove-brick operation.
Re-assigning the bug.
We saw a similar issue some time back in Bz 1015394.
So, if you can reproduce this issue, can you please execute the remove-brick status command with "--xml" in the Gluster CLI and check whether it displays any output? We had seen the Gluster CLI return no XML output for rebalance status (Bz 1015394, which is blocked by Bz 1028325).
Saw that the remove-brick status XML output is null, even though the plain remove-brick status command returns the status. So this is also dependent on Bz 1028325.
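The check requested above can be scripted. This is a rough sketch in Python, assuming the `gluster` binary is on the PATH of the storage node and that "--xml" is accepted at the end of the command line; `has_xml_output` and `remove_brick_status_xml` are hypothetical helper names, and the volume and brick values are placeholders to be supplied by the caller.

```python
import subprocess
import xml.etree.ElementTree as ET
from typing import List


def has_xml_output(output: str) -> bool:
    """Return True if the CLI output is non-empty and parses as XML."""
    if not output.strip():
        # This is the failure seen in this bug: no XML output at all.
        return False
    try:
        ET.fromstring(output)
    except ET.ParseError:
        return False
    return True


def remove_brick_status_xml(volume: str, bricks: List[str]) -> str:
    """Run remove-brick status with --xml and return the raw stdout."""
    # Assumption: --xml placed last; adjust if your CLI version expects it elsewhere.
    cmd = ["gluster", "volume", "remove-brick", volume, *bricks, "status", "--xml"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `has_xml_output(remove_brick_status_xml(volume, bricks))` on the storage node returns False when the CLI prints nothing or prints something that is not XML, which separates a CLI-side problem from a console-side one.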
Verified as fixed in Red Hat Storage Console Version: 2.1.2-0.27.beta.el6_5 and glusterfs 18.104.22.168rhs. Status dialog no longer hanging, proper status displayed.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.