Red Hat Bugzilla – Bug 961247
[RHEVM-RHS] Host status is shown up in UI, when glusterd is stopped
Last modified: 2016-02-10 13:58:48 EST
Description of problem:
The status of a host in a gluster cluster is shown as UP even when glusterd is not running.
Version-Release number of selected component (if applicable):
RHS 2.1 - glusterfs-126.96.36.199rhs-1.el6rhs.x86_64 
Steps to Reproduce:
1. Add a host to gluster cluster in RHEVM UI
2. After the node has been bootstrapped, stop glusterd on that node
3. Check the status of the node in RHEVM UI
Actual results:
The node is shown as UP even when glusterd is not running.
Expected results:
The node should be marked with an appropriate status when glusterd is not running.
At least an event or notification should report that glusterd is not running.
A consequence of this bug is that the RHEVM UI allows adding more than one host to a gluster cluster even when glusterd is not running on either of them (here, the UI shows both nodes as UP).
The result is a cluster containing RHS nodes that are not actually in a gluster cluster, because glusterd is not operational.
After some time, if glusterd comes up on one node, that node automatically removes the other from the cluster.
The periodic polling now runs a gluster peer command, to ensure glusterd is running on the node
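A minimal sketch of the kind of check described above, not the actual RHEVM/VDSM code. It assumes the probe is `gluster peer status` (the comment only says "a gluster peer command"), that the command runs locally on the node, and that a non-zero exit code means glusterd is unreachable. The names `run_command` and `host_status` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Assumed probe; the comment above does not name the exact peer command.
PEER_COMMAND = ("gluster", "peer", "status")


def run_command(argv: Sequence[str], timeout: float = 10.0) -> int:
    """Run argv and return its exit code.

    A missing binary or a timeout is reported as failure (1).
    """
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return result.returncode
    except (OSError, subprocess.TimeoutExpired):
        return 1


def host_status(runner: Callable[[Sequence[str]], int] = run_command) -> str:
    """Map the probe result to a host status label (hypothetical mapping)."""
    return "UP" if runner(PEER_COMMAND) == 0 else "NON-OPERATIONAL"
```

Passing the runner as a parameter keeps the status mapping testable on a machine without gluster installed, for example `host_status(lambda argv: 1)`.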
(In reply to Sahina Bose from comment #2)
> The periodic polling now runs a gluster peer command, to ensure glusterd is
> running on the node
With RHEVM 3.3.0-0.34.beta1.el6ev and glusterfs-188.8.131.52rhs-1,
I did the following:
1. In a 3.2 Compatibility Datacenter, created a 3.2 compatible cluster
2. Using RHEVM UI, added 2 RHSS nodes to the above created cluster.
3. Using RHEVM UI, created distribute-replicate volume(2X2) and started it.
4. From one of the RHSS nodes (gluster CLI), stopped glusterd
(i.e., service glusterd stop)
1. From the RHEVM UI, I could observe that soon after glusterd was stopped, the RHSS node was moved to "NON-OPERATIONAL" (it took less than 10 seconds in all three attempts).
2. But when I started glusterd on that node (service glusterd start), it took 2 minutes for the node to be shown as UP again in the RHEVM UI.
If the polling is periodic, why is there a delay in showing the node as "UP" in the RHEVM UI once glusterd is up? Is this expected?
This delay does not occur when glusterd goes down, i.e., the time taken to show the node as non-operational after stopping glusterd is less than about 10 seconds.
For Non-Operational hosts, an auto recovery is tried every 5 minutes (by default) to try and activate the host. This is when you see the host going back to UP state.
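The timing this implies can be sketched as follows. This is an illustration only: it assumes recovery attempts fire on a fixed schedule at multiples of the interval, which the comment does not confirm, and the function name is hypothetical. Only the 5 minute default comes from the comment above.

```python
# Default auto-recovery interval stated in the comment above.
AUTO_RECOVERY_INTERVAL_S = 5 * 60


def seconds_until_next_recovery(
    restart_time_s: float,
    interval_s: float = AUTO_RECOVERY_INTERVAL_S,
) -> float:
    """Seconds between a glusterd restart and the next recovery attempt.

    Assumes attempts happen at t = 0, interval_s, 2 * interval_s, ...
    (a hypothetical fixed schedule).
    """
    elapsed_in_cycle = restart_time_s % interval_s
    return (interval_s - elapsed_in_cycle) % interval_s
```

Under this assumption, a restart 180 seconds into a cycle gives `seconds_until_next_recovery(180) == 120`, which would be consistent with the roughly 2 minute delay reported earlier; the report does not state where in the cycle the restart actually fell. The wait can be anywhere from 0 up to just under the full interval.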
(In reply to Sahina Bose from comment #4)
> For Non-Operational hosts, an auto recovery is tried every 5 minutes (by
> default) to try and activate the host. This is when you see the host going
> back to UP state.
Thanks for the quick response!
Based on the verification steps described in comment #3, moving this bug to VERIFIED.
Closing - RHEV 3.3 Released