Bug 1618770 - Description of Brick Status panel of At-a-Glance section on Volume dashboard is missing list of all brick status codes with explanation
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: web-admin-tendrl-monitoring-integration
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: gowtham
QA Contact: sds-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks: 1613526
 
Reported: 2018-08-17 14:36 UTC by Martin Bukatovic
Modified: 2019-05-08 18:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-08 16:10:52 UTC
Embargoed:


Attachments
screenshot 1: Description of Brick Status panel without status code "3" (232.41 KB, image/png)
2018-08-17 14:36 UTC, Martin Bukatovic


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1614000 0 unspecified CLOSED "Bricks" and "Brick Status" panels on Host and Volume dashboards doesn't use the same definition of brick state 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1614506 0 unspecified CLOSED Status panels for Volumes, Hosts and Bricks on various dashboards uses number codes instead of human readable status cod... 2021-02-22 00:41:40 UTC

Internal Links: 1614000 1614506

Description Martin Bukatovic 2018-08-17 14:36:45 UTC
Created attachment 1476638
screenshot 1: Description of Brick Status panel without status code "3"

Description of problem
======================

The description of the "Brick Status" panel in the At-a-Glance section of the
Volume dashboard states:

> The Brick Status panel displays the status code of each brick for a given
> volume.
>
> 0 = Started
> 8 = Stopped

But I noticed that the panel also reports status code 3, which is not covered
in the description.

Version-Release number of selected component
============================================

tendrl-monitoring-integration-1.6.3-10.el7rhgs.noarch

Steps to Reproduce
==================

To see the description of the Brick Status panel, you just need to install WA
and import any cluster with at least one volume.

But I don't have a 100% reproducer for getting the table there to actually
report status code 3. The following is merely a description of what I was doing
when I noticed this:

1. Install RHGS WA using tendrl-ansible
2. Make sure you have only 2 GiB of RAM on storage nodes
3. Import a trusted storage pool with at least one volume, with profiling enabled
4. Run test_setup.wiki_tarball.yml based workload from dedicated client machine:
5. Let it run for a few days

Actual results
==============

There is no description of brick status code 3.

See screenshot #1.

Expected results
================

Description of brick status code 3 is provided.

Additional info
===============

Inspecting the code for all possible status codes is necessary to check that we
are not missing other status codes which could be reported here.

Comment 1 Martin Bukatovic 2018-08-17 14:39:32 UTC
It's also possible that status code 3 should not be reported here at all and it's
a bug, but it's not clear which resolution is correct without further analysis.

Comment 3 gowtham 2018-09-05 12:56:51 UTC
Please provide the get-state output from a node which has an affected brick. It would also be helpful to have the brick details from etcd.

Comment 4 Martin Bukatovic 2018-09-06 18:38:34 UTC
I no longer have the machines to immediately provide the details here.

Comment 5 Martin Bukatovic 2018-09-06 18:45:11 UTC
Atin, could you provide us with a reference for the complete list of brick
states? This BZ shows that there is at least one more state the WA dashboard is
not aware of, and I would like to make sure the fix for this BZ incorporates
all possible states a gluster brick could be in.

Comment 7 Martin Bukatovic 2018-09-06 19:06:21 UTC
Since we are not sure about the full list of brick states, this BZ blocks BZ 1613526.

Comment 8 Atin Mukherjee 2018-09-07 10:04:00 UTC
GlusterD maintains 4 different brick states, which are described below:

typedef enum gf_brick_status {
        GF_BRICK_STOPPED,
        GF_BRICK_STARTED,
        GF_BRICK_STOPPING,
        GF_BRICK_STARTING
} gf_brick_status_t;

Note that GF_BRICK_STOPPING & GF_BRICK_STARTING are the intermediate stages of the STARTED->STOPPED transition and vice versa, and they do not get reflected in the output. So technically, for a user there are only STARTED & STOPPED. However, I'm not sure what the status code in the bug description represents. I don't think a brickinfo->status can have any value greater than 1 in the get-state output.
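
For reference, a minimal sketch of the numeric values this enum implies, assuming it is declared exactly as quoted above, with no explicit initializers. In C the enumerators are then numbered from 0, so GF_BRICK_STOPPED is 0 and GF_BRICK_STARTED is 1. That does not match the dashboard's 0 = Started / 8 = Stopped codes, so the panel appears to use a separate encoding. The helper brick_status_is_user_visible is hypothetical and is not a GlusterD function. The static_assert lines need a C11 compiler.

```c
#include <assert.h>

/* Enum as quoted in the comment above; values follow from C's rules. */
typedef enum gf_brick_status {
        GF_BRICK_STOPPED,   /* 0 */
        GF_BRICK_STARTED,   /* 1 */
        GF_BRICK_STOPPING,  /* 2 */
        GF_BRICK_STARTING   /* 3 */
} gf_brick_status_t;

/* Compile-time checks of the implicit numbering. */
static_assert(GF_BRICK_STOPPED == 0, "first enumerator is 0");
static_assert(GF_BRICK_STARTED == 1, "STARTED is 1");
static_assert(GF_BRICK_STARTING == 3, "last enumerator is 3");

/* Hypothetical helper (not part of GlusterD): nonzero for the two
 * states that, per the comment above, a user can actually observe. */
int brick_status_is_user_visible(gf_brick_status_t s)
{
        return s == GF_BRICK_STOPPED || s == GF_BRICK_STARTED;
}
```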

Comment 9 Martin Bukatovic 2019-01-24 20:20:05 UTC
(In reply to gowtham from comment #3)
> Please provide get-state output from a node which has affected brick. and
> also helpful if we have brick detail from etcd.

I noticed this again and figured out where the problem is: the value displayed
is some sort of average of the 0 and 8 values over the selected time range. You
can get any value between 0 and 8 as well, if the circumstances are right.

So when you have a brick started for some time and then you stop it and wait
again, you can get 4 as a status if you select a time range which contains both
states, but you will get 8 (Stopped) if you select only a recent time range
when the brick was stopped.
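
The arithmetic can be sketched as follows. This is only an illustration under the assumption that the panel aggregates with a plain arithmetic mean; the comment above only suspects "some sort of average", and the real aggregation was not confirmed here. The function mean_status and the sample arrays are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Dashboard codes taken from the panel description in this bug. */
enum { PANEL_BRICK_STARTED = 0, PANEL_BRICK_STOPPED = 8 };

/* Hypothetical helper: arithmetic mean of the status samples that fall
 * into the selected time range. Assumes a plain mean, which is a guess. */
double mean_status(const int *samples, size_t n)
{
        assert(n > 0);  /* an empty range has no mean */
        long sum = 0;
        for (size_t i = 0; i < n; i++)
                sum += samples[i];
        return (double)sum / (double)n;
}
```

With equal numbers of Started and Stopped samples, for example {0, 0, 8, 8}, the mean is 4. With five Started and three Stopped samples, {0, 0, 0, 0, 0, 8, 8, 8}, the mean is 24 / 8 = 3, which would explain the status code 3 from the original report.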

The description should be updated to include a polished explanation of, and
reasoning for, this behavior.

I'm not creating a new ENG BZ for this, as the correct solution here is to
address BZ 1505769 and use a different plugin for this panel, which will allow
it to display a human-readable status based on a given enumeration (up, down in
this case).
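
As a rough sketch of what an enumeration-based display could look like, the code below classifies the most recent raw sample instead of averaging the range, then maps it to a label. All names here (brick_state_t, latest_brick_state, brick_state_label) are hypothetical. This does not describe how any Grafana plugin or the fix for BZ 1505769 works.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical display states; not taken from Tendrl or Grafana. */
typedef enum { BRICK_UP, BRICK_DOWN, BRICK_UNKNOWN } brick_state_t;

/* Classify the most recent raw sample (0 = Started, 8 = Stopped per the
 * panel description) instead of averaging the whole time range. */
brick_state_t latest_brick_state(const int *samples, size_t n)
{
        assert(n > 0);  /* an empty range has no latest sample */
        switch (samples[n - 1]) {
        case 0:  return BRICK_UP;
        case 8:  return BRICK_DOWN;
        default: return BRICK_UNKNOWN;
        }
}

/* Human-readable label for a classified state. */
const char *brick_state_label(brick_state_t s)
{
        switch (s) {
        case BRICK_UP:   return "up";
        case BRICK_DOWN: return "down";
        default:         return "unknown";
        }
}
```

Because only exact codes are mapped, an in-between value such as 3 or 4 is shown as "unknown" and never appears as a bare number.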

