Bug 1508523 - cluster status is healthy if one of volumes is down and all nodes are connected
Summary: cluster status is healthy if one of volumes is down and all nodes are connected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-gluster-integration
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Shubhendu Tripathi
QA Contact: Martin Kudlej
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-01 15:39 UTC by Martin Kudlej
Modified: 2017-12-18 04:39 UTC (History)
4 users

Fixed In Version: tendrl-gluster-integration-1.5.4-2.el7rhgs
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-18 04:39:36 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:3478 normal SHIPPED_LIVE RHGS Web Administration packages 2017-12-18 09:34:49 UTC
Github https://github.com/Tendrl/gluster-integration/issues/468 None None None 2017-11-07 06:58:53 UTC

Description Martin Kudlej 2017-11-01 15:39:50 UTC
Description of problem:
I've looked at the cluster status code in 'tendrl/gluster_integration/sds_sync/cluster_status.py':

def sync_cluster_status(volumes):
    status = 'healthy'

    # Calculate status based on volumes status
    degraded_count = 0
    if len(volumes) > 0:
        volume_states = _derive_volume_states(volumes)
        for vol_id, state in volume_states.iteritems():
            if 'down' in state or 'partial' in state:
                status = 'unhealthy'
            if 'degraded' in state:
                degraded_count += 1

...
    # Change status based on node status
    cmd = cmd_utils.Command(
        'gluster pool list', True
    )
    out, err, rc = cmd.run()
    peer_count = 0
    if not err:
        out_lines = out.split('\n')
        connected = True
...
        if connected:
            status = 'healthy'
        else:
            status = 'unhealthy'

As you can see, the cluster status can end up 'healthy' when all nodes are
connected even though some volumes are down, because the node-status check
unconditionally overwrites the result of the volume-status check. I think
that if any volume is down, the node check should not reset the status back
to 'healthy' (it could be skipped entirely).
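A minimal sketch of the suggested fix (hypothetical, not the shipped patch). The helper below takes the derived volume states and the peer connectivity flag as parameters instead of reading etcd and running `gluster pool list` itself, purely for illustration; the point is that node status may only downgrade the result, never upgrade it back to 'healthy' once a volume has been found down or partial:

```python
def sync_cluster_status(volume_states, peers_connected):
    """Derive an overall cluster status.

    volume_states: mapping of volume id -> state string
        (e.g. 'up', 'down', 'partial', 'degraded')
    peers_connected: True if every peer in `gluster pool list`
        reports Connected

    Hypothetical signature for illustration only; the real function
    reads the volumes and runs `gluster pool list` internally.
    """
    status = 'healthy'
    degraded_count = 0

    # Volume check: a down/partial volume makes the cluster unhealthy.
    for state in volume_states.values():
        if 'down' in state or 'partial' in state:
            status = 'unhealthy'
        if 'degraded' in state:
            degraded_count += 1

    # Node check: disconnected peers may only make things worse,
    # never reset an 'unhealthy' verdict from the volume check.
    if not peers_connected:
        status = 'unhealthy'

    return status, degraded_count
```

With this ordering, `sync_cluster_status({'vol1': 'down'}, True)` stays `('unhealthy', 0)` even though all peers are connected, which is the behavior this report asks for.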


Version-Release number of selected component (if applicable):
etcd-3.2.5-1.el7.x86_64
glusterfs-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-api-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-cli-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-client-xlators-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-client-xlators-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-events-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-fuse-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-fuse-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-geo-replication-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-libs-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-libs-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-server-4.0dev-0.218.git614904f.el7.centos.x86_64
python2-gluster-4.0dev-0.218.git614904f.el7.centos.x86_64
python-etcd-0.4.5-1.noarch
rubygem-etcd-0.3.0-1.el7.centos.noarch
tendrl-ansible-1.5.3-20171016T154931.c64462a.noarch
tendrl-api-1.5.3-20171013T082716.a2f3b3f.noarch
tendrl-api-httpd-1.5.3-20171013T082716.a2f3b3f.noarch
tendrl-commons-1.5.3-20171017T094335.9050aa7.noarch
tendrl-gluster-integration-1.5.3-20171013T082052.b8ddae5.noarch
tendrl-grafana-plugins-1.5.3-20171016T100950.e8eb6c8.noarch
tendrl-grafana-selinux-1.5.3-20171013T090621.ffb1b7f.noarch
tendrl-monitoring-integration-1.5.3-20171016T100950.e8eb6c8.noarch
tendrl-node-agent-1.5.3-20171016T094453.4aa81f7.noarch
tendrl-notifier-1.5.3-20171011T200310.3c01717.noarch
tendrl-selinux-1.5.3-20171013T090621.ffb1b7f.noarch
tendrl-ui-1.5.3-20171016T141509.544015a.noarch

Comment 1 Petr Penicka 2017-11-08 14:09:03 UTC
Triage Nov 8: Dev and QE agree to have in 3.3.1 release.

Comment 3 Martin Kudlej 2017-11-16 13:24:32 UTC
I've done a code inspection and it looks OK to me.

tendrl-gluster-integration-1.5.4-2.el7rhgs.noarch

--> VERIFIED

Comment 5 errata-xmlrpc 2017-12-18 04:39:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478

