Bug 1508523 - cluster status is healthy if one of volumes is down and all nodes are connected
Summary: cluster status is healthy if one of volumes is down and all nodes are connected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-gluster-integration
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Shubhendu Tripathi
QA Contact: Martin Kudlej
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-01 15:39 UTC by Martin Kudlej
Modified: 2017-12-18 04:39 UTC (History)
4 users

Fixed In Version: tendrl-gluster-integration-1.5.4-2.el7rhgs
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-18 04:39:36 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:3478 normal SHIPPED_LIVE RHGS Web Administration packages 2017-12-18 09:34:49 UTC
Github https://github.com/Tendrl/gluster-integration/issues/468 None None None 2017-11-07 06:58:53 UTC

Description Martin Kudlej 2017-11-01 15:39:50 UTC
Description of problem:
I've looked at the cluster status code in 'tendrl/gluster_integration/sds_sync/cluster_status.py':

def sync_cluster_status(volumes):
    status = 'healthy'

    # Calculate status based on volumes status
    degraded_count = 0
    if len(volumes) > 0:
        volume_states = _derive_volume_states(volumes)
        for vol_id, state in volume_states.iteritems():
            if 'down' in state or 'partial' in state:
                status = 'unhealthy'
            if 'degraded' in state:
                degraded_count += 1

...
    # Change status based on node status
    cmd = cmd_utils.Command(
        'gluster pool list', True
    )
    out, err, rc = cmd.run()
    peer_count = 0
    if not err:
        out_lines = out.split('\n')
        connected = True
...
        if connected:
            status = 'healthy'
        else:
            status = 'unhealthy'

As you can see, the cluster status can end up 'healthy' when all nodes are
connected even though some volumes are down, because the node-status check
unconditionally overwrites the result of the volume-status check. I think
that if any volume is down, the node check should not reset the status back
to 'healthy' (it could be skipped entirely).
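A minimal sketch of the suggested fix (hypothetical, not the shipped patch). The helper below takes the derived volume states and the peer connectivity flag as parameters instead of reading etcd and running `gluster pool list` itself, purely for illustration; the point is that node status may only downgrade the result, never upgrade it back to 'healthy' once a volume has been found down or partial:

```python
def sync_cluster_status(volume_states, peers_connected):
    """Derive an overall cluster status.

    volume_states: mapping of volume id -> state string
        (e.g. 'up', 'down', 'partial', 'degraded')
    peers_connected: True if every peer in `gluster pool list`
        reports Connected

    Hypothetical signature for illustration only; the real function
    reads the volumes and runs `gluster pool list` internally.
    """
    status = 'healthy'
    degraded_count = 0

    # Volume check: a down/partial volume makes the cluster unhealthy.
    for state in volume_states.values():
        if 'down' in state or 'partial' in state:
            status = 'unhealthy'
        if 'degraded' in state:
            degraded_count += 1

    # Node check: disconnected peers may only make things worse,
    # never reset an 'unhealthy' verdict from the volume check.
    if not peers_connected:
        status = 'unhealthy'

    return status, degraded_count
```

With this ordering, `sync_cluster_status({'vol1': 'down'}, True)` stays `('unhealthy', 0)` even though all peers are connected, which is the behavior this report asks for.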


Version-Release number of selected component (if applicable):
etcd-3.2.5-1.el7.x86_64
glusterfs-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-api-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-cli-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-client-xlators-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-client-xlators-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-events-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-fuse-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-fuse-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-geo-replication-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-libs-4.0dev-0.213.git09f6ae2.el7.centos.x86_64
glusterfs-libs-4.0dev-0.218.git614904f.el7.centos.x86_64
glusterfs-server-4.0dev-0.218.git614904f.el7.centos.x86_64
python2-gluster-4.0dev-0.218.git614904f.el7.centos.x86_64
python-etcd-0.4.5-1.noarch
rubygem-etcd-0.3.0-1.el7.centos.noarch
tendrl-ansible-1.5.3-20171016T154931.c64462a.noarch
tendrl-api-1.5.3-20171013T082716.a2f3b3f.noarch
tendrl-api-httpd-1.5.3-20171013T082716.a2f3b3f.noarch
tendrl-commons-1.5.3-20171017T094335.9050aa7.noarch
tendrl-gluster-integration-1.5.3-20171013T082052.b8ddae5.noarch
tendrl-grafana-plugins-1.5.3-20171016T100950.e8eb6c8.noarch
tendrl-grafana-selinux-1.5.3-20171013T090621.ffb1b7f.noarch
tendrl-monitoring-integration-1.5.3-20171016T100950.e8eb6c8.noarch
tendrl-node-agent-1.5.3-20171016T094453.4aa81f7.noarch
tendrl-notifier-1.5.3-20171011T200310.3c01717.noarch
tendrl-selinux-1.5.3-20171013T090621.ffb1b7f.noarch
tendrl-ui-1.5.3-20171016T141509.544015a.noarch

Comment 1 Petr Penicka 2017-11-08 14:09:03 UTC
Triage Nov 8: Dev and QE agree to have in 3.3.1 release.

Comment 3 Martin Kudlej 2017-11-16 13:24:32 UTC
I've done a code inspection and it looks OK to me.

tendrl-gluster-integration-1.5.4-2.el7rhgs.noarch

--> VERIFIED

Comment 5 errata-xmlrpc 2017-12-18 04:39:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478

