Bug 1254514 - gstatus: Status message doesn't show the storage node name which is down
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gstatus
Version: 3.1
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.1
Assigned To: Sachidananda Urs
QA Contact: Anil Shah
Keywords: ZStream
Depends On:
Blocks: 1251815
 
Reported: 2015-08-18 06:04 EDT by Anil Shah
Modified: 2015-10-05 03:23 EDT
CC List: 4 users

See Also:
Fixed In Version: gstatus-0.65-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-05 03:23:49 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---



External Trackers
  Tracker ID: Red Hat Product Errata RHSA-2015:1845
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: Moderate: Red Hat Gluster Storage 3.1 update
  Last Updated: 2015-10-05 07:06:22 EDT

Description Anil Shah 2015-08-18 06:04:44 EDT
Description of problem:

When one of the storage nodes in the cluster is down, running the gstatus command does not show the name of the down node in the Status messages.

Version-Release number of selected component (if applicable):

[root@localhost ~]# gstatus --version
gstatus 0.64

[root@localhost ~]# rpm -qa | grep glusterfs
glusterfs-api-3.7.1-11.el7rhgs.x86_64
glusterfs-cli-3.7.1-11.el7rhgs.x86_64
glusterfs-libs-3.7.1-11.el7rhgs.x86_64
glusterfs-client-xlators-3.7.1-11.el7rhgs.x86_64
glusterfs-server-3.7.1-11.el7rhgs.x86_64
glusterfs-rdma-3.7.1-11.el7rhgs.x86_64
glusterfs-3.7.1-11.el7rhgs.x86_64
glusterfs-fuse-3.7.1-11.el7rhgs.x86_64
glusterfs-geo-replication-3.7.1-11.el7rhgs.x86_64


How reproducible:

100%

Steps to Reproduce:

1. Create a 6x2 distributed-replicate volume.
2. Mount the volume as a FUSE mount on a client.
3. Bring down one of the storage nodes and check gstatus, e.g. gstatus -a (an example command sequence is sketched below).
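
An example command sequence for these steps is shown below; the hostnames (node1-node4), brick paths, and mount point are placeholders and not values taken from this report:

# Step 1: create a 6x2 distributed-replicate volume (12 bricks, replica 2).
# node1-node4 and the brick paths below are placeholders.
gluster volume create testvol replica 2 \
    node1:/rhs/brick1/b1  node2:/rhs/brick1/b2 \
    node3:/rhs/brick1/b3  node4:/rhs/brick1/b4 \
    node1:/rhs/brick2/b5  node2:/rhs/brick2/b6 \
    node3:/rhs/brick2/b7  node4:/rhs/brick2/b8 \
    node1:/rhs/brick3/b9  node2:/rhs/brick3/b10 \
    node3:/rhs/brick3/b11 node4:/rhs/brick3/b12
gluster volume start testvol

# Step 2: mount the volume with the FUSE client.
mount -t glusterfs node1:/testvol /mnt/testvol

# Step 3: bring down one storage node (e.g. power it off), then run gstatus
# from a surviving node.
gstatus -a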

Actual results:
The status messages don't show the name of the storage node which is down.

[root@knightandday ~]# gstatus -a
 
     Product: RHGS vserver3.1    Capacity: 119.00 GiB(raw bricks)
      Status: UNHEALTHY(13)                198.00 MiB(raw used)
   Glusterfs: 3.7.1                         50.00 GiB(usable from volumes)
  OverCommit: Yes               Snapshots:   1

   Nodes       :  2/  4		  Volumes:   0 Up
   Self Heal   :  2/  4		             0 Up(Degraded)
   Bricks      :  6/ 12		             1 Up(Partial)
   Connections :  0/   0                     0 Down

Volume Information
	testvol          UP(PARTIAL) - 6/12 bricks up - Distributed-Replicate
	                 Capacity: (0% used) 99.00 MiB/50.00 GiB (used/total)
	                 Snapshots: 1
	                 Self Heal:  6/12
	                 Tasks Active: None
	                 Protocols: glusterfs:on  NFS:on  SMB:on
	                 Gluster Connectivty: 0 hosts, 0 tcp connections


Status Messages
  - Cluster is UNHEALTHY
  - Volume 'testvol' is in a PARTIAL state, some data is inaccessible data, due to missing bricks
  - WARNING -> Write requests may fail against volume 'testvol'
  - Cluster node '' is down
  - Self heal daemon is down on 
  - Cluster node '' is down
  - Self heal daemon is down on 
  - Brick 10.70.47.3:/rhs/brick3/b12 in volume 'testvol' is down/unavailable
  - Brick 10.70.47.2:/rhs/brick3/b11 in volume 'testvol' is down/unavailable
  - Brick 10.70.47.3:/rhs/brick2/b8 in volume 'testvol' is down/unavailable
  - Brick 10.70.47.2:/rhs/brick2/b7 in volume 'testvol' is down/unavailable
  - Brick 10.70.47.2:/rhs/brick1/b3 in volume 'testvol' is down/unavailable
  - Brick 10.70.47.3:/rhs/brick1/b4 in volume 'testvol' is down/unavailable
  - INFO -> Not all bricks are online, so capacity provided is NOT accurate



Expected results:

The status messages should display the name of the storage node which is down.

Additional info:
Comment 3 Sachidananda Urs 2015-08-27 03:26:20 EDT
After discussions with Anil, it was decided to remove the self-heal information from the status messages and instead include the number of nodes that are down/up (an illustrative sketch of this counting approach follows the sample outputs below).

Sample output 1:

Status Messages
  - Cluster is UNHEALTHY
  - One of the nodes in the cluster is down
  - Brick 10.70.47.129:/gluster/brick1 in volume 'glustervol' is down/unavailable
  - INFO -> Not all bricks are online, so capacity provided is NOT accurate


Sample output 2:

Status Messages
  - Cluster is UNHEALTHY
  - Volume 'glustervol' is in a PARTIAL state, some data is inaccessible data, due to missing bricks
  - WARNING -> Write requests may fail against volume 'glustervol'
  - 2 nodes in the cluster are down
  - Brick 10.70.46.185:/gluster/brick1 in volume 'glustervol' is down/unavailable
  - Brick 10.70.47.129:/gluster/brick1 in volume 'glustervol' is down/unavailable
  - INFO -> Not all bricks are online, so capacity provided is NOT accurate
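
For illustration only, a minimal shell sketch of the counting approach behind the new messages, assuming the peer states are read from the output of 'gluster pool list'; gstatus itself is a Python tool and gathers this state differently, so this is not the actual fix, just an example of deriving a count instead of naming each node:

# Illustration only: count peers reported as Disconnected by the gluster CLI
# and print a single summary line instead of one line per node name.
down=$(gluster pool list | awk 'NR > 1 && $NF == "Disconnected"' | wc -l)
if [ "$down" -eq 1 ]; then
    echo "  - One of the nodes in the cluster is down"
elif [ "$down" -gt 1 ]; then
    echo "  - $down nodes in the cluster are down"
fi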
Comment 4 Anil Shah 2015-09-02 08:32:17 EDT
[root@rhs-client46 yum.repos.d]# gstatus -a
 
     Product: RHGS Server v3.1   Capacity:   2.70 TiB(raw bricks)
      Status: UNHEALTHY(4)                  67.00 MiB(raw used)
   Glusterfs: 3.7.1                          2.70 TiB(usable from volumes)
  OverCommit: No                Snapshots:   0

   Nodes       :  2/  4		  Volumes:   0 Up
   Self Heal   :  2/  4		             1 Up(Degraded)
   Bricks      :  2/  4		             0 Up(Partial)
   Connections :  4/  16                     0 Down

Volume Information
	vol0             UP(DEGRADED) - 2/4 bricks up - Distributed-Replicate
	                 Capacity: (0% used) 67.00 MiB/2.70 TiB (used/total)
	                 Snapshots: 0
	                 Self Heal:  2/ 4
	                 Tasks Active: None
	                 Protocols: glusterfs:on  NFS:on  SMB:on
	                 Gluster Connectivty: 4 hosts, 16 tcp connections


Status Messages
  - Cluster is UNHEALTHY
  - 2 nodes in the cluster are down
  - Brick 10.70.36.71:/rhs/brick1/b02 in volume 'vol0' is down/unavailable
  - Brick 10.70.36.46:/rhs/brick1/b03 in volume 'vol0' is down/unavailable
  - INFO -> Not all bricks are online, so capacity provided is NOT accurate


Bug verified on build glusterfs-3.7.1-14.el7rhgs.x86_64

[root@rhs-client46 yum.repos.d]# gstatus --version
gstatus 0.65
Comment 6 errata-xmlrpc 2015-10-05 03:23:49 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html
