Bug 1254514
| Summary: | gstatus: Status message doesn't show the storage node name which is down | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Anil Shah <ashah> |
| Component: | gstatus | Assignee: | Sachidananda Urs <surs> |
| Status: | CLOSED ERRATA | QA Contact: | Anil Shah <ashah> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.1 | CC: | asrivast, byarlaga, surs, vagarwal |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.1.1 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | gstatus-0.65-1 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-10-05 07:23:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1251815 | ||
After discussions with Anil it was decided to remove the self heal information and include the number of nodes that are down/up (a rough CLI approximation of that count is sketched after the verification output below).

Sample output 1:

Status Messages
- Cluster is UNHEALTHY
- One of the nodes in the cluster is down
- Brick 10.70.47.129:/gluster/brick1 in volume 'glustervol' is down/unavailable
- INFO -> Not all bricks are online, so capacity provided is NOT accurate

Sample output 2:

Status Messages
- Cluster is UNHEALTHY
- Volume 'glustervol' is in a PARTIAL state, some data is inaccessible data, due to missing bricks
- WARNING -> Write requests may fail against volume 'glustervol'
- 2 nodes in the cluster are down
- Brick 10.70.46.185:/gluster/brick1 in volume 'glustervol' is down/unavailable
- Brick 10.70.47.129:/gluster/brick1 in volume 'glustervol' is down/unavailable
- INFO -> Not all bricks are online, so capacity provided is NOT accurate

[root@rhs-client46 yum.repos.d]# gstatus -a
Product: RHGS Server v3.1 Capacity: 2.70 TiB(raw bricks)
Status: UNHEALTHY(4) 67.00 MiB(raw used)
Glusterfs: 3.7.1 2.70 TiB(usable from volumes)
OverCommit: No Snapshots: 0
Nodes : 2/ 4 Volumes: 0 Up
Self Heal : 2/ 4 1 Up(Degraded)
Bricks : 2/ 4 0 Up(Partial)
Connections : 4/ 16 0 Down
Volume Information
vol0 UP(DEGRADED) - 2/4 bricks up - Distributed-Replicate
Capacity: (0% used) 67.00 MiB/2.70 TiB (used/total)
Snapshots: 0
Self Heal: 2/ 4
Tasks Active: None
Protocols: glusterfs:on NFS:on SMB:on
Gluster Connectivty: 4 hosts, 16 tcp connections
Status Messages
- Cluster is UNHEALTHY
- 2 nodes in the cluster are down
- Brick 10.70.36.71:/rhs/brick1/b02 in volume 'vol0' is down/unavailable
- Brick 10.70.36.46:/rhs/brick1/b03 in volume 'vol0' is down/unavailable
- INFO -> Not all bricks are online, so capacity provided is NOT accurate
Bug verified on build glusterfs-3.7.1-14.el7rhgs.x86_64
[root@rhs-client46 yum.repos.d]# gstatus --version
gstatus 0.65
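For context, the aggregate "N nodes in the cluster are down" line that the fixed gstatus prints can be approximated with the stock gluster CLI. The sketch below, run from any node that is still up, only illustrates the idea; it is not gstatus's actual implementation.

# Count peers that are not connected; 'gluster pool list' prints
# UUID, Hostname and State for every node in the trusted storage pool.
down_count=$(gluster pool list | awk 'NR > 1 && $3 != "Connected"' | wc -l)

if [ "$down_count" -eq 1 ]; then
    echo "  One of the nodes in the cluster is down"
elif [ "$down_count" -gt 1 ]; then
    echo "  $down_count nodes in the cluster are down"
fi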
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html
Description of problem:
When one of the storage nodes of the cluster is down, running the gstatus command doesn't show the name of the node which is down in the Status message.

Version-Release number of selected component (if applicable):
[root@localhost ~]# gstatus --version
gstatus 0.64
[root@localhost ~]# rpm -qa | grep glusterfs
glusterfs-api-3.7.1-11.el7rhgs.x86_64
glusterfs-cli-3.7.1-11.el7rhgs.x86_64
glusterfs-libs-3.7.1-11.el7rhgs.x86_64
glusterfs-client-xlators-3.7.1-11.el7rhgs.x86_64
glusterfs-server-3.7.1-11.el7rhgs.x86_64
glusterfs-rdma-3.7.1-11.el7rhgs.x86_64
glusterfs-3.7.1-11.el7rhgs.x86_64
glusterfs-fuse-3.7.1-11.el7rhgs.x86_64
glusterfs-geo-replication-3.7.1-11.el7rhgs.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a 6x2 distribute-replicate volume.
2. Mount the volume as a FUSE mount on a client.
3. Bring down one of the storage nodes and check gstatus, e.g. gstatus -a (a hedged command sketch of these steps is included at the end of this report).

Actual results:
The status message doesn't show the name of the storage node which is down.

[root@knightandday ~]# gstatus -a
Product: RHGS vserver3.1 Capacity: 119.00 GiB(raw bricks)
Status: UNHEALTHY(13) 198.00 MiB(raw used)
Glusterfs: 3.7.1 50.00 GiB(usable from volumes)
OverCommit: Yes Snapshots: 1
Nodes : 2/ 4 Volumes: 0 Up
Self Heal : 2/ 4 0 Up(Degraded)
Bricks : 6/ 12 1 Up(Partial)
Connections : 0/ 0 0 Down
Volume Information
testvol UP(PARTIAL) - 6/12 bricks up - Distributed-Replicate
Capacity: (0% used) 99.00 MiB/50.00 GiB (used/total)
Snapshots: 1
Self Heal: 6/12
Tasks Active: None
Protocols: glusterfs:on NFS:on SMB:on
Gluster Connectivty: 0 hosts, 0 tcp connections
Status Messages
- Cluster is UNHEALTHY
- Volume 'testvol' is in a PARTIAL state, some data is inaccessible data, due to missing bricks
- WARNING -> Write requests may fail against volume 'testvol'
- Cluster node '' is down
- Self heal daemon is down on
- Cluster node '' is down
- Self heal daemon is down on
- Brick 10.70.47.3:/rhs/brick3/b12 in volume 'testvol' is down/unavailable
- Brick 10.70.47.2:/rhs/brick3/b11 in volume 'testvol' is down/unavailable
- Brick 10.70.47.3:/rhs/brick2/b8 in volume 'testvol' is down/unavailable
- Brick 10.70.47.2:/rhs/brick2/b7 in volume 'testvol' is down/unavailable
- Brick 10.70.47.2:/rhs/brick1/b3 in volume 'testvol' is down/unavailable
- Brick 10.70.47.3:/rhs/brick1/b4 in volume 'testvol' is down/unavailable
- INFO -> Not all bricks are online, so capacity provided is NOT accurate

Expected results:
The status message should display the name of the storage node which is down.

Additional info:
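For reference, here is a hedged sketch of the reproduction steps above. The server names, brick paths, volume name and mount point are placeholders for illustration, not the exact hosts from this report.

# 1. Create a 6x2 distribute-replicate volume (12 bricks, replica 2);
#    consecutive brick pairs form the replica sets.
gluster volume create testvol replica 2 \
    server1:/rhs/brick1/b1  server2:/rhs/brick1/b2 \
    server3:/rhs/brick1/b3  server4:/rhs/brick1/b4 \
    server1:/rhs/brick2/b5  server2:/rhs/brick2/b6 \
    server3:/rhs/brick2/b7  server4:/rhs/brick2/b8 \
    server1:/rhs/brick3/b9  server2:/rhs/brick3/b10 \
    server3:/rhs/brick3/b11 server4:/rhs/brick3/b12
gluster volume start testvol

# 2. On the client, mount the volume over FUSE.
mount -t glusterfs server1:/testvol /mnt/testvol

# 3. Bring one storage node down (power it off or stop its gluster services),
#    then check the cluster state from a node that is still up.
gstatus -a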