Bug 860568

Summary: "gluster volume status" fails to report the volume status information when 2 nodes (forming a replica pair) in a dis-rep volume is powered off
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterd
Version: 2.0
Status: CLOSED DUPLICATE
Severity: high
Priority: high
Type: Bug
Reporter: spandura
Assignee: Kaushal <kaushal>
QA Contact: spandura
CC: amarts, rhs-bugs, sac, shaines, vbellur
Doc Type: Bug Fix
Hardware: Unspecified
OS: Unspecified
Last Closed: 2012-11-29 08:46:54 UTC

Attachments:
glusterd log file from server3

Description spandura 2012-09-26 07:52:50 UTC
Created attachment 617438 [details]
glusterd log file from server3

Description of problem:
-----------------------
In a distribute-replicate volume (2x2) with 4 servers and 1 brick on each server, when the 2 servers forming a replica pair (replicate-0) are powered off, subsequently executing the "gluster volume status" command on server3 reports "operation failed".

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs 3.3.0rhs built on Sep 10 2012 00:49:11
(glusterfs-server-3.3.0rhs-28.el6rhs.x86_64)


How reproducible:
------------------
Often

Steps to Reproduce:
------------------
1. Create a distribute-replicate (2x2) volume: 4 server nodes with one brick on each server.
2. Start the volume.
3. Execute "gluster volume status <vol_name>". This should output the brick process, self-heal daemon, and NFS server process information for the volume <vol_name>.
4. Power off server1 and server2.
5. From server3 or server4, execute "gluster volume status <vol_name>" (see the command sketch after this list).
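
Condensed into shell commands, the reproduction looks roughly as follows. This is a sketch only: the volume name, brick paths, and hostnames are the ones used in the transcript later in this report.

# On server1 (gqac010): create a 2x2 distribute-replicate volume, one brick per server
gluster volume create dstore1 replica 2 \
    gqac010.sbu.lab.eng.bos.redhat.com:/home/export100 \
    gqac011.sbu.lab.eng.bos.redhat.com:/home/export100 \
    gqac031.sbu.lab.eng.bos.redhat.com:/home/export100 \
    gqac032.sbu.lab.eng.bos.redhat.com:/home/export100
gluster volume start dstore1

# Baseline check: all bricks, NFS servers and self-heal daemons should be listed
gluster volume status dstore1

# Power off the replicate-0 pair (run on gqac010 and gqac011)
poweroff

# On server3 (gqac031) or server4 (gqac032): expected to list the surviving
# processes, but instead fails with "operation failed"
gluster volume status dstore1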

Actual results:
---------------
[09/26/12 - 03:13:49 root@gqac031 ~]# gluster v status
operation failed
 
Failed to get names of volumes

Expected results:
----------------
Should output the brick process, self-heal daemon, and NFS server process information for the processes running on server3 and server4.
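
For illustration, a rough sketch of the output one would expect on server3, assembled from the full status output captured below (the port and PID values are the ones from that transcript; whether the bricks on the powered-off nodes would also be listed as offline is not specified here):

Status of volume: dstore1
Gluster process                                             Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    32558
NFS Server on localhost                                     38467    Y    30516
Self-heal Daemon on localhost                               N/A      Y    30522
NFS Server on 10.16.157.93                                  38467    Y    32564
Self-heal Daemon on 10.16.157.93                            N/A      Y    32571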

Output of the commands executed on server1, server2, server3 and server4:
-------------------------------------------------------------------------

Server1:-
----------
[root@gqac010 ~]# gluster v create dstore1 replica 2 `hostname`:/home/export100 gqac011.sbu.lab.eng.bos.redhat.com:/home/export100 gqac031.sbu.lab.eng.bos.redhat.com:/home/export100 gqac032.sbu.lab.eng.bos.redhat.com:/home/export100
Creation of volume dstore1 has been successful. Please start the volume to access data.

[root@gqac010 ~]# gluster v start dstore1
Starting volume dstore1 has been successful

[root@gqac010 ~]# gluster v status
Volume dstore is not started
 
Status of volume: dstore1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    32558
NFS Server on localhost                    38467    Y    14616
Self-heal Daemon on localhost                N/A    Y    14622
NFS Server on 10.16.157.90                38467    Y    30516
Self-heal Daemon on 10.16.157.90            N/A    Y    30522
NFS Server on 10.16.157.30                38467    Y    17496
Self-heal Daemon on 10.16.157.30            N/A    Y    17502
NFS Server on 10.16.157.93                38467    Y    32564
Self-heal Daemon on 10.16.157.93            N/A    Y    32571
 
[root@gqac010 ~]# kill -KILL 14610

[root@gqac010 ~]# gluster v status
Volume dstore is not started
 
Status of volume: dstore1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export100    24012    N    14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export100    24012    N    17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    32558
NFS Server on localhost                    38467    Y    14616
Self-heal Daemon on localhost                N/A    Y    14622
NFS Server on 10.16.157.30                38467    Y    17496
Self-heal Daemon on 10.16.157.30            N/A    Y    17502
NFS Server on 10.16.157.90                38467    Y    30516
Self-heal Daemon on 10.16.157.90            N/A    Y    30522
NFS Server on 10.16.157.93                38467    Y    32564
Self-heal Daemon on 10.16.157.93            N/A    Y    32571
 
[root@gqac010 ~]# poweroff

Broadcast message from root.lab.eng.bos.redhat.com
    (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[root@gqac010 ~]# Connection to 10.16.157.27 closed by remote host.
Connection to 10.16.157.27 closed.
[shwetha@Shwetha-Laptop ~]$ ssh root@10.16.157.27
ssh: connect to host 10.16.157.27 port 22: Connection timed out


Server2 :-
---------
[09/26/12 - 03:07:24 root@gqac011 ~]# kill -KILL 17490
[09/26/12 - 03:07:41 root@gqac011 ~]# poweroff

Broadcast message from root.lab.eng.bos.redhat.com
    (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[09/26/12 - 03:08:10 root@gqac011 ~]# Connection to 10.16.157.30 closed by remote host.
Connection to 10.16.157.30 closed.


Server3:-
---------
[09/26/12 - 03:08:03 root@gqac031 ~]# gluster v status
^C
[09/26/12 - 03:08:56 root@gqac031 ~]# gluster v status
operation failed
 
Failed to get names of volumes

Comment 2 Kaushal 2012-11-26 06:44:02 UTC
*** Bug 861539 has been marked as a duplicate of this bug. ***

Comment 3 Amar Tumballi 2012-11-29 08:46:54 UTC

*** This bug has been marked as a duplicate of bug 852147 ***