860568 – "gluster volume status" fails to report the volume status information when 2 nodes (forming a replica pair) in a dis-rep volume is powered off


Keywords:
Status:	CLOSED DUPLICATE of bug 852147
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterd
Sub Component:
Version:	2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Kaushal
QA Contact:	spandura
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	861539
Depends On:
Blocks:

Reported:	2012-09-26 07:52 UTC by spandura
Modified:	2012-11-29 08:46 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-11-29 08:46:54 UTC
Embargoed:
Dependent Products:

Attachments
glusterd log file from server3 (122.47 KB, text/x-log) 2012-09-26 07:52 UTC, spandura

Description spandura 2012-09-26 07:52:50 UTC

Created attachment 617438
glusterd log file from server3

Description of problem:
-----------------------
In a distribute-replicate volume (2x2) with 4 servers and 1 brick on each server, when the 2 servers forming a replica pair (replicate-0) are powered off, subsequently executing the "gluster volume status" command on server3 reports "operation failed".

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs 3.3.0rhs built on Sep 10 2012 00:49:11
(glusterfs-server-3.3.0rhs-28.el6rhs.x86_64)


How reproducible:
------------------
Often

Steps to Reproduce:
------------------
1. Create a distribute-replicate (2x2) volume with 4 server nodes and one brick on each server.
2. Start the volume.
3. Execute "gluster volume status <vol_name>". This should output the brick process, self-heal daemon process and NFS server process information for the volume <vol_name>.
4. Power off server1 and server2.
5. From server3 or server4, execute "gluster volume status <vol_name>".
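
The steps above could be scripted roughly as follows. This is only a sketch: the function names (run, repro_setup, repro_check) and the DRY_RUN switch are hypothetical helpers, not part of gluster, and the volume name, brick directory and server names are placeholders to be supplied by the caller. By default the commands are only printed. Powering off server1 and server2 (step 4) is left as a manual step.

```shell
#!/bin/sh
# Sketch of the reproduction steps. All arguments (volume name, brick
# directory, server names) are placeholders chosen by the caller.
# With DRY_RUN=1 (the default) commands are only printed, not executed.

DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro_setup() {
    # Steps 1-3: create a 2x2 distribute-replicate volume, start it,
    # and check its status.
    # Usage: repro_setup VOL BRICK_DIR SERVER1 SERVER2 SERVER3 SERVER4
    vol="$1"; dir="$2"; shift 2
    bricks=""
    for host in "$@"; do
        bricks="$bricks $host:$dir"
    done
    # With "replica 2", consecutive bricks form a replica pair, so
    # SERVER1 and SERVER2 end up in the same pair.
    # $bricks is intentionally unquoted so it splits into one argument per brick.
    run gluster volume create "$vol" replica 2 $bricks
    run gluster volume start "$vol"
    run gluster volume status "$vol"
}

repro_check() {
    # Step 5: run on server3 or server4 after server1 and server2
    # have been powered off.
    # Usage: repro_check VOL
    run gluster volume status "$1"
}
```

For example, "repro_setup dstore1 /home/export100 server1 server2 server3 server4" prints the three gluster commands without running them; setting DRY_RUN=0 would execute them on a node where the gluster CLI is installed.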

Actual results:
---------------
[09/26/12 - 03:13:49 root@gqac031 ~]# gluster v status
operation failed
 
Failed to get names of volumes

Expected results:
----------------
The command should output the brick process, self-heal daemon process and NFS server process information for the processes running on server3 and server4.
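
One possible way to check the command output against this expectation from a script is sketched below. The function name status_summary is hypothetical, and the parsing assumes the column layout seen in this report (Port, Online, Pid as the last three fields of a process row, with long brick names wrapped onto a second line); other gluster versions may format the output differently.

```shell
#!/bin/sh
# Sketch: summarize "gluster volume status" output read from stdin.
# Assumes process rows end with: <port or N/A> <Y or N> <pid or N/A>.

status_summary() {
    awk '
        # The failure message seen in this report.
        /^operation failed/ { failed = 1 }

        # A process row (brick, NFS server or self-heal daemon).
        # Wrapped brick names put the columns on the continuation line,
        # which still ends with the same three fields.
        NF >= 3 && $(NF-1) ~ /^[YN]$/ && $NF ~ /^([0-9]+|N\/A)$/ {
            if ($(NF-1) == "Y") online++; else offline++
        }

        END {
            if (failed) { print "status command failed"; exit 1 }
            printf "online=%d offline=%d\n", online, offline
        }
    '
}
```

It could be used as "gluster volume status <vol_name> | status_summary". A non-zero exit status indicates that the "operation failed" message was seen rather than a status table.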

Output of the commands executed on server1, server2, server3 and server4:
-------------------------------------------------------------------------

Server1:-
----------
[root@gqac010 ~]# gluster v create dstore1 replica 2 `hostname`:/home/export100 gqac011.sbu.lab.eng.bos.redhat.com:/home/export100 gqac031.sbu.lab.eng.bos.redhat.com:/home/export100 gqac032.sbu.lab.eng.bos.redhat.com:/home/export100
Creation of volume dstore1 has been successful. Please start the volume to access data.

[root@gqac010 ~]# gluster v start dstore1
Starting volume dstore1 has been successful

[root@gqac010 ~]# gluster v status
Volume dstore is not started
 
Status of volume: dstore1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    Y    14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    Y    17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    Y    32558
NFS Server on localhost                    38467    Y    14616
Self-heal Daemon on localhost                N/A    Y    14622
NFS Server on 10.16.157.90                38467    Y    30516
Self-heal Daemon on 10.16.157.90            N/A    Y    30522
NFS Server on 10.16.157.30                38467    Y    17496
Self-heal Daemon on 10.16.157.30            N/A    Y    17502
NFS Server on 10.16.157.93                38467    Y    32564
Self-heal Daemon on 10.16.157.93            N/A    Y    32571
 
[root@gqac010 ~]# kill -KILL 14610

[root@gqac010 ~]# gluster v status
Volume dstore is not started
 
Status of volume: dstore1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    N    14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    N    17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export10
0                            24012    Y    32558
NFS Server on localhost                    38467    Y    14616
Self-heal Daemon on localhost                N/A    Y    14622
NFS Server on 10.16.157.30                38467    Y    17496
Self-heal Daemon on 10.16.157.30            N/A    Y    17502
NFS Server on 10.16.157.90                38467    Y    30516
Self-heal Daemon on 10.16.157.90            N/A    Y    30522
NFS Server on 10.16.157.93                38467    Y    32564
Self-heal Daemon on 10.16.157.93            N/A    Y    32571
 
[root@gqac010 ~]# poweroff

Broadcast message from root@gqac010.sbu.lab.eng.bos.redhat.com
    (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[root@gqac010 ~]# Connection to 10.16.157.27 closed by remote host.
Connection to 10.16.157.27 closed.
[shwetha@Shwetha-Laptop ~]$ ssh root@10.16.157.27
ssh: connect to host 10.16.157.27 port 22: Connection timed out


Server2 :-
---------
[09/26/12 - 03:07:24 root@gqac011 ~]# kill -KILL 17490
[09/26/12 - 03:07:41 root@gqac011 ~]# poweroff

Broadcast message from root@gqac011.sbu.lab.eng.bos.redhat.com
    (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[09/26/12 - 03:08:10 root@gqac011 ~]# Connection to 10.16.157.30 closed by remote host.
Connection to 10.16.157.30 closed.


Server3:-
---------
[09/26/12 - 03:08:03 root@gqac031 ~]# gluster v status
^C
[09/26/12 - 03:08:56 root@gqac031 ~]# gluster v status
operation failed
 
Failed to get names of volumes

Comment 2 Kaushal 2012-11-26 06:44:02 UTC

*** Bug 861539 has been marked as a duplicate of this bug. ***

Comment 3 Amar Tumballi 2012-11-29 08:46:54 UTC


*** This bug has been marked as a duplicate of bug 852147 ***
