Bug 860568

Summary: "gluster volume status" fails to report the volume status information when 2 nodes (forming a replica pair) in a dis-rep volume is powered off
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterd
Version: 2.0
Status: CLOSED DUPLICATE
Severity: high
Priority: high
Type: Bug
Reporter: spandura
Assignee: Kaushal <kaushal>
QA Contact: spandura
CC: amarts, rhs-bugs, sac, shaines, vbellur
Doc Type: Bug Fix
Hardware: Unspecified
OS: Unspecified
Last Closed: 2012-11-29 08:46:54 UTC

Attachments:
glusterd log file from server3

Description spandura 2012-09-26 07:52:50 UTC
Created attachment 617438 [details]
glusterd log file from server3

Description of problem:
-----------------------
In a distribute-replicate volume (2x2) with 4 servers and 1 brick on each server, when the 2 servers forming a replica pair (replicate-0) are powered off, subsequently executing the "gluster volume status" command on server3 reports "operation failed".

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs 3.3.0rhs built on Sep 10 2012 00:49:11
(glusterfs-server-3.3.0rhs-28.el6rhs.x86_64)


How reproducible:
------------------
Often

Steps to Reproduce:
------------------
1. Create a distribute-replicate (2x2) volume: 4 server nodes with one brick on each server.
2. Start the volume.
3. Execute "gluster volume status <vol_name>". This should output the brick process, self-heal daemon, and NFS server process information for the volume <vol_name>.
4. Power off server1 and server2.
5. From server3 or server4, execute "gluster volume status <vol_name>" (see the command sketch after this list).
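
Condensed into shell commands, the reproduction looks roughly as follows. This is a sketch only: the volume name, brick paths, and hostnames are the ones used in the transcript later in this report.

# On server1 (gqac010): create a 2x2 distribute-replicate volume, one brick per server
gluster volume create dstore1 replica 2 \
    gqac010.sbu.lab.eng.bos.redhat.com:/home/export100 \
    gqac011.sbu.lab.eng.bos.redhat.com:/home/export100 \
    gqac031.sbu.lab.eng.bos.redhat.com:/home/export100 \
    gqac032.sbu.lab.eng.bos.redhat.com:/home/export100
gluster volume start dstore1

# Baseline check: all bricks, NFS servers and self-heal daemons should be listed
gluster volume status dstore1

# Power off the replicate-0 pair (run on gqac010 and gqac011)
poweroff

# On server3 (gqac031) or server4 (gqac032): expected to list the surviving
# processes, but instead fails with "operation failed"
gluster volume status dstore1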

Actual results:
---------------
[09/26/12 - 03:13:49 root@gqac031 ~]# gluster v status
operation failed
 
Failed to get names of volumes

Expected results:
----------------
Should output the brick process, self-heal daemon, and NFS server process information for the processes running on server3 and server4.
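
For illustration, a rough sketch of the output one would expect on server3, assembled from the full status output captured below (the port and PID values are the ones from that transcript; whether the bricks on the powered-off nodes would also be listed as offline is not specified here):

Status of volume: dstore1
Gluster process                                             Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    32558
NFS Server on localhost                                     38467    Y    30516
Self-heal Daemon on localhost                               N/A      Y    30522
NFS Server on 10.16.157.93                                  38467    Y    32564
Self-heal Daemon on 10.16.157.93                            N/A      Y    32571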

Output of the commands executed on server1, server2, server3 and server4:
-------------------------------------------------------------------------

Server1:-
----------
[root@gqac010 ~]# gluster v create dstore1 replica 2 `hostname`:/home/export100 gqac011.sbu.lab.eng.bos.redhat.com:/home/export100 gqac031.sbu.lab.eng.bos.redhat.com:/home/export100 gqac032.sbu.lab.eng.bos.redhat.com:/home/export100
Creation of volume dstore1 has been successful. Please start the volume to access data.

[root@gqac010 ~]# gluster v start dstore1
Starting volume dstore1 has been successful

[root@gqac010 ~]# gluster v status
Volume dstore is not started
 
Status of volume: dstore1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    32558
NFS Server on localhost                    38467    Y    14616
Self-heal Daemon on localhost                N/A    Y    14622
NFS Server on 10.16.157.90                38467    Y    30516
Self-heal Daemon on 10.16.157.90            N/A    Y    30522
NFS Server on 10.16.157.30                38467    Y    17496
Self-heal Daemon on 10.16.157.30            N/A    Y    17502
NFS Server on 10.16.157.93                38467    Y    32564
Self-heal Daemon on 10.16.157.93            N/A    Y    32571
 
[root@gqac010 ~]# kill -KILL 14610

[root@gqac010 ~]# gluster v status
Volume dstore is not started
 
Status of volume: dstore1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export100    24012    N    14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export100    24012    N    17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export100    24012    Y    32558
NFS Server on localhost                    38467    Y    14616
Self-heal Daemon on localhost                N/A    Y    14622
NFS Server on 10.16.157.30                38467    Y    17496
Self-heal Daemon on 10.16.157.30            N/A    Y    17502
NFS Server on 10.16.157.90                38467    Y    30516
Self-heal Daemon on 10.16.157.90            N/A    Y    30522
NFS Server on 10.16.157.93                38467    Y    32564
Self-heal Daemon on 10.16.157.93            N/A    Y    32571
 
[root@gqac010 ~]# poweroff

Broadcast message from root.lab.eng.bos.redhat.com
    (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[root@gqac010 ~]# Connection to 10.16.157.27 closed by remote host.
Connection to 10.16.157.27 closed.
[shwetha@Shwetha-Laptop ~]$ ssh root@10.16.157.27
ssh: connect to host 10.16.157.27 port 22: Connection timed out


Server2 :-
---------
[09/26/12 - 03:07:24 root@gqac011 ~]# kill -KILL 17490
[09/26/12 - 03:07:41 root@gqac011 ~]# poweroff

Broadcast message from root.lab.eng.bos.redhat.com
    (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[09/26/12 - 03:08:10 root@gqac011 ~]# Connection to 10.16.157.30 closed by remote host.
Connection to 10.16.157.30 closed.


Server3:-
---------
[09/26/12 - 03:08:03 root@gqac031 ~]# gluster v status
^C
[09/26/12 - 03:08:56 root@gqac031 ~]# gluster v status
operation failed
 
Failed to get names of volumes

Comment 2 Kaushal 2012-11-26 06:44:02 UTC
*** Bug 861539 has been marked as a duplicate of this bug. ***

Comment 3 Amar Tumballi 2012-11-29 08:46:54 UTC

*** This bug has been marked as a duplicate of bug 852147 ***