Created attachment 617438
glusterd log file from server3

Description of problem:
-----------------------
In a distribute-replicate volume (2x2) with 4 servers and 1 brick on each server, when the 2 servers forming a replicate pair (replicate-0) are powered off, subsequently executing the "gluster volume status" command on server3 reports "operation failed".

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs 3.3.0rhs built on Sep 10 2012 00:49:11
(glusterfs-server-3.3.0rhs-28.el6rhs.x86_64)

How reproducible:
------------------
Often

Steps to Reproduce:
------------------
1. Create a distribute-replicate (2x2) volume with 4 server nodes and one brick on each server.
2. Start the volume.
3. Execute "gluster volume status <vol_name>". This should output the brick process, self-heal daemon process and NFS server process information for the volume <vol_name>.
4. Power off server1 and server2.
5. From server3 or server4, execute "gluster volume status <vol_name>".

Actual results:
---------------
[09/26/12 - 03:13:49 root@gqac031 ~]# gluster v status
operation failed
Failed to get names of volumes

Expected results:
----------------
The command should output the brick process, self-heal daemon process and NFS server process information for the processes running on server3 and server4.

Output of command execution on server1, server2, server3 and server4:
-------------------------------------------------------------------------
Server1:-
----------
[root@gqac010 ~]# gluster v create dstore1 replica 2 `hostname`:/home/export100 gqac011.sbu.lab.eng.bos.redhat.com:/home/export100 gqac031.sbu.lab.eng.bos.redhat.com:/home/export100 gqac032.sbu.lab.eng.bos.redhat.com:/home/export100
Creation of volume dstore1 has been successful. Please start the volume to access data.
[root@gqac010 ~]# gluster v start dstore1
Starting volume dstore1 has been successful

[root@gqac010 ~]# gluster v status
Volume dstore is not started

Status of volume: dstore1
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   Y       14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   Y       17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   Y       30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   Y       32558
NFS Server on localhost                                 38467   Y       14616
Self-heal Daemon on localhost                           N/A     Y       14622
NFS Server on 10.16.157.90                              38467   Y       30516
Self-heal Daemon on 10.16.157.90                        N/A     Y       30522
NFS Server on 10.16.157.30                              38467   Y       17496
Self-heal Daemon on 10.16.157.30                        N/A     Y       17502
NFS Server on 10.16.157.93                              38467   Y       32564
Self-heal Daemon on 10.16.157.93                        N/A     Y       32571

[root@gqac010 ~]# kill -KILL 14610

[root@gqac010 ~]# gluster v status
Volume dstore is not started

Status of volume: dstore1
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gqac010.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   N       14610
Brick gqac011.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   N       17490
Brick gqac031.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   Y       30510
Brick gqac032.sbu.lab.eng.bos.redhat.com:/home/export10
0                                                       24012   Y       32558
NFS Server on localhost                                 38467   Y       14616
Self-heal Daemon on localhost                           N/A     Y       14622
NFS Server on 10.16.157.30                              38467   Y       17496
Self-heal Daemon on 10.16.157.30                        N/A     Y       17502
NFS Server on 10.16.157.90                              38467   Y       30516
Self-heal Daemon on 10.16.157.90                        N/A     Y       30522
NFS Server on 10.16.157.93                              38467   Y       32564
Self-heal Daemon on 10.16.157.93                        N/A     Y       32571

[root@gqac010 ~]# poweroff

Broadcast message from root.lab.eng.bos.redhat.com
        (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[root@gqac010 ~]# Connection to 10.16.157.27 closed by remote host.
Connection to 10.16.157.27 closed.
[shwetha@Shwetha-Laptop ~]$ ssh root@10.16.157.27
ssh: connect to host 10.16.157.27 port 22: Connection timed out

Server2 :-
---------
[09/26/12 - 03:07:24 root@gqac011 ~]# kill -KILL 17490
[09/26/12 - 03:07:41 root@gqac011 ~]# poweroff

Broadcast message from root.lab.eng.bos.redhat.com
        (/dev/pts/0) at 3:08 ...

The system is going down for power off NOW!
[09/26/12 - 03:08:10 root@gqac011 ~]# Connection to 10.16.157.30 closed by remote host.
Connection to 10.16.157.30 closed.

Server3:-
---------
[09/26/12 - 03:08:03 root@gqac031 ~]# gluster v status
^C
[09/26/12 - 03:08:56 root@gqac031 ~]# gluster v status
operation failed
Failed to get names of volumes
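
For anyone re-running the reproduction steps, the sequence can be sketched as a small shell helper. This is only a sketch under stated assumptions: the host names server1 to server4 are placeholders for the four nodes, the volume name and brick path default to the ones used in this report, and the DRY_RUN switch and function names are hypothetical conveniences, not part of gluster. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Reproduction sketch. server1..server4 are placeholder host names;
# DRY_RUN, run, create_and_start and check_status are hypothetical helpers.
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute them
VOL="${VOL:-dstore1}"              # volume name used in this report
BRICK="${BRICK:-/home/export100}"  # brick path used in this report

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1-3: run on server1 while all four nodes are up.
create_and_start() {
    run gluster volume create "$VOL" replica 2 \
        "server1:$BRICK" "server2:$BRICK" "server3:$BRICK" "server4:$BRICK"
    run gluster volume start "$VOL"
    run gluster volume status "$VOL"
}

# Step 5: run on server3 or server4 after server1 and server2
# (the replicate-0 pair) have been powered off in step 4.
check_status() {
    run gluster volume status "$VOL"
}
```

Source the file and call create_and_start on server1, power off server1 and server2 by hand, then call check_status on server3 or server4 with DRY_RUN=0 to see whether "operation failed" is reported.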
*** Bug 861539 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 852147 ***