Bug 866758
Summary: | gluster volume status all "Failed to get names of volumes" when peer in volume is restarted during transaction | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Scott Haines <shaines>
Component: | glusterfs | Assignee: | Amar Tumballi <amarts>
Status: | CLOSED ERRATA | QA Contact: | spandura
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 2.0 | CC: | cww, kristof.wevers, ksquizza, ndevos, rhs-bugs, shaines, spandura, ujjwala, vbellur, vinaraya, vraman
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | glusterfs-3.3.0.3rhs-33.el6rhs | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | 858333 | Environment: |
Last Closed: | 2012-11-12 18:47:38 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 858333 | |
Bug Blocks: | | |
Description
Scott Haines
2012-10-16 04:38:46 UTC
Verified the bug by executing the steps given to recreate the problem. The bug no longer exists. For 10-13 minutes after powering off one of the servers, gluster CLI commands fail with the error message "operation failed". More than 10 minutes after powering off the machine, gluster CLI commands succeed again.

Server command execution output:

```
[root@darrel ~]# gluster --version
glusterfs 3.3.0.3rhs built on Oct 10 2012 09:16:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

[root@darrel ~]# uname -a
Linux darrel.lab.eng.blr.redhat.com 2.6.32-220.28.1.el6.x86_64 #1 SMP Wed Oct 3 12:26:28 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
```

Server1:

```
[root@darrel ~]# service glusterd start
Starting glusterd:                                         [  OK  ]
[root@darrel ~]# service glusterd status
glusterd (pid 2811) is running...
[root@darrel ~]# hostname
darrel.lab.eng.blr.redhat.com
[root@darrel ~]# gluster peer probe king.lab.eng.blr.redhat.com
Probe successful
[root@darrel ~]# gluster peer status
Number of Peers: 1

Hostname: king.lab.eng.blr.redhat.com
Port: 24007
Uuid: 0f7403e2-86dd-4347-b168-5181f4ff1c31
State: Peer in Cluster (Connected)
[root@darrel ~]# gluster volume create rep replica 2 darrel.lab.eng.blr.redhat.com:/home/export1 king.lab.eng.blr.redhat.com:/home/export1
Creation of volume rep has been successful. Please start the volume to access data.
```
```
[root@darrel ~]# gluster v info rep

Volume Name: rep
Type: Replicate
Volume ID: 665bf1a7-4289-471f-9647-e1144cd1242d
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: darrel.lab.eng.blr.redhat.com:/home/export1
Brick2: king.lab.eng.blr.redhat.com:/home/export1
[root@darrel ~]# gluster v start rep
Starting volume rep has been successful
[root@darrel ~]# gluster v status rep
Status of volume: rep
Gluster process                                        Port    Online  Pid
------------------------------------------------------------------------------
Brick darrel.lab.eng.blr.redhat.com:/home/export1      24009   Y       2915
Brick king.lab.eng.blr.redhat.com:/home/export1        24009   Y       2879
NFS Server on localhost                                38467   Y       2920
Self-heal Daemon on localhost                          N/A     Y       2926
NFS Server on king.lab.eng.blr.redhat.com              38467   Y       2884
Self-heal Daemon on king.lab.eng.blr.redhat.com        N/A     Y       2891

[root@darrel ~]# gluster v status rep
Unable to obtain volume status information.
[root@darrel ~]# gluster v status all
Status of volume: rep
Gluster process                                        Port    Online  Pid
------------------------------------------------------------------------------
Brick darrel.lab.eng.blr.redhat.com:/home/export1      24009   Y       2915
Brick king.lab.eng.blr.redhat.com:/home/export1        24009   Y       1583
NFS Server on localhost                                38467   Y       2920
Self-heal Daemon on localhost                          N/A     Y       2926
NFS Server on king.lab.eng.blr.redhat.com              38467   Y       1588
Self-heal Daemon on king.lab.eng.blr.redhat.com        N/A     Y       1594
```

Server2:

```
[root@king ~]# gluster v status
Status of volume: rep
Gluster process                                        Port    Online  Pid
------------------------------------------------------------------------------
Brick darrel.lab.eng.blr.redhat.com:/home/export1      24009   Y       2915
Brick king.lab.eng.blr.redhat.com:/home/export1        24009   Y       1583
NFS Server on localhost                                38467   Y       1588
Self-heal Daemon on localhost                          N/A     Y       1594
NFS Server on 10.70.34.115                             38467   Y       2920
Self-heal Daemon on 10.70.34.115                       N/A     Y       2926
[root@king ~]# poweroff

Broadcast message from root.eng.blr.redhat.com (/dev/pts/0) at 23:56 ...
```
```
The system is going down for power off NOW!
```

Server1:

```
[root@darrel ~]# gluster v status
[root@darrel ~]# echo $?
130
[root@darrel ~]# gluster v status all
operation failed
Failed to get names of volumes
[root@darrel ~]# gluster v heal rep info
operation failed
[root@darrel ~]# gluster v set rep stat-prefetch off
[root@darrel ~]# echo $?
255
[root@darrel ~]# gluster v status all
operation failed
Failed to get names of volumes
```

After 13 minutes on Server1:

```
[root@darrel ~]# gluster v status all
Status of volume: rep
Gluster process                                        Port    Online  Pid
------------------------------------------------------------------------------
Brick darrel.lab.eng.blr.redhat.com:/home/export1      24009   Y       2915
NFS Server on localhost                                38467   Y       2920
Self-heal Daemon on localhost                          N/A     Y       2926
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1456.html
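The verification above measured the 10-13 minute outage by rerunning `gluster v status all` by hand. A small polling helper like the following can time the recovery window instead; this is only a sketch, and the `wait_for_cmd`, timeout, and interval names are illustrative, not part of the gluster CLI:

```shell
#!/bin/sh
# Retry a command until it succeeds or a deadline passes, and report
# how long recovery took. Usage: wait_for_cmd TIMEOUT_S INTERVAL_S CMD...
wait_for_cmd() {
    timeout=$1; interval=$2; shift 2
    start=$(date +%s)
    while :; do
        if "$@" >/dev/null 2>&1; then
            # Command succeeded: print elapsed seconds since first attempt.
            echo "recovered after $(( $(date +%s) - start ))s"
            return 0
        fi
        if [ $(( $(date +%s) - start )) -ge "$timeout" ]; then
            echo "still failing after ${timeout}s"
            return 1
        fi
        sleep "$interval"
    done
}

# Against the symptom in this bug one might run (hypothetical invocation,
# requires a gluster node):
#   wait_for_cmd 900 10 gluster volume status all
```

Run right after powering off the peer, the reported elapsed time corresponds to the window during which `gluster v status all` returns "operation failed / Failed to get names of volumes".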