Bug 902646 - Quorum: sometimes doesn't start the glusterfs servers
Summary: Quorum: sometimes doesn't start the glusterfs servers
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: Sachidananda Urs
URL:
Whiteboard: glusterd
Depends On:
Blocks:
 
Reported: 2013-01-22 07:13 UTC by Sachidananda Urs
Modified: 2015-03-23 07:40 UTC (History)
3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments (Terms of Use)
sosreport for the failed instance (3.81 MB, application/x-xz)
2013-01-22 07:15 UTC, Sachidananda Urs

Description Sachidananda Urs 2013-01-22 07:13:49 UTC
Description of problem:

When server quorum is lost, glusterd sometimes doesn't kill the brick (glusterfsd) processes. And when quorum is regained, it sometimes doesn't restart the processes it killed. The behavior is random.

Steps to reproduce:

Enable quorum: gluster volume set dist cluster.server-quorum-type server
Set quorum ratio: gluster volume set all cluster.server-quorum-ratio 80

On one of the servers, block all network traffic except SSH, so that we still have SSH access to the machine:

-----------------
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT
iptables -A INPUT -j DROP; iptables -A OUTPUT -j DROP
-----------------

Check the glusterfsd processes on all the machines; on some machines the glusterfsd processes are not killed.
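A quick way to check the brick process state on each node (a sketch; the `pgrep` invocation is my suggestion, not a command from the report):

```shell
# Sketch: list brick (glusterfsd) processes on this node.
# After quorum loss they should have been killed; after connectivity
# is restored and quorum regained, they should be running again.
pgrep -a glusterfsd || echo "no glusterfsd processes running"
```

Alternatively, `gluster volume status` on a node that still has quorum shows whether each brick is reported online.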

Now restore the network settings...

iptables -F

At this point the expectation is that all the killed brick processes are brought back up. Sometimes this doesn't happen.

These steps may have to be repeated a few times to notice the behavior, as it is somewhat random.

Comment 2 Sachidananda Urs 2013-01-22 07:15:31 UTC
Created attachment 684900 [details]
sosreport for the failed instance

Comment 3 Sachidananda Urs 2013-01-22 09:15:06 UTC
This is tested on update 4. And the installed rpms:


glusterfs-fuse-3.3.0.5rhs-40.el6rhs.x86_64
vdsm-gluster-4.9.6-17.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-account-1.4.8-4.el6.noarch
glusterfs-3.3.0.5rhs-40.el6rhs.x86_64
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-server-3.3.0.5rhs-40.el6rhs.x86_64
glusterfs-rdma-3.3.0.5rhs-40.el6rhs.x86_64
gluster-swift-object-1.4.8-4.el6.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-proxy-1.4.8-4.el6.noarch
glusterfs-geo-replication-3.3.0.5rhs-40.el6rhs.x86_64

Comment 5 Scott Haines 2013-02-06 20:06:46 UTC
Per Feb-06 bug triage meeting, targeting for 2.1.0.

Comment 7 Vivek Agarwal 2015-03-23 07:39:30 UTC
The product version of Red Hat Storage on which this issue was reported has reached End Of Life (EOL) [1]; this bug report is therefore being closed. If the issue is still observed on a current version of Red Hat Storage, please file a new bug report against the current version.
[1] https://rhn.redhat.com/errata/RHSA-2014-0821.html


