Description of problem: When server quorum is lost, glusterd sometimes does not kill the brick (glusterfsd) processes, and when quorum is regained it sometimes does not restart the killed bricks. The behavior appears random.

Steps to reproduce:

1. Enable server quorum: gluster volume set dist cluster.server-quorum-type server
2. Set the quorum ratio: gluster volume set all cluster.server-quorum-ratio 80
3. On one of the servers, drop all network traffic except ssh, so that we retain ssh access to the machine:
-----------------
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT
iptables -A INPUT -j DROP; iptables -A OUTPUT -j DROP
-----------------
4. Check the glusterfsd processes on all the machines: on some machines the glusterfsd processes are not killed even though quorum is lost.
5. Restore the network settings: iptables -F
6. At this point all the killed bricks are expected to be brought back up. Sometimes this does not happen.

The steps may have to be repeated a few times to observe the failure; the behavior is somewhat random.
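With cluster.server-quorum-ratio set to 80, glusterd keeps bricks running only while at least 80% of the peers in the cluster are reachable. A minimal sketch of that decision is below; the helper name quorum_met is hypothetical, and whether glusterd uses >= or a strict > at the boundary is not verified here.

```shell
#!/bin/bash
# Sketch of the server-quorum decision with ratio 80: bricks stay up
# only while active/total peers >= 80%.  quorum_met is a hypothetical
# helper, not glusterd's actual function.
quorum_met() {
  local active=$1 total=$2 ratio=$3
  # Integer comparison avoids floating point: active/total >= ratio/100
  [ $((active * 100)) -ge $((total * ratio)) ]
}

# With 4 peers and ratio 80, losing even a single peer drops quorum:
quorum_met 4 4 80 && echo "4/4 up: quorum met"
quorum_met 3 4 80 || echo "3/4 up: quorum lost"
```

This illustrates why isolating one server out of a small cluster at ratio 80 should trigger the kill on the isolated node, and why restoring the network (iptables -F) should bring quorum, and the bricks, back.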
Created attachment 684900 [details] sosreport for the failed instance
This was tested on Update 4. Installed RPMs:
glusterfs-fuse-3.3.0.5rhs-40.el6rhs.x86_64
vdsm-gluster-4.9.6-17.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-account-1.4.8-4.el6.noarch
glusterfs-3.3.0.5rhs-40.el6rhs.x86_64
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-server-3.3.0.5rhs-40.el6rhs.x86_64
glusterfs-rdma-3.3.0.5rhs-40.el6rhs.x86_64
gluster-swift-object-1.4.8-4.el6.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-proxy-1.4.8-4.el6.noarch
glusterfs-geo-replication-3.3.0.5rhs-40.el6rhs.x86_64
Per Feb-06 bug triage meeting, targeting for 2.1.0.
The product version of Red Hat Storage on which this issue was reported has reached End Of Life (EOL) [1], hence this bug report is being closed. If the issue is still observed on a current version of Red Hat Storage, please file a new bug report on the current version. [1] https://rhn.redhat.com/errata/RHSA-2014-0821.html