Bug 902646

Summary: Quorum: sometimes doesn't start the glusterfs servers
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Sachidananda Urs <sac>
Component: glusterd Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WONTFIX QA Contact: Sachidananda Urs <surs>
Severity: unspecified Docs Contact:
Priority: medium    
Version: 2.0 CC: rhs-bugs, rwheeler, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: glusterd
Doc Type: Bug Fix
Type: Bug
Attachments:
Description Flags
sosreport for the failed instance none

Description Sachidananda Urs 2013-01-22 07:13:49 UTC
Description of problem:

When server quorum is lost, the glusterfs servers (glusterfsd processes) are sometimes not killed. When quorum is regained, the killed servers are sometimes not started again. The behavior is intermittent.

Steps to reproduce:

Enable quorum (dist is the volume name): gluster volume set dist cluster.server-quorum-type server
Set quorum ratio: gluster volume set all cluster.server-quorum-ratio 80
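
As a rough aid for reading the ratio above, the sketch below computes how many servers must stay connected for a given pool size. It assumes quorum is met when the number of connected servers is at least the ratio applied to the pool size, rounded up. That rounding rule and the function name quorum_needed are assumptions, not taken from this report.

```shell
# quorum_needed TOTAL RATIO
# Prints the smallest number of connected servers that satisfies RATIO percent
# of TOTAL, rounding up (assumed rule, not confirmed by this report).
quorum_needed() {
    total=$1
    ratio=$2
    # Integer ceiling of total * ratio / 100.
    echo $(( (total * ratio + 99) / 100 ))
}

for n in 2 3 4 5 10; do
    echo "pool of $n servers: $(quorum_needed "$n" 80) must stay connected"
done
```

Under this assumption, with a ratio of 80 a pool of fewer than five servers loses quorum as soon as one server is cut off, so isolating a single server would be expected to stop the bricks on every machine in such a pool.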

On one of the servers, disable all network traffic except ssh, so that we still have ssh access to the machine.

-----------------
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT
iptables -A INPUT -j DROP; iptables -A OUTPUT -j DROP
-----------------

Check the glusterfsd processes on all the machines. On some machines the glusterfsd processes are not killed.
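
This check can be scripted on each machine. A minimal sketch, assuming pgrep is available; the function name brick_status is hypothetical:

```shell
# brick_status [NAME]
# Reports whether any process named NAME (default: glusterfsd) is running locally.
brick_status() {
    name=${1:-glusterfsd}
    if pids=$(pgrep -x "$name" 2>/dev/null); then
        # pgrep prints one PID per line; the unquoted echo joins them on one line.
        echo "$name: running (pids: $(echo $pids))"
    else
        echo "$name: not running"
    fi
}

brick_status
```

Running it once after the network is cut and once after it is restored shows whether the brick processes on that machine were stopped and started again.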

Now restore the network settings:

iptables -F

At this point all the killed servers are expected to be brought back up. Sometimes this does not happen.

The above steps have to be repeated a few times to notice the behavior, since it is intermittent.
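
Since several repetitions are needed, one cycle can be wrapped in a loop. The sketch below is meant for the server being isolated and only checks the local glusterfsd processes; the other machines still have to be checked separately. The DRY_RUN, SETTLE and CYCLES variables and the function names are hypothetical, and the wait time is a guess rather than a value from this report. By default it only prints the commands.

```shell
#!/bin/sh
# One reproduction cycle per loop iteration, run on the server being isolated.
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them (needs root).
DRY_RUN="${DRY_RUN:-1}"
SETTLE="${SETTLE:-30}"   # seconds to wait for quorum handling; a guess
CYCLES="${CYCLES:-3}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_cycle() {
    # Keep ssh reachable and drop everything else (rules from the description).
    run iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    run iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT
    run iptables -A INPUT -j DROP
    run iptables -A OUTPUT -j DROP
    run sleep "$SETTLE"
    run pgrep -x glusterfsd   # the report expects no brick processes here
    # Restore the network.
    run iptables -F
    run sleep "$SETTLE"
    run pgrep -x glusterfsd   # the report expects brick processes to be back
}

i=1
while [ "$i" -le "$CYCLES" ]; do
    echo "cycle $i"
    repro_cycle
    i=$((i + 1))
done
```

Note that iptables -F flushes every rule in the filter table, as in the original steps, so this is only suitable for a test machine with no other firewall rules to preserve.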

Comment 2 Sachidananda Urs 2013-01-22 07:15:31 UTC
Created attachment 684900 [details]
sosreport for the failed instance

Comment 3 Sachidananda Urs 2013-01-22 09:15:06 UTC
This was tested on update 4. The installed rpms are:


glusterfs-fuse-3.3.0.5rhs-40.el6rhs.x86_64
vdsm-gluster-4.9.6-17.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-account-1.4.8-4.el6.noarch
glusterfs-3.3.0.5rhs-40.el6rhs.x86_64
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-server-3.3.0.5rhs-40.el6rhs.x86_64
glusterfs-rdma-3.3.0.5rhs-40.el6rhs.x86_64
gluster-swift-object-1.4.8-4.el6.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-proxy-1.4.8-4.el6.noarch
glusterfs-geo-replication-3.3.0.5rhs-40.el6rhs.x86_64

Comment 5 Scott Haines 2013-02-06 20:06:46 UTC
Per Feb-06 bug triage meeting, targeting for 2.1.0.

Comment 7 Vivek Agarwal 2015-03-23 07:39:30 UTC
The product version of Red Hat Storage on which this issue was reported has reached End Of Life (EOL) [1], hence this bug report is being closed. If the issue is still observed on a current version of Red Hat Storage, please file a new bug report on the current version.

[1] https://rhn.redhat.com/errata/RHSA-2014-0821.html
