Bug 1048042 - gluster commands fail as glusterd doesn't detect network failure, when all outbound traffic to other glusterd's are dropped
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: krishnan parthasarathi
QA Contact: SATHEESARAN
URL:
Whiteboard: glusterd
Depends On:
Blocks:
Reported: 2014-01-03 01:29 UTC by SATHEESARAN
Modified: 2015-08-03 13:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-03 13:49:30 UTC
Embargoed:


Attachments
sosreports (11.19 MB, application/gzip)
2014-01-03 01:52 UTC, SATHEESARAN

Description SATHEESARAN 2014-01-03 01:29:13 UTC
Description of problem:
-----------------------
gluster CLI commands fail with 'Connection failed. Please check if gluster daemon is operational.' because glusterd does not detect the network failure when all outbound network traffic from that glusterd to all other glusterd instances is dropped.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.4.0.51geo.el6rhs [ hotfix for RHSS 2.1 Update1 ]

How reproducible:
-----------------
Every time

Steps to Reproduce:
-------------------
1. Create a trusted storage pool of 'N' RHSS nodes
2. On one of the RHSS nodes, drop all outbound glusterd traffic
3. Execute any gluster CLI command on that node

Actual results:
---------------
All gluster CLI commands fail with 'Connection failed. Please check if gluster daemon is operational.', even though glusterd is up and running.

Expected results:
-----------------
1. glusterd should detect the above scenario as a network disconnect, with the help of its heartbeat mechanism

2. gluster CLI commands should not fail with 'Connection failed. Please check if gluster daemon is operational.' while glusterd is up and running
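
As an illustration of the first expectation, here is a minimal sketch of a connect probe with a timeout. It is not glusterd's heartbeat implementation; the function names `probe_port` and `classify_status` and the 3-second default are assumptions made for this example. The point it shows is that a DROP rule makes a connection attempt hang until it times out, whereas a stopped daemon normally produces an immediate refusal, so the two cases can be told apart.

```shell
#!/bin/sh
# Hypothetical helpers, not part of glusterfs.

# Map an exit status from the probe below to a human-readable verdict.
#   0   -> TCP connection was established
#   124 -> timeout(1) killed the attempt (packets silently dropped)
#   *   -> connection refused or another error
classify_status() {
    case "$1" in
        0)   echo "reachable" ;;
        124) echo "timeout" ;;
        *)   echo "refused-or-error" ;;
    esac
}

# Try to open a TCP connection to host:port, giving up after N seconds.
# Uses bash's /dev/tcp pseudo-device and coreutils timeout(1).
probe_port() {
    probe_host="$1"
    probe_port_no="$2"
    probe_secs="${3:-3}"      # assumed default timeout in seconds
    timeout "$probe_secs" bash -c "exec 3<>/dev/tcp/$probe_host/$probe_port_no" 2>/dev/null
    classify_status "$?"
}
```

For example, `probe_port <RHSS-NODE> 24007` run from the node with the DROP rule in place would be expected to report `timeout` rather than `refused-or-error`.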

Additional info:
----------------

SETUP INFO
===========
1. RHSS 2.1 U1 ISO - RHSS-2.1-20131122.0-RHS-x86_64-DVD1.iso
2. Packages - 
   glusterfs-3.4.0.51geo.el6rhs - http://download.devel.redhat.com/brewroot/packages/glusterfs/3.4.0.51geo/1.el6rhs/x86_64/
3. Provisioning - 2 RHSS VMs through Beaker
4. A 2-node cluster was created
(i.e.) gluster peer probe <RHSS-NODE>
5. No volumes were created

STEPS
=====
1. Stop all outbound glusterd traffic from one of the RHSS nodes using iptables
(i.e.) iptables -I OUTPUT -p tcp --dport 24007 -j DROP

The above command drops all outgoing packets destined for port 24007 (the port on which every glusterd listens).

2. Execute any gluster CLI command from the same node
(e.g.) gluster volume status
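
The steps above can be sketched as a small script. This is only a sketch and not part of the original report: the function names, the `run` wrapper, and the `DRY_RUN` variable are hypothetical. With `DRY_RUN=1` (the default assumed here) the commands are printed instead of executed; running them for real requires root. The `iptables -D` call deletes the rule that `-I` inserted, so the node can be restored after the test.

```shell
#!/bin/sh
# Hypothetical reproduction helpers; names are illustrative only.
GLUSTERD_PORT=24007          # port glusterd listens on, as stated in the report
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print commands (assumed safe default)

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: drop all outbound TCP packets destined for the glusterd port.
block_glusterd_out() {
    run iptables -I OUTPUT -p tcp --dport "$GLUSTERD_PORT" -j DROP
}

# Cleanup: delete the rule added by block_glusterd_out.
unblock_glusterd_out() {
    run iptables -D OUTPUT -p tcp --dport "$GLUSTERD_PORT" -j DROP
}

# Step 2: run a gluster CLI command from the same node.
check_cli() {
    run gluster volume status
}
```

A typical sequence would be `block_glusterd_out`, then `check_cli`, then `unblock_glusterd_out` to remove the rule again.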

CONSOLE LOGS
=============
[Thu Jan  2 23:00:33 UTC 2014 root.37.7:~ ] # iptables -I OUTPUT -p tcp --dport 24007 -j DROP

[Thu Jan  2 23:00:40 UTC 2014 root.37.7:~ ] # iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DROP       tcp  --  anywhere             anywhere            tcp dpt:24007 

[Thu Jan  2 23:01:12 UTC 2014 root.37.7:~ ] # gluster peer status
Connection failed. Please check if gluster daemon is operational.

[Thu Jan  2 23:03:14 UTC 2014 root.37.7:~ ] # service glusterd status
glusterd (pid  6797) is running...

[Thu Jan  2 23:04:34 UTC 2014 root.37.7:~ ] # gluster volume status
Connection failed. Please check if gluster daemon is operational.

Comment 1 SATHEESARAN 2014-01-03 01:52:00 UTC
Created attachment 844788
sosreports

sosreports from 2 RHSS Nodes

Comment 2 Vivek Agarwal 2014-02-20 08:36:30 UTC
adding 3.0 flag and removing 2.1.z

Comment 5 SATHEESARAN 2015-08-03 13:49:30 UTC
This bug is no longer reproducible; closing it.

