Bug 836448 - CTDB: Noticeable delay in CTDB failover (ranging between 6-14 mins)
Summary: CTDB: Noticeable delay in CTDB failover (ranging between 6-14 mins)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: unclassified
Version: 3.3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact: amainkar
URL:
Whiteboard:
Depends On:
Blocks: 840655 840813
 
Reported: 2012-06-29 06:40 UTC by Rachana Patel
Modified: 2018-11-30 20:32 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 840655
Environment:
Last Closed: 2013-07-24 17:14:13 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Rachana Patel 2012-06-29 06:40:47 UTC
Description of problem:
CTDB: Noticeable delay in CTDB failover (ranging between 6-14 mins)

Version-Release number of selected component (if applicable):
CTDB version: 1.0.114.5-1.el6
glusterfs 3.3.0

How reproducible:
Often

Steps to Reproduce:
1. Configure automated IP fail-over for NFS and CIFS as per the RHS 2.0 Administration Guide (section "Configuring Automated IP Fail-over for NFS and CIFS").
2. Create volumes and mount them (NFS or CIFS) using the VIPs.
3. Create a few directories and files on the mounted volume.
4. Check the node status using 'ctdb status' and the IP mapping using 'ctdb ip'.
5. Power off any one storage server that is part of the CTDB failover configuration.
6. Check the CTDB status from a remaining storage server using 'ctdb status'.

  
Actual results:
When one server is brought down, a noticeable delay in CTDB failover is frequently observed, ranging between 6 and 14 minutes. For significantly larger delays (~40 mins, observed only once), CTDB's 'cluster membership' algorithm bans the node that fails to 'recover'.

Expected results:
The time taken for failover should not be on the order of minutes.

Additional info:
-- 'ls' on /gluster/lock takes time roughly proportional to the delay noticed in the failover.
-- It takes time to identify that the node/storage server is down.

Comment 1 krishnan parthasarathi 2012-07-02 04:57:23 UTC
This issue is caused by a bug in the ping timer logic. The purpose of the ping timer is to detect the absence of any evidence that the server is still alive.

The current implementation updates the 'last_sent' timer at the following
points in time:
- rpc_clnt_submit: when RPC messages are queued at the transport
layer. (Wrong: we have no way to determine whether the server actually
received the message.)
- rpc_clnt_notify: when the client receives a pollout event after sending a
message on the 'wire'. (Correct: it indicates an ACK from the server.)

The fix is to remove the incorrect update of 'last_sent'. It has already been pushed to the master branch: http://review.gluster.com/3625.
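
For illustration, here is a minimal C sketch of the timer bookkeeping described above. Only 'last_sent', rpc_clnt_submit and rpc_clnt_notify come from this comment; the struct layout and the other helper names are simplified stand-ins for the real rpc client code, not its actual implementation.

#include <stdbool.h>
#include <time.h>

struct conn_state {
    time_t last_sent;     /* last time we saw evidence the server received data */
    time_t ping_timeout;  /* seconds of silence before the peer is presumed dead */
};

/* Queueing a message at the transport layer says nothing about whether the
 * server received it, so 'last_sent' must NOT be refreshed here -- this is
 * the incorrect update the fix removes. */
static void rpc_clnt_submit(struct conn_state *conn)
{
    (void) conn;
    /* conn->last_sent = time(NULL);   <-- removed by http://review.gluster.com/3625 */
    /* ... queue the message on the transport ... */
}

/* A pollout event after sending on the 'wire' indicates an ACK from the
 * server, so this is the only place the timer is refreshed. */
static void rpc_clnt_notify_pollout(struct conn_state *conn)
{
    conn->last_sent = time(NULL);
}

/* Ping-timer check: with no evidence of the server for longer than
 * ping_timeout, the connection is declared dead and failover can proceed. */
static bool rpc_clnt_ping_expired(const struct conn_state *conn)
{
    return (time(NULL) - conn->last_sent) > conn->ping_timeout;
}

With this change, a message that is merely queued (but possibly never delivered) no longer keeps postponing the ping timeout, so a dead server is detected within the configured timeout rather than after several minutes.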

Comment 2 Amar Tumballi 2012-07-11 06:06:19 UTC
Patch accepted in master.

Comment 3 Ujjwala 2012-08-28 07:36:30 UTC
Verified on RHS 2.0.z update 2; the CTDB failover time now varies between 70 and 90 seconds.
Tested with Distribute, Replicate, and Distributed-Replicate volumes on both CIFS and NFS mounts.

# glusterfs -V
glusterfs 3.3.0rhs built on Aug 17 2012 07:06:58
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Comment 4 Ujjwala 2012-08-28 08:32:24 UTC
The bug was verified only on RHS 2.0.z and not on master, so moving it back to MODIFIED.

