235092 – clvmd operations timeout under heavy load == bad for recovery scenarios

Summary: clvmd operations timeout under heavy load == bad for recovery scenarios

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	lvm2-cluster
Sub Component:
Version:	4
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Milan Broz
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-04-03 18:51 UTC by Jonathan Earl Brassow
Modified:	2013-03-01 04:05 UTC
CC List:	7 users
Fixed In Version:	RHBA-2007-0046
Clone Of:
Environment:
Last Closed:	2007-05-10 21:10:01 UTC
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2007:0046	0	normal	SHIPPED_LIVE	lvm2-cluster bug fix update	2007-05-10 21:06:38 UTC

Description Jonathan Earl Brassow 2007-04-03 18:51:02 UTC

When a mirror device fails under heavy load, each CLVM command can take a very
long time (minutes) to process.  This can trigger clvmd timeouts and cause the
mirror fault-handling code to abort.

The root of the problem is that remote nodes must scan the devices when doing
activates/deactivates.  Those scans get queued up behind all the other I/O that
is happening and simply take a long time.

Either we need to find a completely different way to detect stalled machines, or
we need to raise the clvmd timeout (e.g. clvmd -t 100).  I'm advocating the
latter for now and the former when we have more time to investigate.

Comment 1 Jonathan Earl Brassow 2007-04-03 18:52:12 UTC

To be clear about my request:

Let's increase the clvmd timeout.

Comment 2 Jonathan Earl Brassow 2007-04-16 20:49:05 UTC

Index: LVM2/scripts/clvmd_init_rhel4
===================================================================
--- LVM2.orig/scripts/clvmd_init_rhel4
+++ LVM2/scripts/clvmd_init_rhel4
@@ -15,7 +15,7 @@ VGCHANGE="/usr/sbin/vgchange"
 VGSCAN="/usr/sbin/vgscan"
 VGDISPLAY="/usr/sbin/vgdisplay"
 VGS="/usr/sbin/vgs"
-CLVMDOPTS="-T20"
+CLVMDOPTS="-T20 -t 90"

 [ -f /etc/sysconfig/cluster ] && . /etc/sysconfig/cluster
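
In the hunk above, the default CLVMDOPTS is assigned before
/etc/sysconfig/cluster is sourced, so a site could presumably override the
timeout from that file. A minimal sketch of that ordering follows. It assumes
the rest of the init script (not shown in the patch) passes $CLVMDOPTS to clvmd
unchanged. SYSCONFIG is a hypothetical temporary file standing in for
/etc/sysconfig/cluster, and "-t 100" is the example value from the description.

```shell
#!/bin/sh
# Sketch only: mimics the default-then-override order of the patched script.
# SYSCONFIG is a hypothetical stand-in for /etc/sysconfig/cluster.
SYSCONFIG="$(mktemp)"
echo 'CLVMDOPTS="-T20 -t 100"' > "$SYSCONFIG"

# Default from the patch.
CLVMDOPTS="-T20 -t 90"

# Site override, if the file exists (same test-and-source idiom as the script).
[ -f "$SYSCONFIG" ] && . "$SYSCONFIG"

echo "clvmd would be started with: $CLVMDOPTS"
rm -f "$SYSCONFIG"
```

If the override file is absent, CLVMDOPTS keeps the patched default of
"-T20 -t 90".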

Comment 8 Red Hat Bugzilla 2007-05-10 21:10:01 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0046.html
