163184 – Explain why the SCSI inquiry is not being returned from the sd for nearly 5 minutes

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Tom Coughlan
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	168424

Reported:	2005-07-13 20:07 UTC by Issue Tracker
Modified:	2007-11-30 22:07 UTC
CC List:	2 users
Fixed In Version:	RHSA-2006-0144
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2006-03-15 16:14:12 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2006:0144	0	qe-ready	SHIPPED_LIVE	Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 7	2006-03-15 05:00:00 UTC

Description Issue Tracker 2005-07-13 20:07:35 UTC

Escalated to Bugzilla from IssueTracker

Comment 30 Ernie Petrides 2005-09-30 06:52:30 UTC

A fix for this problem was committed to the RHEL3 U7 patch pool
this evening (in kernel version 2.4.21-37.4.EL).

Comment 31 Tom Coughlan 2005-10-03 16:43:04 UTC

Please note that the default behavior of the patched kernel is unchanged:
kernel.printk_ratelimit defaults to 0, so printk rate limiting is disabled.

To avoid the printk-delay problem, the customer must enable printk rate
limiting by setting:

sysctl -w kernel.printk_ratelimit=5

To make the setting persist across reboots, the customer may want to add it to
/etc/sysctl.conf.

Tom
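
As a sketch of the persistent form of the setting in the comment above, the
fragment below could be added to /etc/sysctl.conf. The value 5 is the one
suggested in that comment; the comment lines in the fragment are illustrative.
Running "sysctl -p" as root should then load the file without a reboot.

```
# /etc/sysctl.conf fragment (illustrative)
# Enable printk rate limiting: at least 5 seconds between rate-limited messages.
kernel.printk_ratelimit = 5
```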

Comment 32 Tom Coughlan 2005-10-28 13:48:55 UTC

This comment is for IT 71441. 

When an I/O is issued on a broken path, it will eventually time out and produce a
fatal I/O error, unless there is some sort of transparent failover to another
path. Is this system set up for transparent multipath failover, for example from
Veritas? If so, there is not much we can do to help debug it. It is possible that
the multipath software is producing log messages that also need to be rate limited.

One other possibility is that the printks that we rate limited need to be
limited even further, to avoid disrupting whatever multipath solution they are
trying to use.

There are two parameters:

printk_ratelimit: the minimum length of time in seconds between messages that
have been designated as rate limited. Default is 0 on RHEL 3. 

printk_ratelimit_burst: the number of rate-limited messages that will be allowed
to print before rate limiting kicks in. Default is 10. 

They could try something like printk_ratelimit=10 and printk_ratelimit_burst=2,
to cut down even more on any system delays while the cable is pulled.
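
To illustrate how the two parameters described above interact, here is a
minimal sketch of a token-bucket rate limiter in shell. It is an assumption
that the RHEL 3 patch behaves this way: the model is loosely based on the
upstream 2.6 printk rate limiter, it counts time in whole seconds rather than
jiffies, and the function name simulate and its arguments are hypothetical.

```shell
#!/bin/sh
# Simplified model of a token-bucket printk rate limiter.
# ASSUMPTION: loosely modeled on the upstream 2.6 limiter; the RHEL 3
# backport may differ in detail. Time is in whole seconds, not jiffies.
#
# simulate INTERVAL BURST T1 T2 ...
#   INTERVAL  printk_ratelimit value (seconds)
#   BURST     printk_ratelimit_burst value (messages)
#   T1 T2 ... non-decreasing message arrival times (seconds)
# Prints how many messages were printed and how many were suppressed.
simulate() {
    interval=$1
    burst=$2
    shift 2
    cap=$((interval * burst))   # bucket capacity
    toks=$cap                   # start with a full bucket
    last=0
    printed=0
    dropped=0
    for now in "$@"; do
        toks=$((toks + now - last))     # refill: one token per elapsed second
        last=$now
        if [ "$toks" -gt "$cap" ]; then
            toks=$cap                   # never hold more than a full burst
        fi
        if [ "$toks" -ge "$interval" ]; then
            toks=$((toks - interval))   # spend one message's worth of tokens
            printed=$((printed + 1))
        else
            dropped=$((dropped + 1))    # message is suppressed
        fi
    done
    echo "printed=$printed dropped=$dropped"
}
```

For example, `simulate 10 2 $(seq 1 30)` models 30 messages arriving one per
second with the suggested settings and reports `printed=4 dropped=26`. With
the default interval of 0, as in `simulate 0 10 1 2 3`, every message is
printed.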

Comment 38 Red Hat Bugzilla 2006-03-15 16:14:13 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html
