Escalated to Bugzilla from IssueTracker
A fix for this problem has just been committed to the RHEL3 U7 patch pool this evening (in kernel version 2.4.21-37.4.EL).
Please note that the default behavior of the patched kernel is unchanged (that is, kernel.printk_ratelimit=0). To avoid the printk-delay problem, the customer must enable printk rate limiting by setting:

    sysctl -w kernel.printk_ratelimit=5

The customer may want to consider using /etc/sysctl.conf so that the setting persists across reboots.

Tom
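For the /etc/sysctl.conf route mentioned above, something along these lines might work. This is only a sketch: set_sysctl_entry is a hypothetical helper name (not part of any Red Hat tool), and it assumes the usual "key = value" line format of sysctl.conf. It replaces an existing entry for the key or appends a new one.

```shell
#!/bin/sh
# Hypothetical helper: set "KEY = VALUE" in a sysctl-style config file,
# replacing any existing entry for KEY or appending a new one.
# Note: dots in KEY are treated as regex wildcards by grep, which is
# harmless for typical sysctl names.
set_sysctl_entry() {
    conf_file=$1
    key=$2
    value=$3
    tmp_file=$(mktemp) || return 1
    # Copy every line except an existing assignment for this key.
    if [ -f "$conf_file" ]; then
        grep -v "^[[:space:]]*${key}[[:space:]]*=" "$conf_file" > "$tmp_file"
    fi
    # Append the new setting, then write back in place so the original
    # file keeps its ownership and permissions.
    printf '%s = %s\n' "$key" "$value" >> "$tmp_file"
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Example (as root), followed by "sysctl -p" to load the file:
# set_sysctl_entry /etc/sysctl.conf kernel.printk_ratelimit 5
```

Running "sysctl -p" afterwards applies the file without a reboot; the "sysctl -w" command above is still the quickest way to test a value first.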
This comment is for IT 71441.

When an I/O is issued on a broken path, it will eventually time out and produce a fatal I/O error unless there is some sort of transparent failover to another path. Is this system set up for transparent multipath failover, perhaps from Veritas? If so, there is not much we can do to help debug it. It is possible that the multipath software is producing log messages that also need to be rate limited.

One other possibility is that the printk calls that we rate limited need to be limited even more, to avoid interfering with whatever multipath solution they are trying to use. There are two parameters:

printk_ratelimit: the minimum length of time, in seconds, between messages that have been designated as rate limited. Default is 0 on RHEL 3.

printk_ratelimit_burst: the number of rate-limited messages that will be allowed to print before rate limiting kicks in. Default is 10.

They could try something like printk_ratelimit=10 and printk_ratelimit_burst=2 to cut down even more on any system delays while the cable is pulled.
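To get a feel for how the two parameters interact, here is a rough simulation. It is a sketch only: count_allowed is a made-up name, and the token-bucket logic is modeled on the mainline 2.6 printk rate limiter, on the assumption that the RHEL 3 backport behaves similarly. It counts how many rate-limited messages would get through when a fixed number arrive every second.

```shell
#!/bin/sh
# Sketch of a token-bucket rate limiter, modeled on the mainline 2.6
# printk rate limiter (ASSUMPTION: the RHEL 3 backport behaves similarly).
# Usage: count_allowed INTERVAL BURST SECONDS PER_SECOND
# Prints how many messages get through when PER_SECOND messages arrive
# at the start of each second for SECONDS seconds.
count_allowed() {
    interval=$1
    burst=$2
    seconds=$3
    per_second=$4
    cap=$((interval * burst))   # bucket capacity, in seconds of credit
    toks=$cap                   # the bucket starts full
    allowed=0
    t=0
    while [ "$t" -lt "$seconds" ]; do
        # One second has elapsed since the previous batch: add one token,
        # never exceeding the bucket capacity.
        if [ "$t" -gt 0 ]; then
            toks=$((toks + 1))
            if [ "$toks" -gt "$cap" ]; then
                toks=$cap
            fi
        fi
        n=0
        while [ "$n" -lt "$per_second" ]; do
            # Each printed message costs INTERVAL seconds of credit.
            if [ "$toks" -ge "$interval" ]; then
                allowed=$((allowed + 1))
                toks=$((toks - interval))
            fi
            n=$((n + 1))
        done
        t=$((t + 1))
    done
    echo "$allowed"
}

# Compare the two settings for 100 messages/second over 60 seconds:
# count_allowed 5 10 60 100
# count_allowed 10 2 60 100
```

Under this model, 100 messages per second for a minute lets 21 messages through with printk_ratelimit=5 and printk_ratelimit_burst=10, but only 7 with printk_ratelimit=10 and printk_ratelimit_burst=2. An interval of 0 lets everything through, which matches the unchanged default behavior.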
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0144.html