Bug 463416 - RHEL 5.3: fix scsi regression causing udev to hang loading sr_mod
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: ---
Assigned To: Mike Christie
QA Contact: Martin Jenner
Keywords: TestBlocker
Depends On:
Blocks:
Reported: 2008-09-23 05:56 EDT by Mark McLoughlin
Modified: 2009-01-20 15:16 EST
Doc Type: Bug Fix
Last Closed: 2009-01-20 15:16:30 EST

Attachments
'lspci -v' output (7.48 KB, text/plain)
2008-09-23 05:57 EDT, Mark McLoughlin
2.6.18.4-116.el5-udev-hang.long (501.57 KB, text/plain)
2008-09-23 05:57 EDT, Mark McLoughlin
scsi-fix-regression-introduced-by-typo-in-failfast.patch (1.26 KB, patch)
2008-09-23 06:00 EDT, Mark McLoughlin
scsi-fix-regression-introduced-by-typo-in-failfast.patch (1.33 KB, patch)
2008-09-23 06:03 EDT, Mark McLoughlin

Description Mark McLoughlin 2008-09-23 05:56:09 EDT
With 2.6.18.4-116, udev hangs when loading sr_mod on my machine.

I bisected the issue down to:

http://git.engineering.redhat.com/?p=users/dzickus/rhel5/kernel;a=commit;h=325d5462da6613a1353fa8cbc4603e8f056e67b1

  commit 325d5462da6613a1353fa8cbc4603e8f056e67b1
  [scsi] modify failfast so it does not always fail fast

Attaching 'lspci -v' output and a log from booting with 'udevdebug'.

The log shows that 'modprobe sr_mod' is the first command to hang; also, at the end it shows:

  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s

The issue seems to be caused by a fairly simple typo; attaching a patch below.
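
For anyone checking their own boot log for the same symptom, here is a minimal sketch that tallies the "timing out command" messages quoted above per device. The regular expression is inferred only from the four log lines in this report, and the function name `count_timeouts` is made up for illustration; other kernels may word the message differently.

```python
import re
from collections import Counter

# Pattern inferred from the lines quoted in this report, e.g.:
#   sr 0:0:1:0: timing out command, waited 120s
# Other kernel versions may format the message differently (assumption).
TIMEOUT_RE = re.compile(
    r"(?P<driver>\w+) (?P<hctl>\d+:\d+:\d+:\d+): "
    r"timing out command, waited (?P<secs>\d+)s"
)


def count_timeouts(lines):
    """Return a Counter mapping (driver, host:channel:target:lun) to hit count."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("driver"), match.group("hctl"))] += 1
    return counts


# The excerpt from the description, plus one unrelated line that is ignored.
sample = [
    "  sr 0:0:1:0: timing out command, waited 120s",
    "  sr 0:0:1:0: timing out command, waited 120s",
    "  sr 0:0:1:0: timing out command, waited 120s",
    "  sr 0:0:1:0: timing out command, waited 120s",
    "  some unrelated udev debug line",
]

for (driver, hctl), hits in count_timeouts(sample).items():
    print(f"{driver} {hctl}: {hits} timed-out command(s)")
```

To scan a real log, pass an open file object instead of `sample`; the function only needs an iterable of lines.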
Comment 1 Mark McLoughlin 2008-09-23 05:57:02 EDT
Created attachment 317454
'lspci -v' output
Comment 2 Mark McLoughlin 2008-09-23 05:57:41 EDT
Created attachment 317456
2.6.18.4-116.el5-udev-hang.long
Comment 3 Mark McLoughlin 2008-09-23 06:00:36 EDT
Created attachment 317458
scsi-fix-regression-introduced-by-typo-in-failfast.patch
Comment 4 Mark McLoughlin 2008-09-23 06:03:21 EDT
Created attachment 317459
scsi-fix-regression-introduced-by-typo-in-failfast.patch

Wow - emacs really screwed that up ...
Comment 7 Jeff Moyer 2008-09-24 15:18:28 EDT
This is keeping our performance team from running their standard battery of tests.
Comment 9 Jay Turner 2008-09-30 07:54:21 EDT
Looks like this patch actually made it into -117.el5 and I'm no longer seeing the hang.  Will continue to test.
Comment 10 Don Zickus 2008-09-30 12:01:58 EDT
in kernel-2.6.18-117.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 13 errata-xmlrpc 2009-01-20 15:16:30 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
