
Bug 463416

Summary: RHEL 5.3: fix scsi regression causing udev to hang loading sr_mod
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Target Milestone: beta
Keywords: TestBlocker
Reporter: Mark McLoughlin <markmc>
Assignee: Mike Christie <mchristi>
QA Contact: Martin Jenner <mjenner>
CC: dzickus, ehabkost, jmoyer, jturner, lwang, syeghiay
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:16:30 UTC
Attachments:
  Description                                                Flags
  'lspci -v' output                                          none
  2.6.18.4-116.el5-udev-hang.long                            none
  scsi-fix-regression-introduced-by-typo-in-failfast.patch   none
  scsi-fix-regression-introduced-by-typo-in-failfast.patch   none

Description Mark McLoughlin 2008-09-23 09:56:09 UTC
With 2.6.18.4-116, udev hangs when loading sr_mod on my machine.

I bisected the issue down to:

http://git.engineering.redhat.com/?p=users/dzickus/rhel5/kernel;a=commit;h=325d5462da6613a1353fa8cbc4603e8f056e67b1

  commit 325d5462da6613a1353fa8cbc4603e8f056e67b1
  [scsi] modify failfast so it does not always fail fast

Attaching 'lspci -v' output and a log from booting with 'udevdebug'

The log shows that 'modprobe sr_mod' is the first command to hang; also, at the end it shows:

  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
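
A log like the attached one can be checked for these timeouts mechanically. The following is a minimal sketch, not part of the original report: the function name `count_timeouts`, the regular expression, and the `SAMPLE_LOG` text are illustrative assumptions modelled only on the four lines quoted above.

```python
import re
from collections import Counter

# Assumed message shape, based only on the lines quoted in this report:
#   "sr 0:0:1:0: timing out command, waited 120s"
TIMEOUT_RE = re.compile(
    r"(?P<driver>\w+) (?P<addr>\d+:\d+:\d+:\d+): "
    r"timing out command, waited (?P<secs>\d+)s"
)


def count_timeouts(log_text):
    """Count timeout messages per (driver, SCSI address, seconds waited)."""
    hits = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            key = (match.group("driver"), match.group("addr"),
                   int(match.group("secs")))
            hits[key] += 1
    return hits


# Hypothetical sample: the four lines from the end of the boot log.
SAMPLE_LOG = """\
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
  sr 0:0:1:0: timing out command, waited 120s
"""

if __name__ == "__main__":
    for (driver, addr, secs), n in count_timeouts(SAMPLE_LOG).items():
        print(f"{driver} {addr}: {n} timeout(s) of {secs}s")
```

Repeated timeouts against the same device address, as here, point at one device (the sr device at 0:0:1:0) rather than a bus-wide problem.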

The issue seems to be caused by a fairly simple typo; attaching a patch below

Comment 1 Mark McLoughlin 2008-09-23 09:57:02 UTC
Created attachment 317454
'lspci -v' output

Comment 2 Mark McLoughlin 2008-09-23 09:57:41 UTC
Created attachment 317456
2.6.18.4-116.el5-udev-hang.long

Comment 3 Mark McLoughlin 2008-09-23 10:00:36 UTC
Created attachment 317458
scsi-fix-regression-introduced-by-typo-in-failfast.patch

Comment 4 Mark McLoughlin 2008-09-23 10:03:21 UTC
Created attachment 317459
scsi-fix-regression-introduced-by-typo-in-failfast.patch

Wow - emacs really screwed that up ...

Comment 7 Jeff Moyer 2008-09-24 19:18:28 UTC
This is keeping our performance team from running their standard battery of tests.

Comment 9 Jay Turner 2008-09-30 11:54:21 UTC
Looks like this patch actually made it into -117.el5 and I'm no longer seeing the hang.  Will continue to test.

Comment 10 Don Zickus 2008-09-30 16:01:58 UTC
The fix is in kernel-2.6.18-117.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 13 errata-xmlrpc 2009-01-20 20:16:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html