Bug 119334 - Excessive recursion in SCSI layer when device offlined
Summary: Excessive recursion in SCSI layer when device offlined
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-03-29 17:09 UTC by Justin T. Gibbs
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-03-31 15:48:12 UTC
Target Upstream Version:
Embargoed:


Attachments
2.4.X kernel.org patch that fixes this issue (1.55 KB, patch)
2004-03-29 17:10 UTC, Justin T. Gibbs

Description Justin T. Gibbs 2004-03-29 17:09:05 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030618

Description of problem:

The SCSI mid-layer blows out the kernel stack due to the
inherent recursion in its request function.  The repeating
stack trace goes something like:

scsi_request_fn
__scsi_end_request
scsi_release_command
scsi_queue_next_request
scsi_request_fn
...
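
For illustration only, the cycle in the trace above can be modeled in user
space. This is a toy sketch, not the kernel code: every toy_* name and the
counters are hypothetical, and it assumes an offlined device fails each
request immediately inside the request function. Under that assumption the
nesting depth grows by one level per queued request.

```c
/* Toy user-space model of the repeating trace; NOT the kernel functions. */
static int toy_pending;    /* requests still queued to the device */
static int toy_depth;      /* current nesting of toy_request_fn */
static int toy_max_depth;  /* deepest nesting observed */

static void toy_request_fn(void);

/* Each step mirrors one frame of the trace and simply calls the next. */
static void toy_queue_next_request(void) { toy_request_fn(); }
static void toy_release_command(void)    { toy_queue_next_request(); }
static void toy_end_request(void)        { toy_release_command(); }

static void toy_request_fn(void)
{
    toy_depth++;
    if (toy_depth > toy_max_depth)
        toy_max_depth = toy_depth;

    if (toy_pending > 0) {
        toy_pending--;
        /* Assumed behavior: the device is offline, so the request is
         * completed with an error right here, on the same stack. */
        toy_end_request();
    }
    toy_depth--;
}

/* Returns the deepest nesting reached while draining `queued` requests. */
int model_offline_depth(int queued)
{
    toy_pending = queued;
    toy_depth = 0;
    toy_max_depth = 0;
    toy_request_fn();
    return toy_max_depth;
}
```

In this model, model_offline_depth(n) returns n + 1, so the stack cost is
proportional to the amount of queued I/O. With a small, fixed-size kernel
stack, a long enough queue would overflow it.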

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
1. Queue up lots of I/O to a device.
2. Set that device offline.  (Pull a drive, etc.)
    

Actual Results:  System crashes

Expected Results:  System should recover from the failure.  In our
particular case, we were testing (E)MD failure recovery.

Additional info:

Comment 1 Justin T. Gibbs 2004-03-29 17:10:50 UTC
Created attachment 98938
2.4.X kernel.org patch that fixes this issue

Here is a fix I proposed on linux-scsi.  It applies to RHEL3 with only
one, easily resolved, reject.

Comment 2 Justin T. Gibbs 2004-03-29 17:38:07 UTC
Seems "upstream" 2.4.X already handles this issue by using
__scsi_release_command inside __scsi_end_request to avoid
the recursion.
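
As a rough user-space sketch of that idea (not the upstream code): if the
completion path releases the command without restarting the queue, the
request function never re-enters itself. All sketch_* names below are
hypothetical. The loop that picks up the next request is an assumption made
for the illustration, since this report does not say how the real code
restarts the queue.

```c
/* Toy user-space sketch of a non-recursive completion path. */
static int sketch_pending;    /* requests still queued to the device */
static int sketch_depth;      /* current nesting of sketch_request_fn */
static int sketch_max_depth;  /* deepest nesting observed */

/* Stand-in for a release that frees the command but does NOT kick the
 * queue again (the role the report attributes to the __ variant). */
static void sketch_release_no_requeue(void)
{
    /* nothing to free in this toy model */
}

static void sketch_end_request(void)
{
    sketch_release_no_requeue();
}

static void sketch_request_fn(void)
{
    sketch_depth++;
    if (sketch_depth > sketch_max_depth)
        sketch_max_depth = sketch_depth;

    /* Assumption: the request function itself iterates over the queue,
     * so failed requests are drained in a loop, not by re-entry. */
    while (sketch_pending > 0) {
        sketch_pending--;
        sketch_end_request();  /* device offline: fail immediately */
    }
    sketch_depth--;
}

/* Returns the deepest nesting reached while draining `queued` requests. */
int sketch_flat_depth(int queued)
{
    sketch_pending = queued;
    sketch_depth = 0;
    sketch_max_depth = 0;
    sketch_request_fn();
    return sketch_max_depth;
}
```

Here sketch_flat_depth(n) returns 1 for any n, so stack use no longer
depends on how much I/O was queued when the device went offline.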

Comment 3 Doug Ledford 2004-03-30 20:52:20 UTC
What kernel version are you testing against?  This should be fixed
already.  In the tree I'm looking at, __scsi_end_request already
uses __scsi_release_command.

Comment 4 Justin T. Gibbs 2004-03-30 21:39:41 UTC
I was looking at 2.4.21-EL.4.  Looking at the QU2 beta, it appears
fixed there.

Comment 5 Doug Ledford 2004-03-31 15:48:12 UTC
That's what I thought.  Closing bug as ERRATA.

Comment 6 Ernie Petrides 2005-10-06 03:13:31 UTC
This problem was fixed in RHEL3 U1.

