Red Hat Bugzilla – Bug 119334
Excessive recursion in SCSI layer when device offlined
Last modified: 2007-11-30 17:07:01 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1)
Description of problem:
The SCSI mid-layer blows out the kernel stack due to the
inherent recursion in its request function. The repeating
stack trace goes something like:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Queue up lots of I/O to a device.
2. Set that device offline. (Pull a drive, etc.)
Actual Results: System crashes
Expected Results: System should recover from the failure. In our
case, we were testing (E)MD failure recovery.
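To illustrate the failure mode described above, here is a minimal user-space sketch in C. It is not the kernel code and not the attached patch: every name in it (toy_queue, toy_request_fn_recursive, and so on) is hypothetical. It assumes only what the report states, that the request function re-enters itself while failing queued requests for an offline device, and it counts nesting depth instead of consuming real kernel stack. The iterative variant shows one general way to keep depth constant; whether the actual fix works this way is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a request queue on a device that may be offline.
 * All names are hypothetical; nothing here is a real kernel API. */
struct toy_queue {
    size_t pending;    /* requests still queued */
    int    offline;    /* nonzero once the device has been set offline */
    size_t depth;      /* current nesting depth of the request function */
    size_t max_depth;  /* deepest nesting observed so far */
};

static void toy_request_fn_recursive(struct toy_queue *q);

/* Completing (failing) one request re-runs the queue from inside the
 * completion path. This is the step that nests the calls. */
static void toy_end_request_recursive(struct toy_queue *q)
{
    assert(q->pending > 0);
    q->pending--;
    toy_request_fn_recursive(q);
}

/* Recursive shape: each failed request adds one level of nesting, so
 * depth grows with the number of queued requests. */
static void toy_request_fn_recursive(struct toy_queue *q)
{
    q->depth++;
    if (q->depth > q->max_depth)
        q->max_depth = q->depth;
    if (q->pending > 0 && q->offline)
        toy_end_request_recursive(q);
    q->depth--;
}

/* Iterative shape: fail every queued request in a loop, so depth stays
 * at one level regardless of queue length. */
static void toy_request_fn_iterative(struct toy_queue *q)
{
    q->depth++;
    if (q->depth > q->max_depth)
        q->max_depth = q->depth;
    while (q->pending > 0 && q->offline)
        q->pending--;
    q->depth--;
}
```

With 100 requests queued and the device offline, the recursive variant reaches a nesting depth that scales with the queue length, while the iterative variant stays at depth 1. In a kernel with a small fixed-size stack, the first shape is the one that can overflow.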
Created attachment 98938 [details]
2.4.X kernel.org patch that fixes this issue
Here is a fix I proposed on linux-scsi. It applies to RHEL3 with only
one, easily resolved, reject.
Seems "upstream" 2.4.X already handles this issue by using
__scsi_release_command inside __scsi_end_request to avoid
What kernel version are you testing against? This should be fixed
already. In the tree I'm looking at, __scsi_release_command already
I was looking at 2.4.21-EL.4. Looking at the QU2 beta, it appears
That's what I thought. Closing bug as ERRATA.
This problem was fixed in RHEL3 U1.