From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030618

Description of problem:
The SCSI mid-layer blows out the kernel stack due to the inherent recursion in its request function. The repeating stack trace goes something like:

scsi_request_fn
__scsi_end_request
scsi_release_command
scsi_queue_next_request
scsi_request_fn
...

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
1. Queue up lots of I/O to a device.
2. Set that device offline. (Pull a drive, etc.)

Actual Results:
System crashes

Expected Results:
System should recover from the failure. In our particular case, we were testing (E)MD failure recovery.

Additional info:
Created attachment 98938
2.4.X kernel.org patch that fixes this issue

Here is a fix I proposed on linux-scsi. It applies to RHEL3 with only one, easily resolved, reject.
Seems "upstream" 2.4.X already handles this issue by using __scsi_release_command inside __scsi_end_request to avoid the recursion.
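To illustrate why that change matters, here is a minimal user-space sketch in C. It is not the kernel code: every name prefixed with toy_ is hypothetical, and it assumes each queued command fails immediately because the device is offline. The first variant models a completion path that re-enters the request function, as in the reported trace; the second models a release step that does not requeue, which is roughly what using __scsi_release_command is described as achieving.

```c
#include <assert.h>

/* Toy model of a request queue whose device is offline.
 * Hypothetical structure, not a kernel type. */
struct toy_queue {
    int pending;    /* commands still queued */
    int depth;      /* current call nesting */
    int max_depth;  /* deepest nesting observed so far */
};

/* Record entry into a modeled kernel function. */
static void toy_enter(struct toy_queue *q)
{
    q->depth++;
    if (q->depth > q->max_depth)
        q->max_depth = q->depth;
}

/* Record return from a modeled kernel function. */
static void toy_leave(struct toy_queue *q)
{
    q->depth--;
    assert(q->depth >= 0);
}

/* Recursive variant: completing a failed command re-enters the
 * request function, so each pending command adds one level of nesting. */
static void toy_request_fn_recursive(struct toy_queue *q)
{
    toy_enter(q);
    while (q->pending > 0) {
        q->pending--;                 /* command fails at once */
        toy_request_fn_recursive(q);  /* release path requeues */
    }
    toy_leave(q);
}

/* Release step that frees the command without restarting the queue. */
static void toy_release_only(struct toy_queue *q)
{
    toy_enter(q);
    toy_leave(q);
}

/* Flat variant: the request function's own loop picks up the next
 * command, so nesting stays constant regardless of queue length. */
static void toy_request_fn_flat(struct toy_queue *q)
{
    toy_enter(q);
    while (q->pending > 0) {
        q->pending--;
        toy_release_only(q);
    }
    toy_leave(q);
}

/* Deepest nesting reached when draining n failed commands recursively. */
int toy_max_depth_recursive(int n)
{
    struct toy_queue q = { n, 0, 0 };
    toy_request_fn_recursive(&q);
    return q.max_depth;
}

/* Deepest nesting reached when draining n failed commands with the flat loop. */
int toy_max_depth_flat(int n)
{
    struct toy_queue q = { n, 0, 0 };
    toy_request_fn_flat(&q);
    return q.max_depth;
}
```

In this model, toy_max_depth_recursive(1000) reaches a nesting depth of 1001, while toy_max_depth_flat(1000) stays at 2. On a small, fixed-size kernel stack, only the second behavior is safe when a lot of I/O is queued.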
Which kernel version are you testing against? This should be fixed already. In the tree I'm looking at, __scsi_end_request already uses __scsi_release_command.
I was looking at 2.4.21-EL.4. Looking at the QU2 beta, it appears fixed there.
That's what I thought. Closing bug as ERRATA.
This problem was fixed in RHEL3 U1.