Red Hat Bugzilla – Bug 75669
SG queue function getting null pointer
Last modified: 2007-11-30 17:06:52 EST
Description of problem:
Sometimes memory from the kernel's dynamic heap is freed, reallocated, and
overwritten between the time an sg interface I/O is queued and the time the
queuing function completes. When that happens, the queuing function gets a
null pointer, which causes the system to panic.
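The hazard described above can be sketched in user space. This is only an illustration under stated assumptions, not the sg.c code: the types struct queue, struct device and struct request, and the functions submit(), unplug(), queue_unsafe() and queue_safe() are hypothetical stand-ins, and recycling of the request is modelled by simply clearing its device pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct queue   { int unplugged; };
struct device  { struct queue request_queue; };
struct request { struct device *dev; };

/* Models handing the request to the mid-layer. By the time this returns,
 * the command may already have completed and the request been reused;
 * here that reuse is modelled by clearing the device pointer. */
static void submit(struct request *req)
{
    req->dev = NULL;
}

static void unplug(struct queue *q)
{
    q->unplugged = 1;
}

/* Unsafe ordering: reads req->dev after the handoff. Returns -1 at the
 * point where kernel code would have dereferenced a null pointer. */
int queue_unsafe(struct request *req)
{
    submit(req);
    if (req->dev == NULL)
        return -1;
    unplug(&req->dev->request_queue);
    return 0;
}

/* Safe ordering: cache the queue pointer before the handoff, then use
 * only the cached copy afterwards. */
int queue_safe(struct request *req)
{
    struct queue *q;

    assert(req->dev != NULL);      /* precondition: request is still ours */
    q = &req->dev->request_queue;  /* cached before submit() */
    submit(req);
    unplug(q);                     /* never touches req again */
    return 0;
}
```

With a request pointing at a valid device, queue_unsafe() reports the failure and leaves the queue untouched, while queue_safe() unplugs the queue even though the request was cleared during the handoff.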
Version-Release number of selected component (if applicable):
Steps to Reproduce:
When working with our multi-pathing product, allowing all I/O types to use
passive paths automatically increased the number of SG interface I/Os
initiated. This increase caused path failures (reads/writes to NOT_READY
devices) and ultimately panics.
Expected Results: A proposed patch to fix the problem is included below. This
was taken from the vanilla kernel.org v2.4.17 kernel.
--- sg.c Fri May 3 16:06:49 2002
+++ sg.c.FIXED Mon Oct 7 16:17:08 2002
@@ -645,6 +645,7 @@
Scsi_Request * SRpnt;
Sg_device * sdp = sfp->parentdp;
sg_io_hdr_t * hp = &srp->header;
+ request_queue_t * q;
srp->data.cmd_opcode = cmnd; /* hold opcode of command */
hp->status = 0;
@@ -680,6 +681,7 @@
srp->my_cmdp = SRpnt;
+ q = &SRpnt->sr_device->request_queue;
SRpnt->sr_request.rq_dev = sdp->i_rdev;
SRpnt->sr_request.rq_status = RQ_ACTIVE;
SRpnt->sr_sense_buffer = 0;
@@ -715,7 +717,8 @@
(void *)SRpnt->sr_buffer, hp->dxfer_len,
sg_cmd_done_bh, timeout, SG_DEFAULT_RETRIES);
/* dxfer_len overwrites SRpnt->sr_bufflen, hence need for b_malloc_len */
Is there any update as to whether this fix will be included in an upcoming
Will this fix be included in the e.11 errata?
Is there still time to get this included into the e.11 errata?
This is not fixed in 2.4.9-e.18, thus will not make Q2 quarterly update.
This bug has gone from being a rarely tripped curiosity to being a showstopper
for us in -e.16. We cannot now get our test harness to run without tripping it.
I think the prominence has risen because of some of the threading/scheduling
changes making it much more likely that the SRpnt will have been re-used before
the scsi_do_req returns.
The fix listed in this MR is obviously correct and was supplied by Doug Gilbert
to fix this very problem. Could you please just apply it?
The fix is checked in. It is planned to ship in our next errata.
*** Bug 103685 has been marked as a duplicate of this bug. ***
An errata has been issued which should help the problem described in this bug report.
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen
this bug report if the solution does not work for you.