Bug 75669 - SG queue function getting null pointer
Summary: SG queue function getting null pointer
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i686 Linux
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact: Brian Brock
Duplicates: 103685
Depends On:
Blocks: 87937
Reported: 2002-10-10 21:44 UTC by Heather Conway
Modified: 2007-11-30 22:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2003-12-19 19:25:56 UTC

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2003:408
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages address security vulnerabilities, bugfixes
Last Updated: 2003-12-19 05:00:00 UTC

Description Heather Conway 2002-10-10 21:44:06 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)

Description of problem:
Sometimes memory from the kernel's dynamic heap is freed, re-allocated, and
overwritten between the time an sg interface I/O is queued and the time the
queuing function completes. When this happens, the queuing function
dereferences a null pointer, which causes the system to panic.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
When working with our multi-pathing product, allowing all I/O types to use 
passive paths automatically increased the number of SG interface I/Os 
initiated.  This increase caused path failures (reads/writes to NOT_READY 
devices) and ultimately panics.  

Expected Results:  A proposed patch to fix the problem is included below.  This 
was taken from the vanilla kernel.org v2.4.17 kernel.

Additional info:

--- sg.c	Fri May  3 16:06:49 2002
+++ sg.c.FIXED	Mon Oct  7 16:17:08 2002
@@ -645,6 +645,7 @@
     Scsi_Request        * SRpnt;
     Sg_device           * sdp = sfp->parentdp;
     sg_io_hdr_t         * hp = &srp->header;
+    request_queue_t	* q;
     srp->data.cmd_opcode = cmnd[0];  /* hold opcode of command */
     hp->status = 0;
@@ -680,6 +681,7 @@
     srp->my_cmdp = SRpnt;
+    q = &SRpnt->sr_device->request_queue;
     SRpnt->sr_request.rq_dev = sdp->i_rdev;
     SRpnt->sr_request.rq_status = RQ_ACTIVE;
     SRpnt->sr_sense_buffer[0] = 0;
@@ -715,7 +717,8 @@
 		(void *)SRpnt->sr_buffer, hp->dxfer_len,
 		sg_cmd_done_bh, timeout, SG_DEFAULT_RETRIES);
     /* dxfer_len overwrites SRpnt->sr_bufflen, hence need for b_malloc_len */
-    generic_unplug_device(&SRpnt->sr_device->request_queue);
+//    generic_unplug_device(&SRpnt->sr_device->request_queue);
+    generic_unplug_device(q);
     return 0;
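
For readers who want the idea behind the patch in isolation, here is a minimal user-space sketch. It is not kernel code: `struct queue`, `struct device`, `struct request`, `submit_and_complete()`, `unplug()` and `queue_request_safe()` are hypothetical stand-ins invented for illustration, and the "reuse" of the request is simulated by zeroing it. The point it tries to show is only the ordering: take the queue pointer while the request still belongs to the caller, then never touch the request again after submission.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures named in the patch. */
struct queue   { int unplug_count; };
struct device  { struct queue request_queue; };
struct request { struct device *dev; };

/*
 * Simulates a submit call whose completion path runs, and recycles the
 * request, before the call returns. Zeroing the request models the memory
 * being freed, re-allocated, and overwritten by someone else.
 */
static void submit_and_complete(struct request *req)
{
    memset(req, 0, sizeof(*req));   /* req->dev is now NULL */
}

/* Stand-in for the unplug step; just counts how often it was called. */
static void unplug(struct queue *q)
{
    q->unplug_count++;
}

/*
 * Safe ordering, mirroring the patch: cache the queue pointer before
 * submission, and use only the cached pointer afterwards.
 */
static int queue_request_safe(struct request *req)
{
    struct queue *q;

    assert(req->dev != NULL);       /* request is still ours here */
    q = &req->dev->request_queue;   /* cached before handing req off */

    submit_and_complete(req);
    /* req may have been reused; it must not be dereferenced from here on. */

    unplug(q);
    return 0;
}
```

Calling `queue_request_safe()` on a request that points at a device leaves that device's `unplug_count` at 1, even though the request itself has been wiped by the time `unplug()` runs. Evaluating `&req->dev->request_queue` after `submit_and_complete()` instead would go through the wiped `dev` pointer, which is the failure described in this report.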

Comment 1 Heather Conway 2002-11-27 16:09:45 UTC
Is there any update as to whether this fix will be included in an upcoming 

Comment 2 Heather Conway 2003-01-14 18:58:40 UTC
Will this fix be included in the e.11 errata?

Comment 3 Heather Conway 2003-02-08 18:56:24 UTC
Is there still time to get this included into the e.11 errata?  

Comment 4 Matt Domsch 2003-04-28 15:53:37 UTC
This is not fixed in 2.4.9-e.18, thus will not make Q2 quarterly update.

Comment 6 James Bottomley 2003-06-08 13:58:44 UTC
This bug has gone from being a rarely tripped curiosity to being a showstopper
for us in -e.16.  We cannot now get our test harness to run without tripping it.
 I think the prominence has risen because of some of the threading/scheduling
changes making it much more likely that the SRpnt will have been re-used before
the scsi_do_req returns.

The fix listed in this MR is obviously correct and was supplied by Doug Gilbert
to fix this very problem. Could you please just apply it?

Comment 7 Tom Coughlan 2003-06-09 20:42:00 UTC
The fix is checked in.  It is planned to ship in our next errata.  

Comment 8 Jeff Needle 2003-10-17 19:01:13 UTC
*** Bug 103685 has been marked as a duplicate of this bug. ***

Comment 9 John Flanagan 2003-12-19 19:25:56 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

