Bug 75669 - SG queue function getting null pointer
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i686 Linux
Priority: medium Severity: medium
Assigned To: Tom Coughlan
QA Contact: Brian Brock
Duplicates: 103685
Blocks: 87937
Reported: 2002-10-10 17:44 EDT by Heather Conway
Modified: 2007-11-30 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2003-12-19 14:25:56 EST
Description Heather Conway 2002-10-10 17:44:06 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)

Description of problem:
Sometimes memory from the kernel's dynamic heap is freed, reallocated, and
overwritten between the time an sg interface I/O is queued and the time the
queuing function completes. When that happens, the queuing function reads a
null pointer, which causes the system to panic.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
When working with our multipathing product, allowing all I/O types to use 
passive paths automatically increased the number of SG interface I/Os 
initiated.  This increase caused path failures (reads/writes to NOT_READY 
devices) and ultimately panics.  

Expected Results:  A proposed patch to fix the problem is included below.  This 
was taken from the vanilla kernel.org v2.4.17 kernel.

Additional info:

--- sg.c	Fri May  3 16:06:49 2002
+++ sg.c.FIXED	Mon Oct  7 16:17:08 2002
@@ -645,6 +645,7 @@
     Scsi_Request        * SRpnt;
     Sg_device           * sdp = sfp->parentdp;
     sg_io_hdr_t         * hp = &srp->header;
+    request_queue_t	* q;
 
     srp->data.cmd_opcode = cmnd[0];  /* hold opcode of command */
     hp->status = 0;
@@ -680,6 +681,7 @@
     }
 
     srp->my_cmdp = SRpnt;
+    q = &SRpnt->sr_device->request_queue;
     SRpnt->sr_request.rq_dev = sdp->i_rdev;
     SRpnt->sr_request.rq_status = RQ_ACTIVE;
     SRpnt->sr_sense_buffer[0] = 0;
@@ -715,7 +717,8 @@
 		(void *)SRpnt->sr_buffer, hp->dxfer_len,
 		sg_cmd_done_bh, timeout, SG_DEFAULT_RETRIES);
     /* dxfer_len overwrites SRpnt->sr_bufflen, hence need for b_malloc_len */
-    generic_unplug_device(&SRpnt->sr_device->request_queue);
+//    generic_unplug_device(&SRpnt->sr_device->request_queue);
+    generic_unplug_device(q);
     return 0;
 }
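
The patch above reads the request queue pointer before the request is handed
off, instead of dereferencing SRpnt afterwards. The user-space sketch below
illustrates that ordering only. It is not kernel code: the struct names and
the functions submit_and_recycle(), queue_fixed() and queue_unfixed() are
hypothetical stand-ins, and the reuse of the request is simulated by clearing
its device pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct queue   { int unplug_count; };
struct device  { struct queue request_queue; };
struct request { struct device *dev; };

/* Simulates the command completing, and the request being recycled,
 * before the submit call returns to its caller. */
static void submit_and_recycle(struct request *req)
{
    req->dev = NULL;
}

static void unplug(struct queue *q)
{
    q->unplug_count++;
}

/* Ordering used by the patch: save the queue pointer while the request
 * is still owned by the caller, then never touch the request again. */
static int queue_fixed(struct request *req)
{
    struct queue *q;

    assert(req->dev != NULL);
    q = &req->dev->request_queue;
    submit_and_recycle(req);
    unplug(q);
    return 0;
}

/* Pre-patch ordering: the request is read after submission. This sketch
 * returns -1 where the kernel would dereference the stale pointer. */
static int queue_unfixed(struct request *req)
{
    submit_and_recycle(req);
    if (req->dev == NULL)
        return -1;
    unplug(&req->dev->request_queue);
    return 0;
}
```

Calling queue_fixed() on a request whose dev points at a live device unplugs
that device's queue even though the request is recycled during submission.
Calling queue_unfixed() on the same setup reports the stale read instead.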
Comment 1 Heather Conway 2002-11-27 11:09:45 EST
Is there any update as to whether this fix will be included in an upcoming 
errata?
Thanks.
Comment 2 Heather Conway 2003-01-14 13:58:40 EST
Will this fix be included in the e.11 errata?
Thanks.
Comment 3 Heather Conway 2003-02-08 13:56:24 EST
Is there still time to get this included into the e.11 errata?  
Comment 4 Matt Domsch 2003-04-28 11:53:37 EDT
This is not fixed in 2.4.9-e.18, and thus will not make the Q2 quarterly update.
Comment 6 James Bottomley 2003-06-08 09:58:44 EDT
This bug has gone from being a rarely tripped curiosity to being a showstopper
for us in -e.16.  We cannot now get our test harness to run without tripping it.
I think the prominence has risen because some of the threading/scheduling
changes make it much more likely that the SRpnt will have been reused before
scsi_do_req returns.

The fix listed in this MR is obviously correct and was supplied by Doug Gilbert
to fix this very problem.  Could you please just apply it?
Comment 7 Tom Coughlan 2003-06-09 16:42:00 EDT
The fix is checked in.  It is planned to ship in our next errata.  
Comment 8 Jeff Needle 2003-10-17 15:01:13 EDT
*** Bug 103685 has been marked as a duplicate of this bug. ***
Comment 9 John Flanagan 2003-12-19 14:25:56 EST
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2003-408.html
