Bug 77565

Summary: [gdth] NULL pointer dereference in scsi.c (scsi_release_commandblocks)
Product: [Retired] Red Hat Linux Reporter: Need Real Name <dtoman>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.0   
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2002-11-16 11:07:03 UTC Type: ---

Description Need Real Name 2002-11-09 10:47:20 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823
Netscape/7.0

Description of problem:
The request_queue.queue_lock member of the Scsi_Device structure is
uninitialised, which causes scsi_release_commandblocks() to crash when it tries
to access the lock.
scsi_release_commandblocks() is called from scsi_free_host_dev(), which calls
blk_cleanup_queue() first; that call fills the pointer to the lock with zeros.
(I think blk_cleanup_queue() should be moved after scsi_release_commandblocks();
at least that seems to work for me.)
The problem is similar to the one (already solved in kernel-2.4.18-17.8.0) with
scsi_build_commandblocks() called from scsi_get_host_dev(), where the
initialization of the Scsi_Device structure was moved before
scsi_build_commandblocks() to avoid problems with a NULL pointer to the lock.
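
The ordering problem described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: every model_* name and the simplified structures are hypothetical stand-ins, and the memset() in model_cleanup_queue() merely mimics the report's statement that blk_cleanup_queue() zeroes the lock pointer.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types (hypothetical, not the real ones). */
typedef struct { int locked; } model_lock_t;

struct model_request_queue {
    model_lock_t *queue_lock;          /* pointer to the lock guarding the queue */
};

struct scsi_device_model {
    struct model_request_queue request_queue;
    int has_cmdblocks;
};

static model_lock_t model_lock;

/* Puts the device into the state it has before teardown. */
static void model_init(struct scsi_device_model *dev)
{
    dev->request_queue.queue_lock = &model_lock;
    dev->has_cmdblocks = 1;
}

/* Models blk_cleanup_queue(): wipes the queue, including queue_lock. */
static void model_cleanup_queue(struct model_request_queue *q)
{
    memset(q, 0, sizeof(*q));
}

/* Models scsi_release_commandblocks(): it has to take the queue lock.
 * Returns -1 at the point where the kernel would dereference NULL. */
static int model_release_commandblocks(struct scsi_device_model *dev)
{
    model_lock_t *lock = dev->request_queue.queue_lock;

    if (lock == NULL)
        return -1;                     /* the kernel would oops here */
    lock->locked = 1;                  /* stands in for taking the lock */
    dev->has_cmdblocks = 0;
    lock->locked = 0;                  /* stands in for dropping the lock */
    return 0;
}

/* Order described in the report: cleanup first, then release. */
static int model_free_host_dev_buggy(struct scsi_device_model *dev)
{
    model_cleanup_queue(&dev->request_queue);
    return model_release_commandblocks(dev);
}

/* Order suggested in the report: release first, then cleanup. */
static int model_free_host_dev_fixed(struct scsi_device_model *dev)
{
    int rc = model_release_commandblocks(dev);

    model_cleanup_queue(&dev->request_queue);
    return rc;
}
```

In this model the buggy order reports the failure because the lock pointer is already NULL when the command blocks are released, while the suggested order releases them with a valid lock and only then clears the queue.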

Note: in my case the problem didn't affect normal SCSI disk operation. Everything
seems to work fine except the RAID array management (see below).

Version-Release number of selected component (if applicable):
Default RH 8.0 and RH 7.3 kernels tested (both oops in scsi_build_commandblocks
and scsi_release_commandblocks).

How reproducible:
Always

Steps to Reproduce:
1. Install an Intel SRCU32 RAID controller (we are using it in an Intel SHG2
motherboard with two 2.4 GHz Xeon CPUs).
2. Run an SMP kernel (tested with the default ones from RH 8.0 and RH 7.3). The
uniprocessor kernel is not affected by the problem (I don't know why; perhaps
the locking is different?).
3. Run the Intel storage control utility (storcon, shipped with the SRCU32 on a
CD). After you select how you want to connect (local/remote), the kernel oopses
on the default RH kernels (scsi_build_commandblocks crashes). On the upgraded
kernel the problem occurs when the blocks are being released.
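
One possible explanation for the SMP-only behaviour in step 2, offered as an assumption rather than a confirmed diagnosis: on uniprocessor kernel builds without spinlock debugging, the spinlock operations typically do not touch the lock word at all, so a NULL lock pointer is never dereferenced. The sketch below models that difference in user space; model_spin_lock_smp() and model_spin_lock_up() are hypothetical names, not kernel functions.

```c
#include <stddef.h>

typedef struct { volatile int locked; } model_spinlock_t;

/* SMP-style lock: it must write to the lock word, so a NULL pointer is fatal.
 * Returns -1 at the point where real code would fault. */
static int model_spin_lock_smp(model_spinlock_t *lock)
{
    if (lock == NULL)
        return -1;
    lock->locked = 1;
    return 0;
}

/* UP-style lock: there is no other CPU to exclude, so the pointer is
 * ignored and never dereferenced. */
static int model_spin_lock_up(model_spinlock_t *lock)
{
    (void)lock;
    return 0;
}
```

With a NULL pointer the UP-style variant succeeds and the SMP-style variant fails, which would match the symptoms reported here if the assumption about the kernel build holds.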

Note: every time the SMP kernel starts, it writes a NULL dereference report to
the message logs (just before the login prompt is displayed), and the same
problem occurs when the kernel is shutting down (when deinitialising the 'md'
devices).

The driver in use (gdth) works fine; the RAID disks can be accessed without
problems.

Additional info:

Comment 1 Arjan van de Ven 2002-11-09 10:49:24 UTC
this is supposed to be fixed in the testkernel at
http://people.redhat.com/arjanv/testkernels/

Comment 2 Arjan van de Ven 2002-11-16 11:07:03 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2002-262.html