Bug 152336

Summary: Cannot build a low-level SCSI driver for x86_64 SMP
Product: Red Hat Enterprise Linux 4
Reporter: Jack Hammer <jack_hammer>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 4.0
CC: jbaron, pfrields, riel
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2005-03-30 14:33:48 UTC Type: ---

Description Jack Hammer 2005-03-28 17:24:38 UTC

Description of problem:
If you build the kernel from the kernel sources on the SRPM CDs and then replace only a low-level SCSI driver, the system will not boot.  This occurs ONLY with SMP and ONLY if you have manually partitioned your drives (the same driver does not fail on SMP if you took the install defaults and use LVM).

So far, I have seen this behaviour with two different SCSI drivers: AACRAID and IPS.


Version-Release number of selected component (if applicable):
kernel-2.6.9-5.ELsmp

How reproducible:
Always

Steps to Reproduce:
1. Manually partition (multiple disks) when installing RHEL 4.
2. Rebuild the kernel, copy the SMP SCSI driver to the appropriate /lib/modules/2.6.9-5.ELsmp/kernel/drivers/scsi/ directory, and run mkinitrd.
3. Reboot. Linux cannot find the second disk; boot the uniprocessor kernel to restore a working driver.
  

Actual Results:  The driver loads cleanly.

When the low-level SCSI driver reports a DASD (the second hard drive) at Channel 0 Target 1 LUN 0, the SCSI layer prints that it is at Channel 1 Target 0 LUN 0. Then, of course, it can't find it.

Again, if the system is installed with LVM and not manually partitioned, this same driver works every time.  A driver for the uniprocessor kernel (built the same way) also works every time in any configuration.

Expected Results:  The driver should not need to know whether LVM is in use, and I would expect the SCSI layer to behave the same in either case.

Additional info:

I originally discovered this using the IPS driver from the Adaptec build system, but further investigation showed that the same thing occurs with a "native" kernel build of the driver on an x86_64 system.

A similar issue has been reported by a customer using the AACRAID driver.

This currently blocks IBM/ADAPTEC's ability to support RHEL 4 in new releases of ServeRAID as we cannot build a working driver.

Comment 1 Jack Hammer 2005-03-29 15:39:40 UTC
New data (03/29/05):

The IPS RAID card always has a device at ID 15 of type Processor.  This
represents the adapter itself.  On the SMP kernel, this device does not appear
in /proc/scsi/scsi when using a driver built natively from the kernel source.
On a uniprocessor kernel, the driver built exactly the same way works and shows
the processor device.

I believe this is another symptom of the same issue.
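
One way to check for this symptom is to scan /proc/scsi/scsi for a Processor-type entry at Id 15. The following is a minimal Python sketch, not part of the original report. It assumes the usual three-line-per-device layout of /proc/scsi/scsi; the sample text, its vendor and model strings, and the function names are hypothetical.

```python
import re

# Hypothetical sample in the usual /proc/scsi/scsi layout. The vendor and
# model strings are made up for illustration only.
SAMPLE = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: SERVERAID        Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 15 Lun: 00
  Vendor: IBM      Model: SERVERAID        Rev: 1.00
  Type:   Processor                        ANSI SCSI revision: 02
"""

HOST_RE = re.compile(
    r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)"
)
TYPE_RE = re.compile(r"Type:\s+(.+?)\s+ANSI")


def parse_scsi_devices(text):
    """Return one dict per device with its host, channel, id, lun and type."""
    devices = []
    for line in text.splitlines():
        host = HOST_RE.search(line)
        if host:
            # A "Host:" line starts a new device record.
            devices.append({
                "host": host.group(1),
                "channel": int(host.group(2)),
                "id": int(host.group(3)),
                "lun": int(host.group(4)),
                "type": None,
            })
            continue
        dev_type = TYPE_RE.search(line)
        if dev_type and devices:
            # The "Type:" line belongs to the most recent "Host:" line.
            devices[-1]["type"] = dev_type.group(1)
    return devices


def has_processor_device(devices, scsi_id=15):
    """True if a Processor-type device is listed at the given SCSI Id."""
    return any(
        d["id"] == scsi_id and d["type"] == "Processor" for d in devices
    )


devices = parse_scsi_devices(SAMPLE)
print(has_processor_device(devices))
```

On a live system the same functions could be fed the contents of /proc/scsi/scsi instead of SAMPLE. The channel, id and lun fields also make it possible to compare what the SCSI layer reports against what the driver is expected to expose.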

Comment 2 Jack Hammer 2005-03-30 14:02:54 UTC
IMPORTANT ! ! ! ! !

I may have found it.  Long story made short - maybe my fault - user error.

I will fill in details when I get it all sorted out - that may take a while.

Please do not burn any cycles on this issue until I get back to you.  Hopefully,
that will be to close this issue as a user error ....

Thanks.

Comment 3 Jack Hammer 2005-03-30 14:33:24 UTC
USER ERROR:  

When creating the source package (rpmbuild), I did not specify the
architecture parameter. I have no idea why this would cause this behaviour in
SMP and allow the uniprocessor version to work, but it was definitely the cause.
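
For reference, rpmbuild accepts a --target option that selects the build architecture. Below is a minimal Python sketch of a helper that always makes the architecture explicit when assembling the command line. The report does not give the exact command that was used, so the -bp stage, the spec file name, and the helper name are assumptions for illustration only.

```python
import platform


def rpmbuild_command(spec_path, arch=None, stage="-bp"):
    """Build an rpmbuild argument list with an explicit --target.

    If no architecture is given, fall back to the architecture of the
    running machine as reported by platform.machine().
    """
    target = arch or platform.machine()
    return ["rpmbuild", stage, "--target=" + target, spec_path]


# "kernel-2.6.spec" is a placeholder spec file name, not taken from the report.
print(" ".join(rpmbuild_command("kernel-2.6.spec", arch="x86_64")))
```

The helper only constructs the argument list and runs nothing, so the command can be reviewed before it is passed to something like subprocess.run.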