Bug 113402 - ips driver too old for IBM e325 (AMD x86_64)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact: Brian Brock
URL: http://dag.wieers.com/attic/ips/
Whiteboard:
Depends On:
Blocks:
Reported: 2004-01-13 16:32 UTC by Dag Wieers
Modified: 2007-11-30 22:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-12-03 02:16:36 UTC
Target Upstream Version:
Embargoed:


Description Dag Wieers 2004-01-13 16:32:16 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; Galeon) Gecko/20031202 Galeon/1.3.11a

Description of problem:
The default ips driver version is 6.00.00.
The needed ips driver version is 6.11.xx.

The default ips driver causes the following kernel errors:

    Jan  6 14:32:14 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00
    Jan  6 14:32:14 dev-srv5 kernel:   Type:   Direct-Access    ANSI SCSI revision: 02
    Jan  6 14:32:20 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
    Jan  6 14:32:40 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 4 lun 0
    Jan  6 14:32:40 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00
    Jan  6 14:32:40 dev-srv5 kernel:   Type:   Direct-Access    ANSI SCSI revision: 02
    Jan  6 14:32:46 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
    Jan  6 14:33:06 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 5 lun 0

... ad infinitum...



Version-Release number of selected component (if applicable):
kernel-2.4.21-4.EL

How reproducible:
Always

Steps to Reproduce:
1. Install Red Hat Enterprise Linux 3 (x86_64) on an e325 with a ServeRAID controller.

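Actual Results:  The driver loops over the disks, resetting the controller and setting each device offline.

Expected Results:  The disks are detected and usable.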

Additional info:

This is a more detailed report of the problem:

- Software:
OS: Red Hat AS 3 with updates

- Hardware:
IBM eServer 325 with 2 internal hard disks
(with an onboard LSI SCSI RAID controller)
ServeRAID 6M RAID controller
EXP400 external storage
--> 7 disks per machine in use, in a non-cluster configuration


The problem:
The disks cannot be accessed with the AMD64 ips driver.

This can be seen in the logs:

Jan  6 14:30:30 dev-srv5 kernel: scsi1 : IBM PCI ServeRAID 6.11.07  Build 2224 <ServeRAID 6M>
Jan  6 14:30:36 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
Jan  6 14:30:56 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 0 lun 0
Jan  6 14:30:56 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00
Jan  6 14:30:56 dev-srv5 kernel:   Type:   Direct-Access    ANSI SCSI revision: 02
Jan  6 14:31:02 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
Jan  6 14:31:10 dev-srv5 kernel: Error 13 while decompressing splash screen.
Jan  6 14:31:22 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 1 lun 0
Jan  6 14:31:22 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00
Jan  6 14:31:22 dev-srv5 kernel:   Type:   Direct-Access    ANSI SCSI revision: 02
Jan  6 14:31:28 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
Jan  6 14:31:48 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 2 lun 0
Jan  6 14:31:48 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00
Jan  6 14:31:48 dev-srv5 kernel:   Type:   Direct-Access    ANSI SCSI revision: 02
Jan  6 14:31:51 dev-srv5 login(pam_unix)[1195]: session opened for user root by LOGIN(uid=0)
Jan  6 14:31:54 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
Jan  6 14:32:14 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 3 lun 0
Jan  6 14:32:14 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00
Jan  6 14:32:14 dev-srv5 kernel:   Type:   Direct-Access    ANSI SCSI revision: 02
Jan  6 14:32:20 dev-srv5 kernel: (ips0) Reset Request - Flushed Cache
Jan  6 14:32:40 dev-srv5 kernel: scsi: device set offline - not ready or command retry failed after host reset: host 1 channel 0 id 4 lun 0
Jan  6 14:32:40 dev-srv5 kernel:   Vendor: IBM       Model: SERVERAID  Rev: 1.00

During boot, the machine waits while the driver loads.
After a while (approx. 1 hour) the waiting stops, and the disks are still
not recognised.

The strange thing about the logs is that the individual drives are listed,
even though all of them are configured as one logical drive.
The driver also seems to probe channel 0, although this channel is not
used (all drives are external, in the EXP400).
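
To make the pattern in the logs easier to see, here is a minimal Python sketch that lists which host/channel/id/lun combinations were set offline. It is only an illustration, not part of the original report: the function name offline_devices and the sample text are assumptions, and the regular expression is written against the message format quoted above only.

```python
import re
from collections import Counter

# Matches the "device set offline" message once wrapped lines are joined.
# Written against the message text quoted in this report only.
OFFLINE_RE = re.compile(
    r"device set offline.*?host (\d+) channel (\d+) id (\d+) lun (\d+)"
)


def offline_devices(log_text):
    """Return a Counter of (host, channel, id, lun) tuples set offline."""
    # Collapse all whitespace so a message split across lines still matches.
    flat = " ".join(log_text.split())
    return Counter(
        tuple(int(n) for n in m.groups()) for m in OFFLINE_RE.finditer(flat)
    )


# Two messages copied from the log above, wrapped as in the original paste.
sample = """\
Jan  6 14:30:56 dev-srv5 kernel: scsi: device set offline - not ready
or command retry failed after host reset: host 1 channel 0 id 0 lun 0
Jan  6 14:31:22 dev-srv5 kernel: scsi: device set offline - not ready
or command retry failed after host reset: host 1 channel 0 id 1 lun 0
"""

for device, count in sorted(offline_devices(sample).items()):
    print(device, count)
```

Run against the full log, this should show one entry per SCSI id on channel 0, which matches the observation that the driver walks the individual drives instead of the single logical drive.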


Note:

The 32-bit version of Red Hat AS 3 works perfectly on these machines with
all controllers and drives.
That would be a fallback if everything else fails, but we would like to
avoid it.
dmesg on the 32-bit Red Hat version correctly shows the SCSI controller and
the EXP400 as the device (not the individual drives).

Comment 1 Tom Coughlan 2004-01-15 13:49:08 UTC
The ips driver is updated in U1 from 6.00.26 to 6.10.52. The new
driver adds support for AMD64 and should solve your problem. Please
try U1 (it is expected to ship very soon) and let us know if it solves
the problem.

Comment 2 Dag Wieers 2004-01-15 14:56:21 UTC
Let me add that the problem is easily fixed by using the driver disk
that is included on the latest IBM ServeRAID CD. You can find this at:

    http://www-306.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-495PES

or

    http://www-306.ibm.com/pc/support/site.wss/document.do?lndocid=DRVR-MATRIX

Since the binary module is not easy to extract, it is advised
to rebuild the included sources for your specific kernel, using the
attached Makefile and kernel-ver.c program.

Just untar the ips tarball from the IBM ServeRAID CD into a *new*
directory and add the Makefile and kernel-ver.c to this directory. Make
sure you have the kernel-source package installed for the kernel you
want to build for, and type:

    make clean all install

or, to specify a different kernel:

    make clean all install KERNEL_VER="2.4.xx-yyy"

Prebuilt kernel-module packages will soon be available from:

    http://dag.wieers.com/packages/kernel-module-ips/

PS: Since I'm unable to attach files here (Bugzilla first complains about
the user and, after logging in again, about "No files attached"), I'm
linking to my website instead:

    http://dag.wieers.com/attic/ips/

Comment 3 Dag Wieers 2004-01-29 14:33:36 UTC
The latest driver included on the x86_64 (AMD64) U1 CD does not work
with this ServeRAID adapter. During installation the module is loaded
and the following lines are printed:

    (ips0) Flushing Cache.
    (ips0) Flushing Complete.
    scsi : 2 hosts left

and the extra disks are _not_ attached and can't be seen/used.

After rebooting the server, the ips driver does work with proper output:

    (ips0) Flushing Cache.
    (ips0) Flushing Complete.
    scsi : 1 host left.
    scsi1 : IBM PCI ServeRAID 6.10.52  Build 563 <ServeRAID 6M>
    ...

The latest driver includes this changelog for the 6.x series:

/* 6.00.00  - Add 6x Adapters and Battery Flash
/* 6.10.00  - Remove 1G Addressing Limitations
/* 6.11.xx  - Get VersionInfo buffer off the stack !   DDTS 60401
/* 6.11.xx  - Make Logical Drive Info structure safe for DMA DDTS 60639

And for reference, the output with the latest driver is:

    (ips0) Flushing Cache.
    (ips0) Flushing Complete.
    scsi : 1 host left.
    scsi1 : IBM PCI ServeRAID 6.11.07  Build 2224 <ServeRAID 6M>

Thus we're unable to install Red Hat Advanced Server on an e325 with a
recent ServeRAID card. (We can, however, install it on the onboard SCSI
controller using the 'mptscsih' module.)

Use the procedure described above to compile your own ips module.
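
To check which ips version a given boot actually loaded, the banner line can be parsed. The following is a rough Python sketch under stated assumptions: the helper names driver_version and meets_minimum are made up for illustration, the banner format is taken from the two banner lines quoted in this report, and the 6.11 minimum is the version this report says is needed, not an official requirement.

```python
import re

# Banner format as quoted in this report, e.g.
#   scsi1 : IBM PCI ServeRAID 6.11.07  Build 2224 <ServeRAID 6M>
BANNER_RE = re.compile(r"IBM PCI ServeRAID (\d+)\.(\d+)\.(\d+)")


def driver_version(banner):
    """Return the driver version as a tuple of ints, or None if absent."""
    m = BANNER_RE.search(banner)
    return tuple(int(part) for part in m.groups()) if m else None


def meets_minimum(banner, minimum=(6, 11)):
    """True if the banner reports at least the given version prefix.

    The default (6, 11) reflects the "6.11.xx" requirement stated in this
    report; it is an assumption, not a vendor-documented threshold.
    """
    version = driver_version(banner)
    return version is not None and version[:len(minimum)] >= minimum


ibm_cd = "scsi1 : IBM PCI ServeRAID 6.11.07  Build 2224 <ServeRAID 6M>"
u1_cd = "scsi1 : IBM PCI ServeRAID 6.10.52  Build 563 <ServeRAID 6M>"

print(driver_version(ibm_cd), meets_minimum(ibm_cd))
print(driver_version(u1_cd), meets_minimum(u1_cd))
```

Comparing tuples of integers avoids the pitfall of comparing version strings character by character.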

Comment 4 Ernie Petrides 2004-12-03 02:16:36 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-017.html


