Red Hat Bugzilla – Bug 112844
cciss 2.4.49 driver panic with HP Smart Array
Last modified: 2007-11-30 17:06:53 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703
Description of problem:
The cciss 2.4.49 driver included in the 2.4.9-e.34 kernel causes a
boot-time kernel panic with certain HP Smart Array firmware revs.
Console output during boot (first line is for context):
NET4: Unix domain sockets 1.0/SMP for Linux NET 4.0.
request_module[block-major-104]: Root fs not mounted
VFS: Cannot open root device "cciss/c0d0p1" or 68:01
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 68:01
This has occurred on a Proliant DL380 G1 with a Smart Array 5304/256
firmware 2.92 and a DL380 G2 with onboard Smart Array 5i firmware
2.38. The problem is definitely the 2.4.49 cciss.o driver in the
initrd image, because replacing it with cciss_2445.o fixes things.
However, the original 2.4.49 does work after upgrading the 5304
firmware to 3.40 (I'm unable to upgrade the 5i at present). If this
firmware dependency is by design then it would be helpful to document
it in the RHSA-2003:408 announcement.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Upgrade to kernel-2.4.9-e.34 with one of the above controller revs
Actual Results: Kernel panic
Expected Results: No kernel panic
Created attachment 96752
lspci -nvv output for failing SA-5i
There is no intended dependency on the f/w version. I'll diff the
drivers and check what has changed.
I have also requested that our vendor relations manager assign this
to a test team for further investigation.
I checked the differences between the drivers and there is no
apparent reason why 2.4.49 should fail. Please send some config info
such as the number & type of controllers in the system. Are there any
drives on the embedded controller? Is the embedded running as dumb
SCSI or as cpqarray?
Does the driver load during boot? Are there any error messages when
the driver loads?
I'm guessing this happens when rebooting after the install completes.
If so, rescue the system and check /etc/modules.conf for entries for
both versions of cciss. If they exist, delete the entry for the older
driver and build a new initrd to see if that helps.
This has been assigned to a test team for investigation.
We also have this problem. It's a Compaq 5304 controller. The normal
sym53c8xx driver does NOT load, inducing the Kernel panic. I've
switched back to 2.4.9-e.27 for now (which continues to work perfectly).
Sorry, I should have said: this is Red Hat AS 2.1. Let me know via e-mail
if you need any more information. The controller is running 2 arrays
and is the sole RAID controller in the machine.
It turns out the firmware upgrade was a red herring. The problem is
that upgrading from kernel-2.4.9-e.24 (the 2.1 AS Update 2 default) to
2.4.9-e.34 fails to create an initrd image due to the alias line in
/etc/modules.conf referencing cciss_2427 (the -e.34 install even
prints the warning "No module cciss_2427 found for kernel 2.4.9-e.34"
from mkinitrd, but I'd somehow missed that). So when you boot with the
new kernel there isn't an initrd to load cciss.o from, hence the panic.
Commenting out the reference to cciss_2427 in /etc/modules.conf prior
to installing -e.34 (or -e.35, which behaves the same way) fixes the
problem. I think at some point during my troubleshooting I changed
this to reference cciss_2445 and reinstalled -e.34, which allowed it
to create an initrd - because I hadn't noticed its absence first time
round, the firmware upgrade seemed to be the only difference between a
system that booted and one that didn't.
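The failure described above (an alias that names a module the new kernel does not ship) can be checked for before rebooting. The following is a minimal sketch, not part of mkinitrd: the function name is hypothetical, and it assumes 2.4-style `.o` module files somewhere below the kernel's module directory.

```shell
#!/bin/sh
# Sketch: report scsi_hostadapter aliases whose module has no .o file
# under a given kernel's module tree.
# Usage: missing_hostadapter_modules MODULES_CONF MODULE_DIR
# (hypothetical helper; both arguments are paths supplied by the caller)
missing_hostadapter_modules() {
    conf=$1
    moddir=$2
    # Take the module name (third field) from each uncommented
    # "alias scsi_hostadapterN <module>" line.
    awk '$1 == "alias" && $2 ~ /^scsi_hostadapter[0-9]*$/ { print $3 }' "$conf" |
    sort -u |
    while read -r mod; do
        # Assumption: modules are installed as <name>.o below MODULE_DIR.
        if [ -z "$(find "$moddir" -name "$mod.o" -print 2>/dev/null)" ]; then
            echo "$mod"
        fi
    done
}
```

For example, `missing_hostadapter_modules /etc/modules.conf /lib/modules/2.4.9-e.34` would be expected to print `cciss_2427` on a system in the state described here. Any output means the initrd build for that kernel is likely to fail.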
Sorry to have misled you, and a big thank-you to the person who
e-mailed me with this suggestion (you know who you are).
I'm also experiencing the same problem. We're using the 5300
controller in DL380 G1s. I've flashed to 3.54, no good. I've tried
to use cciss.o and cciss_2445.o on scsi_hostadapter2, both result in a
kernel panic. Besides the fact that I can't fix it, e35 assigns
cciss_2427 to the controller, even though 2427 doesn't even exist in
that kernel. What's up with that?
To be precise, it's a 5302 controller. And to clarify, this is using
the e35 kernel.
Is the kernel panic you see the same as the one described above?
If so, then you need to re-make the initrd after you change
scsi_hostadapterX in modules.conf. If you already did that, or you are
seeing a different panic, then please provide more detailed information.
The way this was supposed to work is:
1) notice in the release notes that the cciss_2427 driver has been removed.
2) edit modules.conf and change cciss_2427 to cciss.
3) then update the kernel.
We will update the release notes to make this requirement explicit.
In the future this should not occur very often, because we will not be
using the driver_nnnn names as the default driver.
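Step 2 above could be scripted roughly as follows. This is only a sketch: the function name is hypothetical, it keeps a `.bak` copy next to the file, and it touches only uncommented `alias scsi_hostadapterN cciss_2427` lines.

```shell
#!/bin/sh
# Sketch: point scsi_hostadapter aliases at cciss instead of cciss_2427.
# Usage: switch_cciss_alias MODULES_CONF
# (hypothetical helper; run it before installing the new kernel package)
switch_cciss_alias() {
    conf=$1
    # Keep a backup so the change can be reverted by hand.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite only the module field; comments and other aliases are untouched.
    sed 's/^\([[:space:]]*alias[[:space:]][[:space:]]*scsi_hostadapter[0-9]*[[:space:]][[:space:]]*\)cciss_2427[[:space:]]*$/\1cciss/' \
        "$conf.bak" > "$conf"
}
```

Called as `switch_cciss_alias /etc/modules.conf`, followed by the kernel update, this mirrors the order given in the steps. If an initrd for the new kernel already exists, it still has to be rebuilt afterwards.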
Yes, the kernel panic was exactly the same. As I mentioned, I've
tried it with both the cciss and cciss_2445 (which works with e34)
drivers. Yes, I rebuilt the initrd each time.
What other "detailed information" would you like?
The exact steps I took in testing this:
1) Built server via kickstart to e30.
2) Patched everything except kernel packages.
3) Downloaded e34 UP and SMP kernels.
4) Installed e34 UP and SMP kernels (-ivh).
5) Modified modules.conf, changing cciss_2427 to cciss.
6) Built initrd for e34 UP and SMP.
7) Rebooted into kernel panic.
8) Modified modules.conf, changing cciss to cciss_2445.
9) Rebuilt initrd for e34 UP and SMP.
10) Rebooted successfully.
11) Downloaded e35 UP and SMP kernels.
12) Installed e35 UP and SMP kernels (-ivh).
13) Built initrd for e35 UP and SMP.
14) Rebooted into kernel panic.
15) Modified modules.conf, changing cciss_2445 to cciss.
16) Rebuilt initrd for e35 UP and SMP.
17) Rebooted into kernel panic.
So, as you can see, I've tried all possible permutations between the
e34 and e35. These are all on an SMP DL380-G2 (yeah, I know, why did
I build the UP?).
FWIW, I just commented out the cciss_2427 line altogether, then
reinstalled e.34 and the problem Went Away.
With regard to Tom's comments, what tripped me up was the jump from
(1) to (2). Because I hadn't explicitly enabled cciss_2427, I wrongly
assumed that its removal was irrelevant to my setup (the initial
update of a freshly installed system).
Not automatically adding cciss_nnnn to modules.conf will certainly
help, but I'd also suggest adding a check on the result of mkinitrd
and an explicit "Initial ramdisk not created, your system might not
boot" warning message. This would defend against any future problems
of this type by clearly flagging them as install failures rather than
Typo in my last post. These are DL-380 *G1* servers, as mentioned in
my previous posts.
Update. I've also tried commenting out the cciss_2427 line as Simon
suggested (rather than changing it) on the e35 kernel. Same kernel panic.
After being contacted by an HP engineer, I rebuilt the initrd without
the cpqarray driver. The e35 kernel now boots successfully, but with no
support for the integrated Smart Array controller (not acceptable).
What is the status on this? New builds on the e34-e37 kernel still
fail. The only method for fixing this is to boot into a working
kernel (e30 or older), comment out any instances of cpqarray and
cciss_24xx, and mkinitrd. This is NOT acceptable for
automated/kickstart build environments!
*** Correction ***
This does NOT work on e37 kernels. Using the same 5300 controller, I
booted into the e30 kernel, edited modules.conf to remove cpqarray and
cciss_2427 support, built new images with mkinitrd, and it boots into
a kernel panic.
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
request_module[block-major-104]: Root is not mounted
VFS: Cannot open root device "cciss/c0d0p3" or 68:03
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 68:03
The e35, e37, and e38 kernels work when all older modules are
commented out, the initrd is rebuilt, and the bootloader is edited to
load the initrd (*blush*). This still doesn't fix the fact that the
modules.conf is broken post-errata with erroneous entries for
cciss_2427 or cciss_2445.
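Both halves of that fix (an initrd that actually exists, and a bootloader entry that loads it) can be verified before rebooting. The sketch below is an illustration, not an existing tool: the function name is hypothetical, and the default paths (`/boot/initrd-<version>.img`, `/boot/grub/grub.conf`) are assumptions about a typical Red Hat layout. Pass a lilo.conf path as the third argument on LILO systems.

```shell
#!/bin/sh
# Sketch: warn if a kernel has no initrd image or no bootloader line for it.
# Usage: check_initrd KERNEL_VERSION [BOOT_DIR] [BOOTLOADER_CONF]
# (hypothetical helper; default paths are assumptions, see above)
check_initrd() {
    kver=$1
    bootdir=${2:-/boot}
    loaderconf=${3:-/boot/grub/grub.conf}
    image="$bootdir/initrd-$kver.img"
    status=0
    # An absent or empty image means the initrd build did not succeed.
    if [ ! -s "$image" ]; then
        echo "WARNING: $image not created, your system might not boot" >&2
        status=1
    fi
    # Look for an "initrd ..." (GRUB) or "initrd=..." (LILO) line naming the image.
    if ! grep '^[[:space:]]*initrd' "$loaderconf" 2>/dev/null |
         grep -qF "initrd-$kver.img"; then
        echo "WARNING: $loaderconf has no initrd line for $kver" >&2
        status=1
    fi
    return $status
}
```

A non-zero return from `check_initrd 2.4.9-e.35` would flag either of the two mistakes described in this thread as an install failure rather than leaving it to surface as a boot-time panic.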
The 6402 controller does not work with this driver. Attempting an
install on the 6402 finishes successfully, but panics on reboot.
Tried with both update2 and update3. Both 2.4.45 and 2.4.49 end in
panic. This is on a DL380 G3 with the following:
Integrated Smart Array 5i
Compaq Smart Array 6400 (v1.32)
HP BIOS P29 10/31/2003
> This still doesn't fix the fact that the modules.conf is broken
> post-errata with erroneous entries for cciss_2427 or cciss_2445.
In fact there are two problems with the cciss_nnnn entries. First, as
noted they stop mkinitrd working during an upgrade. Second, even
without upgrading they can cause the incorrect driver to be loaded in
a system with multiple CCISS devices.
I encountered this on a DL380 G3 with its onboard SA-5i and an added
SA-5304. After installing 2.1 AS Update 2 (e.24 kernel), modules.conf contained:
alias scsi_hostadapter cciss
alias scsi_hostadapter1 cciss_2427
alias scsi_hostadapter2 cciss_2427
scsi_hostadapter2 is outright bogus as discussed, but the entry for
scsi_hostadapter1 would have silently worked - with the old driver.
So please, Red Hat, deliver us from this cciss_nnnn madness: let the
admin decide if they want to load old drivers instead of doing it for
Please be aware there is an additional problem with cciss (this is in
response to Jason Dixon's note). When multiple Smart Array controllers
are installed in a server, and the first one discovered by cciss is
NOT the boot controller you get a Kernel panic.
I've even tried mounting the partitions using EXT2 labels - still no
joy. This is a real issue when you DON'T want to boot from the 5i but
still want to use it for other things.
The errors usually occur when the initrd tries to pivot_root into the live
environment. I found an entry for this bug on the HP site but couldn't
find an associated RedHat Bugzilla.
Here's the URL for the "Multiple Smart Array" problem on the HP site:
Update to my previous comment on the 6400. This is an unrelated
issue... it appears the 6400 will not work when using grub. Opening a
separate bug report...
Ok, two months have passed since the last post. What has Red Hat done
to advance this issue?
I don't know. I really enjoy being 6 patches behind on my RHES 2.1
systems because of this. I guess I will try e40 and see if perhaps
There may be multiple issues intermixed in this report. I'll address
the one that I think is at the center of this. If there are other
issues, please clarify.
We have found that when AS 2.1 U2 is installed, kudzu puts two lines in modules.conf:
alias scsi_hostadapter2 cciss
alias scsi_hostadapter3 cciss_2427
The second line is an unintended consequence of the fact that the
cciss_2427 driver calls MODULE_DEVICE_TABLE, and thereby gets listed
in the file /lib/modules/2.4.9-e.whatever/modules.pcimap. Under
certain circumstances, Kudzu refers to this file and erroneously adds
cciss_2427 to modules.conf.
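To see which modules claim a given controller in modules.pcimap, something like the following can be used. It is a rough sketch under stated assumptions: the function name is hypothetical; the file is assumed to hold one whitespace-separated entry per line with the module name, vendor ID and device ID in the first three columns; `0xffffffff` is assumed to be the wildcard value. Subvendor and subdevice columns are ignored, so this is coarser than the matching kudzu itself does.

```shell
#!/bin/sh
# Sketch: list modules whose modules.pcimap entries match a PCI vendor/device.
# Usage: pcimap_claimants PCIMAP VENDOR DEVICE   (hex IDs, with or without 0x)
# (hypothetical helper; column layout and wildcard value are assumptions)
pcimap_claimants() {
    pcimap=$1
    vendor=$2
    device=$3
    awk -v v="$vendor" -v d="$device" '
        # Normalise hex IDs: lower case, no 0x prefix, no leading zeros.
        function norm(x) { x = tolower(x); sub(/^0x/, "", x); sub(/^0+/, "", x); return x }
        BEGIN { v = norm(v); d = norm(d) }
        /^#/ || NF < 3 { next }   # skip the header comment and short lines
        {
            mv = norm($2); md = norm($3)
            if ((mv == v || mv == "ffffffff") && (md == d || md == "ffffffff"))
                print $1
        }
    ' "$pcimap" | sort -u
}
```

The IDs to pass in can be read from `lspci -n` output for the controller. If both `cciss` and `cciss_2427` appear in the result for the same device, that is consistent with the explanation above of how the older name ends up in modules.conf.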
The cciss_2427.o file is removed in AS 2.1 U3 and later kernels, so an
upgrade will fail on any system that has cciss_2427 in modules.conf.
The solution to this problem is to remove the cciss_2427 line from
modules.conf before doing the upgrade. If you are using kickstart,
you should be able to do this in a %pre section in the kickstart file.
In the future we will ensure that kudzu never puts module_nnnn files
in modules.conf. If there are versions of kudzu in the field that do
use these module names, then we will preserve these files in the
kernel, so upgrades will continue to work.
I am having problems while installing the
kernel-enterprise-2.4.9-e.49.i686.rpm in a DL 740 with AS 2.1
[root@dc10bd rhn-packages]# rpm -ivh kernel-enterprise-2.4.9-e.49.i686.rpm
No module cciss_2427 found for kernel
Any ideas?
Look at /etc/modules.conf. If "cciss_2427" is mentioned in there,
replace it with "cciss". Then try the rpm -i again.
Remove the entry for cciss_2427 from /etc/modules.conf.