Red Hat Bugzilla – Bug 87509
Dual Xeon, Supermicro P4DC6+, adaptec SCSI RAID 5, installs ok, reboot, kernel panics (unable to mount root FS)
Last modified: 2007-04-18 12:52:30 EDT
Description of problem:
I built a system with a Supermicro P4DC6+ motherboard, dual Xeon 2.2 GHz
processors, an Adaptec 2005S ZCR RAID card, 4 Quantum SCA 36GB HDs, and 1024MB RAM.
I have attempted (about a hundred times) to install several different versions
of Red Hat, including 7.3 (+ errata updates) and 8.0. Every single time, no
matter what I do during installation, or what boot parameters I give the
kernel, I get the exact same behavior:
The installation process completes just fine, and the system goes for reboot...
on boot, it goes through and does the hardware detection, finding the cdrom,
the floppy, etc. At this point, on this and all subsequent reboots, I get the
exact same error message:
loading scsi_mod module
kmod failed: /sbin/kmod -k -s block-major-8, errno=2
VFS: cannot open root device on "sda2" or 08:02
please append a correct "root=" boot option
Kernel Panic: VFS: unable to mount root FS on 08:02
However, if I boot to the "linux rescue" or to a boot disk, booting is fine,
and it mounts my sysimage, and I can access all my mounted partitions,
including / and /boot. The failure ONLY occurs on a normal boot, and it occurs
when choosing either the "2.4.18-3SMP" or the "2.4.18-3" kernel from LILO or GRUB.
Now, through the course of all my investigations, I have determined the following:
1. The dpt_i2o driver (which takes the place of the generic aic7xxx SCSI
driver) is the appropriate driver (according to Adaptec) for the RAID SCSI
subsystem. I confirmed this by watching the "rescue" and "boot disk" kernels
load: they report loading the dpt_i2o driver, and everything works fine. I do
not, however, see the dpt_i2o driver mentioned during a normal boot. I suspect
that is because the boot fails too early, though it would make sense that the
driver needs to load earlier in order to mount the root FS.
2. in /etc/modules.conf, I do see:
alias scsi_hostadapter dpt_i2o
so I know the installation is telling the kernel to load the correct SCSI
driver. I do not know where the "loading scsi_mod module" line comes from
in the boot process, but it appears right after the line about "RedHat Nash"
starting, so maybe it comes from there?
3. I have found hundreds of newsgroup posts on this subject, most of
which hint that maybe the Red Hat installation is not correctly building the
dpt_i2o driver into the kernel, and that the system is then trying to load the
generic "block-major-8" SCSI driver instead, which fails to give access to the
root FS. Other posts hint that "loading scsi_mod module" followed by the kmod
failure means it's trying to load the SCSI driver as a module from the
not-yet-mounted FS, so it obviously can't access a driver on an FS that isn't
mounted yet.
4. I have tried about every possible kernel boot parameter out there,
like "apic", "noapic", "noapm", and a whole host of others: turning off APM
power modes, forcing single proc, etc. None have any effect whatsoever.
5. I have however found several different listings from people who HAVE
successfully installed redhat on nearly identical configurations (motherboard,
etc), and I was very careful to check all the hardware compatibility lists for
any known conflicts before venturing in this direction.
6. I also updated the motherboard's bios to the latest rev, 1.2c.
7. I found several newsgroup posts indicating that maybe "hyperthreading" was causing
a problem, so I disabled that. No go. I have stripped down the system
completely, taking out the NIC, swapping video cards, taking down the system to
one processor, taking out the RAID card (and installing to just one SCSI hd),
disabling the SCSI subsystem all together (and trying to install to an IDE hd).
None of this has made any difference in solving my problem!
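A minimal Python sketch of the check behind points 2 and 3, assuming (as I understand it) that Red Hat's mkinitrd picks the SCSI drivers for the initrd from the scsi_hostadapter aliases in /etc/modules.conf. The function name and the eth0 line in the sample are made up for illustration; only the dpt_i2o alias comes from my actual file.

```python
import re

# Matches "alias scsi_hostadapter <driver>" and numbered variants
# such as scsi_hostadapter1.
ALIAS_RE = re.compile(r"^\s*alias\s+scsi_hostadapter\d*\s+(\S+)")


def scsi_hostadapter_modules(modules_conf_text):
    """Return the driver names aliased to scsi_hostadapter, in file order."""
    modules = []
    for line in modules_conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        match = ALIAS_RE.match(line)
        if match:
            modules.append(match.group(1))
    return modules


# Hypothetical modules.conf contents; only the second line is from the report.
sample_conf = "alias eth0 e1000\nalias scsi_hostadapter dpt_i2o\n"
expected = scsi_hostadapter_modules(sample_conf)
print(expected)
```

If the list printed here names a driver that never shows up during a normal boot, that would point at the initrd rather than at modules.conf.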
I have spent hours on the phone with Red Hat support, with Adaptec support, and
now with Supermicro tech support. Everyone seems to be baffled as to what might
be the cause. When a default install onto hardware which DOES otherwise work
with other OSes, like Microsoft's, persistently produces this error, it seems
to me that it has to be some bug or undocumented
I would certainly appreciate some guidance on this. I'm at the end of my rope
and my neck is on the line with my employer to figure this out!
Version-Release number of selected component (if applicable):
kernel-2.4.18-3SMP and kernel-2.4.18-3
Steps to Reproduce:
1. Install Redhat 7.3 or 8.0
3. choose either SMP or -up Kernel (from either GRUB or LILO)
Actual results:
Booting began as normal, detecting the CD-ROM, floppy, etc. It gets to the line
that says RedHat Nash is starting, then fails with the kernel panic quoted above.
Expected results:
The booting should have continued, mounting the root FS on /dev/sda2.
Additional info:
This system has the Supermicro P4DC6+ motherboard, with dual 2.2 GHz Xeons,
1024MB RAM, the Adaptec 2005S ZCR RAID card, and 4 Quantum SCA 36GB HDs. The
SCSI disks are set up as a RAID-5 array, with about 105 GB in the array. I am
partitioning the array as follows:
/dev/sda1   /boot     128 MB
/dev/sda2   /        3000 MB
/dev/sda4   /usr    10000 MB
/dev/sda5   /tmp     1000 MB
/dev/sda6   /home    5000 MB
/dev/sda7   /swap    2048 MB
/dev/sda8   /var    85300 MB
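As a quick sanity check on that layout, here is a small Python sketch that adds up the partition sizes and compares them with the nominal RAID-5 capacity of four 36GB disks. It assumes the usual rule that RAID-5 gives up one disk's worth of space to parity; the helper name is made up, and the controller's own overhead and MB/GB rounding are ignored.

```python
# Partition sizes in MB, copied from the table above.
partitions_mb = {
    "/dev/sda1": 128,
    "/dev/sda2": 3000,
    "/dev/sda4": 10000,
    "/dev/sda5": 1000,
    "/dev/sda6": 5000,
    "/dev/sda7": 2048,
    "/dev/sda8": 85300,
}


def raid5_usable_gb(disk_count, disk_gb):
    """Nominal RAID-5 capacity: one disk's worth of space goes to parity."""
    if disk_count < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return (disk_count - 1) * disk_gb


total_mb = sum(partitions_mb.values())
print(total_mb)                # 106476
print(raid5_usable_gb(4, 36))  # 108
```

The total of 106476 MB is in line with the "about 105 GB" the array reports and sits under the nominal 108 GB, so the partitioning itself does not look oversized.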
What's the output of `/sbin/mkinitrd -v -f /tmp/initrd.test 2.4.18-3smp` if you
boot into rescue mode and run it chrooted into /mnt/sysimage?
modules: ./kernel/drivers/scsi/scsi_mod.o ./kernel/drivers/scsi/sd_mod.o ./kerne
using loopback device /dev/loop1
/sbin/nash -> /tmp/initrd.ZQF4K3/bin/insmod
Loading module scsi_mod
Loading module sd_mod
Loading module dpt_i2o
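The "Loading module" lines are the part of this output that shows what ends up in the initrd. A minimal Python sketch for checking them, assuming the line format pasted above; the function names and the list of required modules are illustrative, not part of mkinitrd.

```python
import re

# One "Loading module <name>" line per module, as in the pasted output.
LOAD_LINE = re.compile(r"^\s*Loading module\s+(\S+)\s*$")


def modules_loaded(verbose_output):
    """Return module names from 'Loading module X' lines, in order."""
    names = []
    for line in verbose_output.splitlines():
        match = LOAD_LINE.match(line)
        if match:
            names.append(match.group(1))
    return names


def missing_modules(verbose_output, required):
    """Return the required modules that the output never mentions."""
    loaded = set(modules_loaded(verbose_output))
    return [name for name in required if name not in loaded]


sample = (
    "using loopback device /dev/loop1\n"
    "Loading module scsi_mod\n"
    "Loading module sd_mod\n"
    "Loading module dpt_i2o\n"
)
print(missing_modules(sample, ["scsi_mod", "sd_mod", "dpt_i2o"]))  # []
```

An empty list means all three modules were put into the test initrd, so the remaining question is whether the initrd used at boot matches it.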
Do you see the dpt_i2o module loaded when you boot? Does it find the drives?
Created attachment 90851
JPG screen shot of the first page shown during a normal (non-rescue) boot...
Sorry this one is so blurry, something weird happened in the conversion... I
tried adjusting the colors to make it somewhat possible to make out the words
(don't stare too long, it'll hurt your eyes)... you can make out things like the
"kswapd" and "apm disabled - apm not SMP safe" and the "PIIX4" lines referring
to loading IDE stuff right before it recognizes the CDROM on the IDE bus.
Created attachment 90852
JPG screen shot of the next page shown during a normal (non-rescue) boot...
This one is quite a bit clearer (for some weird reason) and you can make out
most of what's happening, including the RAMdisk call and the "md" driver
loading and detecting the RAID arrays and disks. Farther down, RedHat Nash
starts, then VFS mounts root (as ext2), then "Loading scsi_mod module", then it
continues with the kernel panic error I originally posted.
So, based on what I see in those screen shots, as well as what I see on the
screen in person, I do not see any reference to the "dpt_i2o" driver being
loaded, but I do see references to "scsi_mod" and "md" (I am wondering whether
that is the same as sd_mod)... however, as I mentioned before, during the
rescue boot and the boot-disk boot, the blue text-GUI screen pops up (shortly
after the point where the kernel panic happens in the regular boot) and
refers to "Loading dpt_i2o"...
I figured it would be easier to give you the screen shots (blurry or not) to
show you where in the boot process the error occurs, because I am not familiar
enough with it to understand all of what I'm seeing.
So it's been 9 days since my last post, and I still haven't heard back from you
all. Do you need more information? What can I do?
It's been a month, and I still haven't heard from anyone... what's going on,
have you all given up on this problem? What can I do to get a resolution to it?
Does the new initrd (/ newer kernel erratas) work any better?
Well, about a month ago, we decided (after trying everything else and having no
luck) to try to install Windows on the machine, and had similar problems with
its install. That led us to believe there was some hardware problem. After
swapping in and out everything else we could, we were left with trying
different RAM. We bought 4 256MB chips of RAM (to replace 2 512MB chips and 2
CRIMMs). Supermicro indicated that this board would accept up to 512MB chips,
but the memory vendor we purchased from said that the board should only be used
with 256MB chips. So we bought those, installed them, and subsequent installs
of RedHat and Windows worked flawlessly.
To me this indicates either corrupt RAM chips (which I have no way to test) or
a flaw in how Supermicro lists the specs for that motherboard. In either case,
apparently the root of the problem was RAM related, and had nothing to do with
the OS. I do apologize for mistakenly reporting it to you as a bug.
I wish there was some way for an OS installation (which all along was going
just fine, with the problem occurring only on boot AFTER a successful install) to
detect the hardware problem instead of the problem surfacing only on the first
boot. BUT, Microsoft's installations were just hanging in
the middle, without even a boot, so I guess it's not something I could expect
of either OS. <shrugs> Oh well, thanks anyway.