Since you're trying to market RH as a server OS, maybe you should test it on the
common rack servers that people use it on? :-)
RH 9 fails to install because it configures /dev/sda but the
kernel/fdisk is unable to access it
(this is something that I've never seen before)
RH 8 installs fine.
The RH 9 kernel causes issues with ioapic/aic78xx/whatever
So, I spent quite a bit of time hacking the RH 8 kernel and boot modules
into the RH 9 installer
Well, that failed miserably in the same fashion, so it is somehow the RH
9 installer that wedges the system (even in expert/noprobe mode) so that
fdisk /dev/sda fails (the device is there, fdisk just can't talk to it)
I then hacked the RH 8 installer to install the RH 9 RPMs instead.
Well, that just worked like a charm.
Of course, I have no idea what, in the RH 9 installer, wedges /dev/sda,
especially since the RH 9 install still fails with the RH 8 kernel and modules.
After the installer inserts the driver for the scsi controller, if you switch to
VC2 and type 'less /tmp/syslog' does the driver show up in the kernel log? Does
the drive show up?
Sorry, yes, I should have specified that.
The driver loads and the drives show up in the SCSI detection.
I've tried this with both the raid card (megaraid driver) and without (aic7xxx).
In both cases, the respective scsi driver scans the bus, sees the devices, the
installer creates /dev/sda, but any access to it fails.
Note that it shouldn't be a driver problem, since the RH 8 kernel and driver
modules fail in the same way in the RH 9 installer, while they work with the
RH 8 installer.
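The failure mode described above (node present, but all I/O against it fails) can be told apart from a plain missing device node with a small sketch like the one below. `check_dev` is a hypothetical helper name of mine, not anything shipped with the installer:

```shell
#!/bin/sh
# Hedged sketch: distinguish "device node missing" from "node present but
# unreadable", which is the symptom reported above. check_dev is a
# made-up helper, not part of anaconda.
check_dev() {
    dev="$1"
    if [ ! -e "$dev" ]; then
        echo "$dev: missing"
    elif dd if="$dev" of=/dev/null bs=512 count=1 2>/dev/null; then
        # Reading the first sector worked -- the healthy case.
        echo "$dev: readable"
    else
        # Node exists but I/O fails -- what the RH 9 installer produces.
        echo "$dev: present but unreadable"
    fi
}
```

On the installer's VC2 shell, `check_dev /dev/sda` would presumably print the "present but unreadable" line in the broken state.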
If you get into the RHL 9 installer and go to VC2 and run parted on /dev/sda, does
it work ok?
I tried both, and even cat /dev/sda.
They all failed even though the device existed and /dev/sda was shown in dmesg
after the SCSI driver got loaded.
Ok, this is most likely a kernel issue then, if those commands do not work.
One last test - if you boot with 'linux rescue' and go into rescue mode w/o
mounting any existing filesystems does the parted /dev/sda work?
Please re-read my comments, I'm pretty sure it's not related since I
forward-ported the RH 8 install kernel to the RH 9 install floppy and second
stage loader, and it still refused to install.
The RH 9 kernel also works fine once I get it installed with the RH 8 installer.
As for parted,
1) I didn't have any partitions mounted when I did my tests since the RH 9
installer could not access my drives/raid array
2) I can't do your tests anymore. I already lost more than one week due to this
loading problem and to fixing your installer, and the machines are in
production now. I can't reboot them or re-install them.
I'm quite surprised that your QA department doesn't have a sample of the
most common rack servers that people are likely to use. You should go buy
some, or have DELL/IBM send you one of each (I'm serious).
I believe this configuration should work based on other feedback we've received,
but to be sure Jay has offered to verify it on one of our 2650's.
Michael, thanks for looking into this.
Did you find anything out?
I got more info (I think).
Apparently, the server installs if you use a CD with all the drivers, but
it fails if you use a floppy which brings up the Broadcom GigE, does DHCP,
retrieves stage 2 via NFS, and loads the SCSI modules from the stage 2 image.
The modules do load though, and the SCSI/RAID is detected.
For whatever reason, however, it does not register /dev/sda with the kernel,
so any partitioning attempt after that fails.
I haven't double checked, but _maybe_ the modules from the iso CD and the ones
in the second stage loader aren't identical.
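That hunch could be checked by checksumming both copies of the module tree and comparing the lists. A rough sketch; the function name is mine, and the real CD and stage-2 layouts may differ from whatever paths you point it at:

```shell
#!/bin/sh
# Hedged sketch: check whether two copies of the driver modules are
# byte-identical by checksumming every file in each tree. compare_modules
# is illustrative, not an installer tool.
compare_modules() {
    sums_a=$(cd "$1" && find . -type f -exec md5sum {} + | sort)
    sums_b=$(cd "$2" && find . -type f -exec md5sum {} + | sort)
    if [ "$sums_a" = "$sums_b" ]; then
        echo "identical"
    else
        echo "differ"
    fi
}
```

Running it against the modules unpacked from the boot ISO and the ones inside the stage-2 image would confirm or rule out a mismatch.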
When you boot from an ISO, modules may get loaded in a different order than
if I boot a floppy that initializes the Ethernet first, and then loads the
AIC/megaraid drivers after they've been retrieved from stage 2.
Maybe there is a subtle bug where the SCSI driver doesn't recognize the drives
if it's initialized after the Ethernet driver or some other thing that happens
first in the floppy/NFS boot path.
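If anyone can still reproduce this, the load-order theory could be tested by pulling the first mention of each driver out of /tmp/syslog on both boot paths and comparing the resulting order. A sketch, with the helper name and driver list as examples only:

```shell
#!/bin/sh
# Hedged sketch: print the order in which the given drivers first appear
# in a kernel log, so a CD boot and a floppy/NFS boot can be compared.
driver_order() {
    log="$1"; shift
    for drv in "$@"; do
        # First line number where this driver is mentioned, if any.
        n=$(grep -n "$drv" "$log" | head -n1 | cut -d: -f1)
        [ -n "$n" ] && echo "$n $drv"
    done | sort -n | awk '{print $2}'
}
```

Something like `driver_order /tmp/syslog aic7xxx megaraid tg3` on each boot path would show whether the storage drivers really come up after the NIC on the floppy route.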
I'm having a similar problem to this. I've managed a full install via CD, but
installing via NFS causes an installer crash just after the NFS mount. I can
install by FTP but this takes ages (I think it was 24 hours for a full
install). I set the IP configuration manually (I don't use DHCP for installing).
I suspect the tg3 driver myself.
The 2650 I'm using has BIOS A16, ESM 1.01, PERC 3/DC 1.92 BIOS 3.3.1, i.e. it has
four drives on a RAID controller.
I can do some tests as the machine isn't "live" yet.
Issue is getting closed out in some spring cleaning as a result of
RHL9 no longer being a supported product. Please reopen with comments
if you are continuing to see this issue with RHEL releases or Fedora.
The server I tested this on is now running RHEL AS and I had no
problems installing it with kickstart.