Description of problem:
grub-install fails, claiming that the cylinder is not supported by the BIOS. This failure is not detected by anaconda(!), so the result of any install on a box with only SCSI controllers will be a non-bootable system!

Version-Release number of selected component (if applicable):

How reproducible:
Always: fails with both aic79xx and 3w-9xxx controllers.

Steps to Reproduce:
1. Obtain machine with no IDE/ATA/SATA drives
2. Install RHEL4

Actual results:
System is not bootable

Expected results:
System is bootable

Additional info:
*** Bug 149064 has been marked as a duplicate of this bug. ***
To give some more info, I just installed on a system with 2 3w-9xxx controllers. When it tries to reboot, *no* messages appear on the screen -- after the BIOS summary screen all you get is a blinking cursor. I booted into rescue mode using the install CD, did a 'chroot /mnt/sysimage', and tried 'grub-install /dev/sda'. It returned the error "The file /boot/grub/stage1 not read correctly".
I think the problem in my case is that the boot "drive" is >2TB. Each 3ware has 12 300GB drives on it configured as RAID5 with a hot spare. Was that your issue as well? If not, then I'll file it under a separate bug. FWIW, lilo boots this system just fine.
Marty, what's the exact error message grub-install is printing? Also, what are the sizes of the disks attached? Joshua's theory may have some credence; booting from devices larger than 2TB is not supported. This may or may not be the problem; hard to tell without more data. Can you please attach the files /boot/grub/grub.conf, /boot/grub/device.map, and /proc/partitions to this bug report?
Yes, Joshua is probably correct. Unfortunately, the machine is no longer available for me to hack on (it's in production now). The initial boot drive was 11 400GB drives in RAID5 with a hot spare, so it was definitely >2TB. However, the primary controller is now the onboard AIC-7902 with 2 36GB drives (in software RAID1), and grub failed on that in the very same manner (as Joshua documented in comment #3). Fortunately, lilo had no issue with the AIC-7902 drives -- at least running from rescue-mode -- so that's why the box is now in production. I *think* we'll be getting another similarly configured box, and if I can "steal" it to do some testing for a day or so, I'll drop a note here (I'm guessing "within about a month"). I think it is very important that anaconda catch this problem, though. That's the part that had me *really* frustrated. Had I known that grub failed, I would have immediately been able to try lilo on the 4TB 3ware array (and I'm guessing it would have worked, but I can't be sure).
Neither grub nor lilo will reliably boot from a >2TB device. Sure, lilo will boot once -- it will then happily trash your partition table. The problem is that msdos disk labels don't support >2TB, and neither grub nor lilo supports GPT labels. I'm guessing that the issue you had above had to do with drive ordering -- grub tried to install to the big array even though the SCSI mirror was there. To get my system to reboot reliably, I had to make sure the motherboard BIOS tried to boot first from the internal hard drive I put in (I've got a 2 port 3ware and a couple of WD Raptors on order) and that I put GPT labels on my arrays as well as at least one partition. Only then could I reboot the system (using grub) and have it come up with all the partitions intact.
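For reference, the 2TB figure follows from the msdos label storing sector addresses as 32-bit values. The sketch below works through that arithmetic and flags oversized entries in /proc/partitions-style text. It assumes 512-byte sectors and the usual four-column /proc/partitions layout with sizes in 1 KiB blocks; the function names are made up for illustration and are not part of grub, lilo, or anaconda.

```python
# Assumption: 512-byte sectors and 32-bit sector addresses in the msdos label.
SECTOR_SIZE = 512
MAX_SECTORS = 2 ** 32
MSDOS_LIMIT_BYTES = SECTOR_SIZE * MAX_SECTORS  # 2 TiB


def raid5_usable_bytes(drives_in_array, drive_bytes):
    """Usable RAID5 capacity: one drive's worth of space goes to parity.

    drives_in_array excludes any hot spare.
    """
    return (drives_in_array - 1) * drive_bytes


def oversized_entries(proc_partitions_text):
    """Return names from /proc/partitions-style text that exceed the limit.

    Expects lines of the form 'major minor #blocks name', where #blocks
    is counted in 1 KiB units. Header and blank lines are skipped.
    """
    names = []
    for line in proc_partitions_text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        if int(fields[2]) * 1024 > MSDOS_LIMIT_BYTES:
            names.append(fields[3])
    return names
```

By this arithmetic, an 11-drive RAID5 set of 400GB disks (10 drives of data) comes to about 4TB, well past the limit, while a 36GB mirror is nowhere near it.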
In that case, trying lilo on the 4TB drive would have also been very frustrating -- at least for the 2nd and subsequent boots. :( But now that I know about all this, I'll be prepared when I assemble the filestore / near-line backup box for home. This issue would probably have driven me mad on my own box, as I would not have gone for the option of any other drives. Thanks, Joshua! I just read up on GPT drives, and that sounds like something that the grub folks ought to be looking at.... Anyway, there are still 2 outstanding issues here. 1: anaconda did not catch grub's failure; 2: grub needs to ascertain the BIOS boot order & put the boot block on the right drive (if comment #7 is right about why grub failed on the AIC-7902 drives).
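To illustrate issue 1, here is a minimal sketch of the kind of check an installer could make after running the boot loader installer. This is not anaconda's code. It assumes the command signals failure through a non-zero exit status, which this thread does not confirm for grub-install, so it also scans the output for the error text quoted earlier in this bug as a fallback heuristic. The function name and the error list are hypothetical.

```python
import subprocess

# Heuristic fallback: error text quoted earlier in this bug report.
KNOWN_ERROR_FRAGMENTS = ("not read correctly",)


def run_bootloader_install(cmd):
    """Run cmd (a list of arguments) and return (ok, output).

    ok is False if the exit status is non-zero or if the combined
    stdout/stderr contains one of the known error fragments.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    output = proc.stdout
    has_error_text = any(frag in output for frag in KNOWN_ERROR_FRAGMENTS)
    return proc.returncode == 0 and not has_error_text, output
```

A caller would pass something like `["grub-install", "/dev/sda"]` and stop the install with a visible error when `ok` is False, rather than rebooting into a blinking cursor.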
Created attachment 113183 [details] /boot/grub/device.map
Created attachment 113184 [details] /proc/mdstat
Created attachment 113185 [details] /proc/partitions
Created attachment 113186 [details] /boot/grub/grub.conf
NEEDINFO_PM has been deprecated. Changing status to NEEDINFO and changing ownership to pm manager.
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.
Re-opening and placing in 4.6 eval queue based on Peter's comments in #26.
This bugzilla had previously been approved for engineering consideration but Red Hat Product Management is currently reevaluating this issue for inclusion in RHEL4.6.
Booting from disks >2TB is not going to work in RHEL4 (or current RHEL5, for that matter). You can work around it by exporting a second, smaller LUN to boot from.
This issue is being closed, as this is late in the RHEL 4 cycle and a fix would be too disruptive. Additionally, there is a reasonable workaround noted in comment #33.