Description of problem:
After Red Hat 9.0 is installed, I cannot do a grub-install to the hard disk without getting a bogus stage2 file. I can, however, do a grub-install to a floppy with no problems.

How reproducible:
Do an install without a boot loader and then do a grub-install afterwards. I have tried it both before and after applying updates, with the same results.

Steps to Reproduce:
1. Load 9.0 on an AMD Intel clone machine.
2. Pick NO boot loader when loading the OS, but create a raw boot floppy.
3. After Red Hat 9.0 (AMD or Intel) is up and running, do a grub-install, picking the first hard drive's boot sector for the boot loader.

I usually have Windows 2000 or Windows XP on the first partition of the first drive, booting from the first drive, and a / partition as the first partition on the second drive. In that case I type:

    grub-install --force-lba '(hd0)'

If I have a /boot partition I usually type:

    grub-install --force-lba --root-directory=/boot '(hd0)'

I have tried it with and without the --force-lba parameter, with no difference.

Actual results:
Usually the boot hangs when it hits the bogus stage2 file. Tonight, for the first time, the file actually worked! Beats me why it worked this time, since I had to back off to the earliest version of the BIOS to get the install to stop hanging (MSI 6330 - K7T Pro2, Ver. 1 motherboard) and then bring the BIOS from Version 2.4 back up to Version 3.5 after the OS was installed. There is something really strange going on there! Still, I copied my good stage2 file on there and things went much better. With the bad stage2 file in place I saw all kinds of strange stuff (garbled screen, etc.) that disappeared when I put a copy of my stage2 file on there.

Expected results:
GRUB should boot like a charm. I don't expect it to give me a menu unless I create a grub.conf file and a menu.lst pointer to it in /boot/grub, but I don't expect it to hang either! Even more puzzling is that it WORKS when I do a grub-install to a floppy. I don't have to do a thing!

Additional info:
The bogus stage2 file is the same size as /usr/share/grub/i386-redhat/stage2, but they are not the same file. There is ONE byte that is different, at hex 211: the 0 becomes a 1. This seems to be very consistent. I even wrote a utility program called hexcmp to look at things; its offsets correlate with the way hexedit works (I got tired of doing the base 10 to base 16 conversion all of the time for cmp). I wish I had not emulated cmp completely, though. Instead of offsets it would be nice to just specify a size (the same for both files).
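The source of hexcmp is not shown in this report, so the following is only a minimal Python sketch of the kind of comparison described above, not the reporter's program. It assumes 0-based hexadecimal offsets (the way hexedit displays them) and adds the optional size limit the reporter wished for. The names `hex_diff` and `format_diff` are hypothetical.

```python
def hex_diff(data_a: bytes, data_b: bytes, size=None):
    """Return (offset, byte_a, byte_b) for every differing byte.

    Offsets are 0-based, as a hex editor shows them. If `size` is
    given, only the first `size` bytes of each input are compared.
    Comparison stops at the end of the shorter input.
    """
    if size is not None:
        data_a, data_b = data_a[:size], data_b[:size]
    return [
        (offset, a, b)
        for offset, (a, b) in enumerate(zip(data_a, data_b))
        if a != b
    ]


def format_diff(diffs):
    """Render differences as 'offset byte_a byte_b', all in hex."""
    return [f"{offset:x} {a:x} {b:x}" for offset, a, b in diffs]


def diff_files(path_a, path_b, size=None):
    """Convenience wrapper: compare two files on disk."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return format_diff(hex_diff(fa.read(), fb.read(), size))
```

Calling `diff_files("stage2.good", "stage2.bad", size=0x400)` (file names are placeholders) would list only the differences within the first 1 KiB of the two files.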
I think I have given you everything you need. I don't have time to wander down through grub-install to find the problem. Usually people migrate their Windows 2K, XP, etc. to a new hard drive and I install Red Hat on their old drive after blowing a meg of zeros onto the start of it. If the machine doesn't sport a chip in the GHz range (or fairly close to it) I may make a primary /boot partition to be sure of no problems, but usually I make just a / and a swap (both primary, on IDE), and then a /home and a smallish FAT32 partition for passing files between the two operating systems (both inside an extended partition). The problem has never been the lack of a /boot. It is always a bad stage2 GRUB boot file.
What's written to the hard disk is not supposed to be exactly the same as what's in /usr/share/grub -- minor changes are made during the embedding process to account for various geometry and other settings changes. Also, I have no problem doing this here. Does it work if you let the installer set up the boot loader instead of doing it by hand?
First off, I KNOW what is written into the stage2 file is not the vanilla version down in /usr/share/grub/i386-redhat. I wandered through the grub-install script enough to learn that. I just didn't want to take any more time on it. I am assuming you are now getting a lot of people who are real novices with Linux and Unix but want to look at it as a possible replacement for Windows machines, with all of the viruses and worms that are hitting the Windows world. Microsoft down more than after September 11th 2001? They are using Linux cache servers in front of their W2K web servers to handle the update load? Sheesh! I will bet there are some unhappy people in Redmond right now. This is going to be long-winded.

First, don't ask me to put GRUB on at OS install time. Some of the machines I have installed Linux on are the dual-boot Windows 98/2000 systems my friend keeps churning out every 3-5 months as he upgrades; the people who get them from him ask me to put Linux on for them, and doing it that way toasts them! It is tricky, and I have to install GRUB AFTER I install the OS, with judicious use of hexedit to make things work.

Second, I don't even have the proverbial fdisk to do a fdisk /mbr if I remove Linux later on for some people. What I do is dd the original first few kilobytes off of /dev/hda so I can put them back later and boot only Windows (or Windows 98/2000 for those unlucky people buying my friend's older systems) again. For the Windows 98/2000 machines, the fdisk /mbr won't work anyway. It toasts them. You can boot to Linux with your boot floppies but that is it. They have different boot loader code in the first 512 bytes than a normal Windows machine. The way I have to set them up is to have GRUB either boot Linux or load the Windows boot loader/chooser that picks between 98 & 2000. If you like mucking around, go ahead and do it. I am getting tired of it. I won't install Linux for them any more if they have one of those stupid combo 98/2000 jobs. What a pain in the arse!
I don't care if they use just Windows 2000, Windows XP, or Windows 2003 (they can take their pick - I will work with any one of them individually), I will install Linux for them. No more double Windows OS machines for me!

The stage2 file you get at OS load time is bigger than the one you get with grub-install. Here are the sizes of the various stage2 files I got when I installed GRUB:

    106364 fpstage2.gandalf
    106364 fpstage2.oscar
    130340 stage2                    # this one is on my machine
    106364 stage2.GRUB_install.BAD
    130340 stage2.iggy

And since we are not battling crackers, their checksums via sum:

    59415   104 fpstage2.gandalf
    59415   104 fpstage2.oscar
    51654   128 stage2                   # my machine - gandalf
    05584   104 stage2.GRUB_install.BAD
    61612   128 stage2.iggy              # iggy has a /boot and
                                         # I hand edited the file
                                         # He also has Windows 98
                                         # and 2000 with RH 9.0

The two "fpstag*" files were generated by grub-install on floppies on two machines that have the same partitioning (/, /home, SWAP). I only make a /boot partition (which must be a primary to be bootable) on older systems where I have my doubts about how far into the drive I can go. I boot FreeBSD so far past the 1024th cylinder on my own machine it is pathetic. Try cylinder 3000+!

The stage2 file is the one on my machine that was generated by GRUB at install time. The stage2.GRUB_install.BAD file was generated by a grub-install to the hard drive AFTER the OS was installed on a machine named oscar (sitting next to gandalf right now).
It matches the size of the floppies' stage2 files but is slightly different, as this comparison shows:

    [hhhobbit@gandalf grubby]$ hexcmp -l fpstage2.oscar \
        stage2.GRUB_install.BAD
    1f0 f8 0
    1f4 b8 0     # locations generated to match hexedit
    1f7 b 0      # the hexedit program was written by me
    1f8 df 2
    1fc 17 cf
    22b 20 0

Even though the bad stage2 file generated for the hard drive is the same size as the one on the floppies, if you replace the stage2 file on a floppy with the bad stage2 file from the hard drive, GRUB will not load! It hangs! How oscar strangely booted from the hard drive I will never know. I didn't try it twice. I made a copy of oscar's stage2 file (stage2.GRUB_install.BAD) and replaced it with my stage2 file. oscar has been loading fine since then. What I have been doing is just giving people my stage2 file and calling it a day.

Most of the machines I work with now can boot past the 1024 cylinder boundary, so I normally don't make that piddly little /boot any more, which MUST be a primary partition. Somebody should be writing something some place that says the boot area MUST be a primary partition. I have yet to get a logical partition to boot. But I have never read that tidbit of wisdom any place. I learned it the hard way!

If I have a /boot partition, my grub-install command is (assuming Windows is on the first drive, with Linux being put on either the first or the second drive):

    [root@XXX /]# grub-install --force-lba --root-directory=/boot '(hd0)'

and if I have /boot as part of /:

    [root@XXX /]# grub-install --force-lba '(hd0)'

I can't make a copy of my first CD, but I assume that has nothing to do with this problem. I just thought you should know that.
Just in case that makes any difference NOW, I will give you the MD5 checksums of the generic stage1 and stage2 files down in /usr/share/grub/i386-redhat:

    [hhhobbit@gandalf tmp]$ cd /usr/share/grub/i386-redhat
    [hhhobbit@gandalf i386-redhat]$ md5sum stage1
    0e83fbc85ce5d216136331fde744dd81  stage1
    [hhhobbit@gandalf i386-redhat]$ md5sum stage2
    00b3fbf9c51743718602215e99c62b22  stage2

I am not going to bother with a checksum of grub-install. If it runs, it is probably fine, and it does generate bootable stage2 files for floppies.

HHH
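For anyone wanting to cross-check figures like the sum and md5sum output quoted in this comment, here is a minimal Python sketch. It assumes the sum command in use was the classic BSD-style one (a 16-bit rotating checksum reported with 1024-byte blocks), which is consistent with the block counts shown: 106364 bytes rounds up to 104 blocks and 130340 bytes to 128. The function names are hypothetical, and nothing here comes from grub-install itself.

```python
import hashlib


def bsd_sum(data: bytes, block_size: int = 1024):
    """Return (checksum, blocks) using the BSD 16-bit rotating checksum."""
    checksum = 0
    for byte in data:
        # Rotate the 16-bit checksum right by one bit, then add the byte.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    # Block count is the size rounded up to whole blocks.
    blocks = (len(data) + block_size - 1) // block_size
    return checksum, blocks


def describe(data: bytes) -> str:
    """Format the checksum like BSD sum, followed by the MD5 digest."""
    checksum, blocks = bsd_sum(data)
    return f"{checksum:05d} {blocks:5d} md5={hashlib.md5(data).hexdigest()}"


def describe_file(path) -> str:
    """Convenience wrapper: checksum a file on disk."""
    with open(path, "rb") as handle:
        return describe(handle.read())
```

Running `describe_file` on two copies of stage2 gives a quick way to tell whether they differ, though it does not say where; a byte-by-byte comparison is still needed for that.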
WOOPS! I wrote the hexcmp program, not hexedit.
I don't know whether this is in any way related, but today my machine could no longer GRUB-boot from its first hard disk (I have dual IDE disks for software mirror RAID). Using a GRUB boot floppy I empirically find that some files in /boot can no longer be 'found' by GRUB on (hd0), though they show up if you 'tab' after the find command. Unfortunately, one of these files is 'stage1'. D'oh. 'stage1' is 'find'-ed without problem on (hd1). There goes my Saturday night. Could there be a problem with the stage 1.5 part? I will open a separate bug report if necessary....
Red Hat apologizes that these issues have not been resolved yet. We do want to make sure that no important bugs slip through the cracks.

Red Hat Linux 7.3 and Red Hat Linux 9 are no longer supported by Red Hat, Inc. They are maintained by the Fedora Legacy project (http://www.fedoralegacy.org/) for security updates only. If this is a security issue, please reassign to the 'Fedora Legacy' product in bugzilla. Please note that Legacy security update support for these products will stop on December 31st, 2006.

If this is not a security issue, please check if this issue is still present in a current Fedora Core release. If so, please change the product and version to match, and check the box indicating that the requested information has been provided.

If you are currently still running Red Hat Linux 7.3 or 9, please note that Fedora Legacy security update support for these products will stop on December 31st, 2006. You are strongly advised to upgrade to a current Fedora Core release or Red Hat Enterprise Linux or comparable. Some information on which option may be right for you is available at http://www.redhat.com/rhel/migrate/redhatlinux/.

Any bug still open against Red Hat Linux 7.3 or 9 at the end of 2006 will be closed 'CANTFIX'. Again, if this bug still exists in a current release, or is a security issue, please change the product as necessary. We thank you for your help, and apologize again that we haven't handled these issues to this point.
Red Hat Linux 7.3 and Red Hat Linux 9 are no longer supported by Red Hat, Inc. If you are currently still running Red Hat Linux 7.3 or 9, you are strongly advised to upgrade to a current Fedora Core release or Red Hat Enterprise Linux or comparable. Some information on which option may be right for you is available at http://www.redhat.com/rhel/migrate/redhatlinux/. Closing as CANTFIX.