Red Hat Bugzilla – Bug 108059
Last modified: 2007-04-18 12:58:47 EDT
Description of problem:
Cannot do a grub-install to the hard disk after Red Hat 9.0 is installed
without getting a bogus stage2 file. I can do a grub-install to floppy with no
problems.
Do an install without a boot loader and then do a grub-install afterwards. I
have done it either before or after updates with the same results.
Steps to Reproduce:
1. Load 9.0 on an AMD or Intel clone machine.
2. Pick NO boot loader when loading the OS but create a raw boot floppy.
3. After Red Hat 9.0 (AMD or Intel) is up and running do a grub-install,
picking the first hard drive's boot sector for the boot loader.
I usually have Windows 2000 or Windows XP on the first drive, first
partition, booting on the first drive, and a / partition as the first
partition on the second drive. In that case I type:
grub-install --force-lba '(hd0)'
If I have a /boot partition I usually type:
grub-install --force-lba --root-directory=/boot '(hd0)'
I have tried it with and without the --force-lba parameter with no difference.
Usually the boot hangs when it hits the bogus stage2 file. I finally had my
first time that the file actually worked tonight! Beats me why it worked this
time since I had to back off to the earliest version of the BIOS to get the
install to stop hanging (MSI 6330 - K7T Pro2, Ver. 1 motherboard) and then up
the BIOS from Version 2.4 back to Version 3.5 after the OS was installed. There
is something real strange going on there! Still, I copied my good stage2 file on
there and things went much better. I saw all kinds of strange stuff with the bad
stage2 file (garbled screen, etc) on there that disappeared when I put a copy of
my stage2 file on there.
I expect it to boot GRUB like a charm. I don't expect it to give me a menu unless I
create a grub.conf file and point a menu.lst link to it in /boot/grub, but I don't
expect it to hang either! Even more puzzling is that it WORKS when I do a
grub-install to a floppy. I don't have to do a thing!
The bogus stage2 file is the same size as the one in the
/usr/share/grub/i386-redhat/stage2 file, but they are not the same file.
There is ONE byte that is different at hex 211. The 0 becomes a 1. This
seems to be very consistent. I even wrote a utility program called hexcmp
to look at things that correlates with the way hexedit works (got tired of doing
the base 10 to base 16 conversion all of the time for cmp). I wish I had not
emulated cmp completely though. Instead of offsets it would be nice to just
specify a size (same for both files).
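For anyone who wants to repeat the comparison without my hexcmp, here is a minimal Python sketch of the same idea. It is not the hexcmp source; the names hexcmp, format_diffs and hexcmp_files and the optional size argument are illustrative assumptions. It reports zero-based offsets and byte values in hex, which is how hexedit displays them, and the size argument is the "just specify a size" variant wished for above.

```python
from typing import List, Optional, Tuple


def hexcmp(a: bytes, b: bytes, size: Optional[int] = None) -> List[Tuple[int, int, int]]:
    """Return (offset, byte_a, byte_b) for every position where a and b differ.

    Offsets are zero-based. Only the common length of the two inputs is
    compared, optionally capped at `size` bytes.
    """
    limit = min(len(a), len(b))
    if size is not None:
        limit = min(limit, size)
    return [(i, a[i], b[i]) for i in range(limit) if a[i] != b[i]]


def format_diffs(diffs: List[Tuple[int, int, int]]) -> List[str]:
    """Render each difference as 'offset byte_a byte_b', all in hex."""
    return [f"{off:x} {x:x} {y:x}" for off, x, y in diffs]


def hexcmp_files(path_a: str, path_b: str, size: Optional[int] = None) -> List[str]:
    """Compare two files on disk and return the formatted difference lines."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return format_diffs(hexcmp(fa.read(), fb.read(), size))
```

Two files that differ only by a 0 turning into a 1 at hex 211 would come back from hexcmp_files as the single line "211 0 1".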
I think I have given you everything you need. I don't have time to wander down
through grub-install to find the problem. Usually people migrate their Windows
2K, XP, etc. to a new hard drive and I install Red Hat on their old drive after
blowing a meg of zeros on to the start of it. If the machine doesn't sport a
chip in the GHz range (or fairly close to it) I may make a primary /boot
partition to be sure of no problems, but usually I make just a /, a swap (both
of those primary on IDE) and then a /home and a smallish FAT32 partition for
passing files between the two operating systems (both part of an extended
partition). It has never been a problem of not having a /boot. It is always a
bad stage2 GRUB boot file.
What's written to the hard disk is not supposed to be exactly the same
as what's in /usr/share/grub -- minor changes are made during the
embedding process to account for various geometry and other settings
changes. Also, I have no problem doing this here.
Does it work if you let the installer set up the boot loader instead
of doing it by hand?
First off, I KNOW what is written into the stage2 file is not the
vanilla version down in /usr/share/grub/i386-redhat. I wandered
through the grub-install script enough to learn that. I just didn't
want to take any more time on it. I am assuming you are now getting a
lot of people who are real novices with Linux and Unix but want to
look at it as a possible replacement for Windows machines with all of
the viruses and worms that are hitting the Windows world. Microsoft
down more than after September 11th 2001? They are using Linux cache
servers in front of their W2K web servers to handle the update load?
Sheesh! I will bet there are some unhappy people in Redmond right now.
This is going to be long winded. First, don't ask me to put on GRUB
at OS install time. My friend churns out a dual-boot Windows 98/2000
system every 3-5 months as he upgrades, and the people who get them
from him ask me to put Linux on for them. Those are some of the
machines I have installed Linux on, and doing it that way toasts them!
It is tricky and I have to install GRUB AFTER I install the OS with
judicious use of hexedit to make things work. Second, I don't even
have the proverbial fdisk to do a fdisk /mbr if I remove Linux later
on for some people. What I do is dd the original few Kilobytes off of
/dev/hda to put back to boot to only Windows (or Windows 98/2000 for
those unlucky people buying my friend's older systems) again later on.
For the Windows 98/2000 machines, the fdisk /mbr won't work anyway. It
toasts them. You can boot to Linux with your boot floppies but that is
it. They have different boot loader code in the first 512 bytes than a
normal Windows machine. The way I have to set them up is have GRUB
boot either Linux or load the Windows boot loader/chooser that picks
between 98 & 2000. If you like mucking around, go ahead and do it.
I am getting tired of it. I won't install Linux for them any more if
they have one of those stupid combo 98/2000 jobs. What a pain in the
arse! I don't care if they use just Windows 2000, Windows XP, or
Windows 2003 (they can take their pick - I will work with any one of
them individually), I will install Linux for them. No more double
Windows OS machines for me!
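The dd trick mentioned above (saving the first few kilobytes of /dev/hda so they can be put back later) can be sketched in Python as follows. This is only an illustration: the function names and the 4096-byte default are assumptions, since the exact count used with dd is not given here, and writing to a real disk needs root and can make a machine unbootable if the wrong device is named.

```python
def save_boot_area(device_path: str, backup_path: str, nbytes: int = 4096) -> int:
    """Copy the first nbytes of device_path into backup_path.

    Returns the number of bytes actually saved. The 4096 default is a
    hypothetical stand-in for "a few kilobytes".
    """
    with open(device_path, "rb") as dev:
        data = dev.read(nbytes)
    with open(backup_path, "wb") as out:
        out.write(data)
    return len(data)


def restore_boot_area(backup_path: str, device_path: str) -> int:
    """Write a saved boot area back over the start of device_path.

    Opens the target in r+b mode so nothing past the restored bytes is
    touched or truncated. Returns the number of bytes written.
    """
    with open(backup_path, "rb") as src:
        data = src.read()
    with open(device_path, "r+b") as dev:
        dev.write(data)
    return len(data)
```

Keep in mind that the first 512 bytes also hold the partition table, so a restore like this puts back the partition layout as it was when the backup was taken.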
The size of the stage2 file you get at OS load time is bigger
than the one you get with grub-install. Here is the size of the
various stage2 files I got when I installed GRUB:
130340 stage2 # this one is on my machine
And since we are not battling crackers, their checksums via sum:
59415 104 fpstage2.gandalf
59415 104 fpstage2.oscar
51654 128 stage2 # my machine - gandalf
05584 104 stage2.GRUB_install.BAD
61612 128 stage2.iggy # iggy has a /boot and
# I hand edited the file
# He also has Windows 98
# and 2000 with RH 9.0
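For reading the numbers above: assuming sum was run in its traditional BSD-style default mode (the two-column output of checksum and block count suggests so), the first column is a 16-bit rotating checksum and the second is the file size in 1024-byte blocks, rounded up. A minimal Python sketch of that algorithm, with bsd_sum as an illustrative name:

```python
from typing import Tuple


def bsd_sum(data: bytes, block_size: int = 1024) -> Tuple[int, int]:
    """Return (checksum, blocks) in the style of the BSD sum algorithm.

    The checksum is rotated right by one bit within 16 bits before each
    byte is added; blocks is the size in block_size units, rounded up.
    """
    checksum = 0
    for byte in data:
        # Rotate right by one bit inside a 16-bit word.
        checksum = (checksum >> 1) + ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    blocks = (len(data) + block_size - 1) // block_size
    return checksum, blocks
```

Under that assumption, a 130340 byte stage2 comes out as 128 blocks, and a block count of 104 means a file between 105473 and 106496 bytes.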
The two "fpstag*" files were generated by grub-install on floppies on
two machines that have the same partitioning (/, /home, SWAP). I only
make a /boot partition (which must be a primary to be bootable) on
older systems where I have my doubts about how far into the drive I
can go. I boot FreeBSD so far past the 1024th cylinder on my own
machine it is pathetic. Try cylinder 3000+!
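For a rough idea of where the 1024 cylinder boundary sits, here is a small arithmetic sketch. It assumes the common translated BIOS geometry of 255 heads and 63 sectors per track with 512-byte sectors; that geometry is an assumption, since the actual geometry of these drives is not stated here, and the function names are illustrative.

```python
def cylinder_offset_bytes(cylinder: int, heads: int = 255,
                          sectors_per_track: int = 63,
                          sector_size: int = 512) -> int:
    """Byte offset at which the given cylinder starts, for the assumed geometry."""
    return cylinder * heads * sectors_per_track * sector_size


def chs_limit_bytes(cylinders: int = 1024, heads: int = 255,
                    sectors_per_track: int = 63,
                    sector_size: int = 512) -> int:
    """Bytes addressable below the cylinder limit, for the assumed geometry."""
    return cylinder_offset_bytes(cylinders, heads, sectors_per_track, sector_size)
```

With those defaults chs_limit_bytes() is 8422686720 bytes, a little under 8 GiB, and cylinder 3000 would start roughly 24.7 GB into the drive.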
The stage2 file is the one that is on my machine that was generated by
grub at install time. The stage2.GRUB_install.BAD file was generated
by a grub-install to the hard drive AFTER the OS was installed on a
machine named oscar (sitting next to gandalf right now). It matches
the size of the floppy's stage2 files but is slightly different as
this comparison shows:
[hhhobbit@gandalf grubby]$ hexcmp -l fpstage2.oscar \
1f0 f8 0
1f4 b8 0 # locations generated to match hexedit
1f7 b 0 # the hexedit program was written by me
1f8 df 2
1fc 17 cf
22b 20 0
Even though the bad stage2 file generated for the hard drive is the
same size as the one on the floppies, if you replace the stage2 files
on the floppies with the bad stage2 file on the hard drive, GRUB will
not load! It hangs! How oscar strangely booted from the hard drive I
will never know. I didn't try it twice. I made a copy of oscar's
stage2 file (stage2.GRUB_install.BAD), and replaced it with my stage2
file. oscar has been loading fine since then.
What I have been doing is just giving people my stage2 file and
calling it a day. Most of the machines I work with now can boot past
the 1024 cylinder boundary so I normally don't make that piddly little
/boot any more which MUST be a primary partition. Somebody should be
writing something some place that says the boot area MUST be a primary
partition. I have yet to get a logical to boot. But I have never read
that tidbit of wisdom any place. I learned it the hard way!
If I have a /boot partition my grub install command is (assuming
Windows is on the first drive, Linux being put either on the first or
the second drive):
[root@XXX /]# grub-install --force-lba --root-directory=/boot '(hd0)'
and if I have /boot as part of /:
[root@XXX /]# grub-install --force-lba '(hd0)'
I can't make a copy of my first CD but I assume that has nothing to do
with this problem. I just thought you should know that. Just in case
that makes any difference NOW I will give you the MD5 checksum of the
generic stage2 file down in /usr/share/grub/i386-redhat:
[hhhobbit@gandalf tmp]$ cd /usr/share/grub/i386-redhat
[hhhobbit@gandalf i386-redhat]$ md5sum stage1
[hhhobbit@gandalf i386-redhat]$ md5sum stage2
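Anyone who wants to check their own copies of these files the same way without md5sum can use a few lines of Python. This is a minimal sketch using the standard hashlib module; md5_of_file and the chunk size are illustrative choices, and the result is the same hex digest that md5sum prints in its first column.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 hex digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, md5_of_file("/usr/share/grub/i386-redhat/stage2") can be compared between two machines that should have the same grub package installed.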
I am not going to bother with a check sum of grub-install. If it runs,
it is probably fine, and it does generate bootable stage2 files for
I wrote the hexcmp program, not hexedit.
I don't know whether this is in any way related but today my
machine could no longer GRUB-boot from its first harddisk (I have
a dual IDE for software mirror RAID). Using a GRUB boot floppy I
empirically find that some files in /boot can no longer be 'found'
by Grub on (hd0) though they show up if you 'tab' after the find
command. Unfortunately, one of these files is 'stage1'. D'oh.
'stage1' is 'find'-ed without problem on (hd1). There goes my
Saturday night. Could there be a problem with the 1.5 stage part?
I will open a separate bug report if necessary....
Red Hat apologizes that these issues have not been resolved yet. We do want to
make sure that no important bugs slip through the cracks.
Red Hat Linux 7.3 and Red Hat Linux 9 are no longer supported by Red Hat, Inc.
They are maintained by the Fedora Legacy project (http://www.fedoralegacy.org/)
for security updates only. If this is a security issue, please reassign to the
'Fedora Legacy' product in bugzilla. Please note that Legacy security update
support for these products will stop on December 31st, 2006.
If this is not a security issue, please check if this issue is still present
in a current Fedora Core release. If so, please change the product and version
to match, and check the box indicating that the requested information has been
provided.
If you are currently still running Red Hat Linux 7.3 or 9, please note that
Fedora Legacy security update support for these products will stop on December
31st, 2006. You are strongly advised to upgrade to a current Fedora Core release
or Red Hat Enterprise Linux or comparable. Some information on which option may
be right for you is available at http://www.redhat.com/rhel/migrate/redhatlinux/.
Any bug still open against Red Hat Linux 7.3 or 9 at the end of 2006 will be
closed 'CANTFIX'. Again, if this bug still exists in a current release, or is a
security issue, please change the product as necessary. We thank you for your
help, and apologize again that we haven't handled these issues to this point.
Red Hat Linux 7.3 and Red Hat Linux 9 are no longer supported by Red Hat, Inc.
If you are currently still running Red Hat Linux 7.3 or 9, you are strongly
advised to upgrade to a current Fedora Core release or Red Hat Enterprise Linux
or comparable. Some information on which option may be right for you is
available at http://www.redhat.com/rhel/migrate/redhatlinux/.
Closing as CANTFIX.