Red Hat Bugzilla – Bug 112892
Booting by GRUB fails, possibly problem with ext2 filesystem access
Last modified: 2007-04-18 13:00:57 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5)
Description of problem:
* Machine running RH9.0. Standard PC, it's a Siemens-Fujitsu desktop
machine, nothing fancy.
* Two identical IDE harddisks: hd0 is master on IDE1, hd1 is slave
on IDE2. Harddisks are 76319MB (=74.53GB) Western Digital
WDC WD800JB-00CRA1. (They run hot!)
* If the kernel is running, the two disks form a Linux software mirror
RAID (i.e. I use /dev/md devices)
* No multiboot, no nothing. Just GRUB with software RAID. (N.B. I
did not get LILO to work in that configuration, otherwise I would
be running LILO.)
The first partition is mapped to /boot. An excerpt:
/dev/hda1 + /dev/hdd1 == /dev/md0 mounted on /boot    (primary partitions)  511.844 MB
/dev/hda2 + /dev/hdd2 == /dev/md1 mounted on /        (primary partitions)  2047.99 MB
/dev/hda3 + /dev/hdd3 == /dev/md2 mounted on /var/db  (primary partitions)  6143.98 MB
/dev/hda5 + /dev/hdd5 == /dev/md3 mounted on /usr
The filesystem on all disks is ext3.
Recently the kernel has been upgraded from
2.4.20-8 -> 2.4.20-20.9 through up2date. No problem has been
encountered during the reboot at that time. No further changes
were made to the system prior to the fatal reboot.
I just thought I would reboot the server. Big mistake! GRUB is
launched, but stops. The screen just says "GRUB". No GRUB
error messages are displayed. End of story.
It looks like GRUB's stage1 can be loaded and executed off the disk
MBR. Loading stage 1.5 and/or stage 2 seems to fail. As the RAID is
not yet running at boot time, we are in the situation where this is
a 'standard boot' from (hd0), stage 1.5 is the 'e2fs_stage1_5',
and stage2 should go looking for grub.conf in (hd0,0)/grub
(i.e. /boot/grub). Which does not happen.
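The assumption that stage1 is intact in the MBR can be checked from any Linux system that can read the disk. The following is only a minimal sketch in Python 3: the function name inspect_boot_sector is made up for illustration, and the search for the text "GRUB" assumes that stage1 stores the string it prints on screen. It says nothing about the state of stage 1.5 or stage2.

```python
SECTOR_SIZE = 512


def inspect_boot_sector(sector: bytes) -> dict:
    """Report two coarse facts about the first sector of a disk."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    sector = sector[:SECTOR_SIZE]
    return {
        # The BIOS boot signature is the last two bytes of the sector.
        "boot_signature": sector[510:512] == b"\x55\xaa",
        # Assumption: stage1 embeds the literal text it prints at boot.
        "mentions_grub": b"GRUB" in sector,
    }
```

A hypothetical use would be to open /dev/hda in binary mode as root, read the first 512 bytes, and pass them to inspect_boot_sector.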
I thought something might have messed up stage 1 (i.e. the MBR) and
decided to reinstall from a GRUB boot floppy, created as described
in the GRUB homepage:
This gives me the GRUB commandline. As I wanted to reinstall GRUB,
I needed the file 'stage1'. Now things get weird: GRUB did not
'find' that file.
We switch to the /boot partitions on both mirrored disks, to verify
they are there:
root (hd0,0) --> "ext2fs, type 0xfd"
root (hd1,0) --> "ext2fs, type 0xfd"
We want to find the file 'stage1'
find /grub/stage1 --> "(hd1,0)/grub/stage1"
This is bad: GRUB finds that file only on the second harddisk, not
on the first. The same happens with some other files e.g.
"vmlinuz" "kernel.h" ".module-info". Others can be found on both
harddisks, e.g. "os2_d.b" "chain.b" "boot.b".
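For readers matching the GRUB names to the Linux devices listed earlier: GRUB counts partitions from 0, Linux from 1. A small Python 3 sketch of the translation follows. The helper name grub_to_linux is hypothetical, and the device map is an assumption taken from this report's cabling (hd0 = hda, hd1 = hdd). GRUB numbers disks in BIOS order, so the map changes as soon as a disk is removed or swapped.

```python
import re

# Assumption from the report: hd0 is the IDE1 master, hd1 the IDE2 slave.
DEVICE_MAP = {"hd0": "/dev/hda", "hd1": "/dev/hdd"}


def grub_to_linux(spec: str, device_map=DEVICE_MAP) -> str:
    """Translate a GRUB name such as '(hd0,0)' into a Linux device path."""
    match = re.fullmatch(r"\((hd\d+)(?:,(\d+))?\)", spec.strip())
    if match is None:
        raise ValueError(f"not a GRUB disk/partition name: {spec!r}")
    disk, partition = match.groups()
    linux_disk = device_map[disk]
    if partition is None:
        return linux_disk  # whole disk, e.g. '(hd0)'
    # GRUB partition numbers are zero-based, Linux ones start at 1.
    return f"{linux_disk}{int(partition) + 1}"
```

With this map, grub_to_linux("(hd1,0)") gives /dev/hdd1, i.e. the second half of /dev/md0.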
GRUB can also 'list' the directories correctly. I remove the second
harddisk, then re-enter the GRUB commandline, then:
lists the file 'vmlinuz' for example, but
will result in 'Error 15: File not found'.
To verify the files are actually there, I mount the disks using the
Linux rescue console (using the RedHat CD1), then look around.
The /boot partitions on both disks do not present any anomalies.
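"Looking around" could be made more systematic by comparing checksums of the two /boot halves once both are mounted read-only. The following is a rough Python 3 sketch; the function names and any mount points passed in are hypothetical. Note that such a comparison goes through the Linux filesystem driver, so a clean result does not show what GRUB's own filesystem code sees on the disk.

```python
import hashlib
from pathlib import Path


def tree_digests(root) -> dict:
    """Map each regular file under root (relative path) to its SHA-256."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        # Symlinks are skipped; only regular files are hashed.
        if path.is_file() and not path.is_symlink():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(left, right) -> dict:
    """List files missing on one side or differing in content."""
    a, b = tree_digests(left), tree_digests(right)
    return {
        "only_left": sorted(a.keys() - b.keys()),
        "only_right": sorted(b.keys() - a.keys()),
        "differ": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

For example, compare_trees("/mnt/boot_hda", "/mnt/boot_hdd") would report any file present on only one half of the mirror, assuming those two mount points exist.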
Does GRUB have trouble with the ext3 filesystem? The symptoms above
would be explained if it could not properly access it. The bad disk
will stay in storage for a while if someone wants to know more...
As GRUB refuses to boot off hd0, and we are in a RAID setup, I decide to
just plug in the second harddisk as the first. Booting then proceeds.
The machine is now running on a dead mirror.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Try to boot from the original 'first' harddisk
Actual Results: See description
Expected Results: See description
As the reporter, I suggest closing this bug; it's too old now. Worse, the
problem *might* be related to the harddisk in use at the time because it had a
*really* bad case of 'badblock' and was exchanged by the manufacturer.
Sorry for letting this stew.
Red Hat apologizes that these issues have not been resolved yet. We do want to
make sure that no important bugs slip through the cracks.
Red Hat Linux 7.3 and Red Hat Linux 9 are no longer supported by Red Hat, Inc.
They are maintained by the Fedora Legacy project (http://www.fedoralegacy.org/)
for security updates only. If this is a security issue, please reassign to the
'Fedora Legacy' product in bugzilla. Please note that Legacy security update
support for these products will stop on December 31st, 2006.
If this is not a security issue, please check if this issue is still present
in a current Fedora Core release. If so, please change the product and version
to match, and check the box indicating that the requested information has been
provided.
If you are currently still running Red Hat Linux 7.3 or 9, please note that
Fedora Legacy security update support for these products will stop on December
31st, 2006. You are strongly advised to upgrade to a current Fedora Core release
or Red Hat Enterprise Linux or comparable. Some information on which option may
be right for you is available at http://www.redhat.com/rhel/migrate/redhatlinux/.
Any bug still open against Red Hat Linux 7.3 or 9 at the end of 2006 will be
closed 'CANTFIX'. Again, if this bug still exists in a current release, or is a
security issue, please change the product as necessary. We thank you for your
help, and apologize again that we haven't handled these issues to this point.
Closed as NOTABUG, see note of 2005-11-16 12:00 EST