Red Hat Bugzilla – Bug 142737
lvm2-related boot failure
Last modified: 2007-11-30 17:10:56 EST
Description of problem:
System won't boot
Version-Release number of selected component (if applicable):
How reproducible:
Probably difficult, but easy on my machine. :)
Steps to Reproduce:
1. Shut down FC3 without sync'ing disks
2. Try to boot
3. It doesn't.
Actual results:
System won't boot
Expected results:
System should boot.
Additional info:
I have an FC3 system, that was happy, but is now unhappy. This may be
related to someone, who shall remain nameless, having shut off the power
on it without doing an orderly shutdown. Then again, maybe it was
because of a "yum -y update", because I put off rebooting for a while.
Anyway, now when it tries to boot, I see:
Red Hat nash version 4.1.18 starting
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
2 logical volume(s) in volume group "VolGroup00" now active
...and that's it. I've left it there for over an hour, and it never
gets past that.
I booted off of an FC3 rescue cd, and found that I could mount the /boot
partition, but I cannot mount the / partition. I ran various lvm
commands that identified two lvm volumes on the system.
fsck'ing /dev/hda2 (which is /) is getting me nowhere though - it just
says "invalid argument".
I tried firing up device mapper and udev in order to get
a /dev/VolGroup00 directory, but it just wouldn't do it - at least, not
with the things I tried. I could mkdir the directory, but then "lvm
vgmknodes" would remove it.
What do I need to do to get past this? There's stuff in the filesystem
I want quite a bit. :-S
I tried all 3 FC3 kernels I have on the system, but none would come up,
getting stuck at that same point.
When I boot up into
the rescue CD and let it try to find my fedora install, it gets really
confused. More specifically, it says:
Searching for Fedora Core installations...
0% install exited abnormally -- received signal 15
kernel panic - not syncing: Out of memory and no killable processes
If I remove "quiet" and add "single" to my boot options, I get:
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: dm-0: orphan cleanup on readonly fs
...and there it hangs.
Also, I ran memtest86 on the box for a while (a little over an hour),
and found no errors.
What finally fixed it was the following, done from FC3's rescue disk:
1) Do start up the network interfaces
2) Don't try to automatically mount the filesystems - not even readonly
3) lvm vgchange --ignorelockingfailure -P -a y
4) fdisk -l, and guess which partition is which based on size: the
small one was /boot, and the large one was /
5) mkdir /mnt/boot
6) mount /dev/hda1 /mnt/boot
7) Look up the device node for the root filesystem in
8) A first tentative step, to see if things are working: fsck -n
9) Dive in: fsck -f -y /dev/VolGroup00/LogVol00
10) Wait a while... Be patient. Don't interrupt it.
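
For anyone wanting to repeat this, the steps above can be roughly sketched as a script. This is only a sketch: the helper names (run, recover_root_lv) and the DRY_RUN switch are invented for illustration, the device paths are passed in rather than guessed, and the "fsck -n" target is assumed to be the same logical volume as in step 9. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the rescue-disk steps above. Assumptions: the helper names
# (run, recover_root_lv) and the DRY_RUN switch are made up for illustration.
# With DRY_RUN=1 (the default) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_root_lv() {
    boot_part="$1"   # small partition found via fdisk -l, e.g. /dev/hda1
    root_lv="$2"     # logical volume holding /, e.g. /dev/VolGroup00/LogVol00

    # Step 3: activate the volume group, ignoring locking failures.
    run lvm vgchange --ignorelockingfailure -P -a y || return 1
    # Step 4: list partitions to tell /boot from the LVM partition by size.
    run fdisk -l
    # Steps 5-6: mount /boot only; the root filesystem stays unmounted.
    run mkdir -p /mnt/boot || return 1
    run mount "$boot_part" /mnt/boot || return 1
    # Step 8: read-only check first. Assumption: same device as step 9.
    run fsck -n "$root_lv"
    # Step 9: forced repair, answering yes to every prompt. Do not interrupt.
    run fsck -f -y "$root_lv"
}
```

Calling "recover_root_lv /dev/hda1 /dev/VolGroup00/LogVol00" with DRY_RUN left at 1 prints the command sequence for review; setting DRY_RUN=0 would actually run it, which only makes sense as root from the rescue environment with the root filesystem unmounted.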
So you now think fsck was hanging?
I can't do much more than guess, since there's no strace on the FC3
recovery cd image.
However, the fact that LVM2 came right up using the steps above, and
the problem was corrected by an fsck, does seem to suggest an ext3
problem.
It's worth noting that LVM2 obfuscated the filesystem in such a way
that the usual ext2 recovery tools were confused.
> fsck'ing /dev/hda2 (which is /) is getting me nowhere though
As you discovered, you needed to run fsck on the logical volume, not
the raw device. This probably needs documenting somewhere - but I'm
not sure where.
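
As a possible starting point for such documentation, here is a small sketch of the rule. The function name fsck_target_hint is hypothetical, and it assumes the caller already has a type string for the partition; current util-linux blkid reports LVM2_member for an LVM2 physical volume, but older tools (including whatever ships on the FC3 rescue CD) may report something else or nothing at all.

```shell
#!/bin/sh
# Hypothetical helper: given a device and the type string reported for it,
# say whether fsck should be pointed at the device itself.
# Assumption: LVM2 physical volumes are reported as "LVM2_member".
fsck_target_hint() {
    dev="$1"
    fstype="$2"
    case "$fstype" in
        LVM2_member)
            echo "$dev: LVM2 physical volume, fsck the logical volume instead" ;;
        ext2|ext3)
            echo "$dev: $fstype filesystem, fsck $dev directly" ;;
        *)
            echo "$dev: type '${fstype:-unknown}', check before running fsck" ;;
    esac
}
```

One way to feed it would be: fsck_target_hint /dev/hda2 "$(blkid -o value -s TYPE /dev/hda2)". For the partition in this report the answer should be the first case, which matches what was found: fsck has to be run on /dev/VolGroup00/LogVol00, not on /dev/hda2.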
I've tried various things with recent CD images and I can't reproduce
this problem: the automatic recovery/rescue mode works fine for me, so
I'm going to assume the cause of this has since been fixed.