| Summary: | kernel-2.6.40-4.fc15.x86_64 fails to boot due to LVM PV on RAID not starting | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Doug Ledford <dledford> |
| Component: | dracut | Assignee: | dracut-maint |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 15 | CC: | agk, dev, dledford, harald, ivor.durham, jonathan, kay, mbroz, michael.wuersch, msmsms10079, rhbugzilla |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 729205 | Environment: | |
| Last Closed: | 2012-01-23 09:26:20 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Doug Ledford
2011-09-07 14:59:55 UTC
After my ill-fated decision to install FC15 last Sunday because of growing flakiness in my FC14 installation, I have been dead in the water (except for repeated attempts to re-build the system, only to have it die after some later update), first with #729205 and now this bug. I re-installed from the Live DVD, with a "Use all" default repartition, and rebooted after the basic installation completed. I executed the yum command to install mdadm-3.2.2-9.fc15, then ran "yum update kernel*" rather than the full yum update, and then re-built the initramfs as instructed. I got a slew of warnings about potentially missing firmware, but I don't know if they are expected or relevant. I rebooted successfully, getting past #729205. Then I did the full "yum update", which also completed successfully. Just to be sure everything was OK before restoring my "stuff", I rebooted again, and now it dies with this problem. I confirmed mdadm-3.2.2-9.fc15 was still installed after the full update and before rebooting this last time, so it looks like something modified during the final "yum update" may have introduced this problem. dmesg shows the exact sequence of messages as above: assembling md127, then md126, and then getting "No root device "block:/dev/mapper/vg_clowder-lv_root" found", where clowder is the assigned hostname of my system. I don't have a way to capture the console log unless it can be written to a USB flash drive, but if there's any other information which would help get me past this showstopper I'll get it by hook or by crook as quickly as possible. The system is a Dell Dimension E520 with integrated Intel Matrix Storage Manager, through which two identical disks were configured for RAID 1.

Ivor: can you boot up the live CD (or boot from a backup kernel command), install the dracut package on the live CD, rebuild your initramfs for the kernel that doesn't boot using the dracut off of the live CD, then see if it boots up properly? I'm beginning to suspect that there was a dracut upgrade in that final yum update that might be playing a role here, and downgrading to the older dracut and rebuilding your initramfs might work around it.

Doug, I am able to boot after re-building the initramfs with the Live DVD dracut. Here are the steps I took:

1. Booted from the Live DVD and created directories /tmp/a and /tmp/b.
2. Mounted /dev/md126p1 as /tmp/a to get to the /boot partition.
3. Mounted /dev/dm-4 as /tmp/b to get to the / partition.
4. cd /lib/modules (in the Live system).
5. (cd /tmp/b/lib/modules; tar cf - 2.6.40.4-5.fc15.x86_64) | tar xf -
   (Without copying the files over I got errors during the boot about being unable to find modules, and it crashed with this bug again.)
6. cd /tmp/a
7. dracut initramfs-2.6.40.4-5.fc15.x86_64.img 2.6.40.4-5.fc15.x86_64 --force
   The dracut package was already available on the Live DVD system. I got the slew of warnings about not finding firmware ".bin" files here as it built the new initramfs. "ls -l initramfs-2.6.40.4-5.fc15.x86_64.img" reports:
   -rw-r--r--. 1 root root 14932953 Sep 9 2011 initramfs-2.6.40.4-5.fc15.x86_64.img
8. Rebooted successfully! uname -a reports:
   Linux clowder 2.6.40.4-5.fc15.x86_64 #1 SMP Tue Aug 30 14:38:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

"rpm -qa | fgrep dracut" reports dracut-009-12.fc15.noarch. This is the version from my full "yum update", which was installed between my previous successful boot with the updated mdadm and the crash with this bug. I hope this is the correct sequence of steps you had in mind. It at least got me over a big hurdle! Thank you.

Ivor: Thanks very much!
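Condensed into a single shell session, Ivor's recovery procedure above amounts to roughly the following (a sketch, not verbatim from the report; the device names /dev/md126p1 and /dev/dm-4 and the kernel version are specific to his machine and would need adjusting elsewhere):

```sh
# Run as root from the live DVD environment.
mkdir /tmp/a /tmp/b
mount /dev/md126p1 /tmp/a    # the /boot partition
mount /dev/dm-4 /tmp/b       # the root logical volume

# Copy the target kernel's modules into the live system so dracut
# can find them when it assembles the image (step 5 above).
(cd /tmp/b/lib/modules && tar cf - 2.6.40.4-5.fc15.x86_64) | \
    (cd /lib/modules && tar xf -)

# Rebuild the initramfs directly onto the mounted /boot.
cd /tmp/a
dracut --force initramfs-2.6.40.4-5.fc15.x86_64.img 2.6.40.4-5.fc15.x86_64
```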
Now that you are up and running, you can copy your initramfs to a backup file name (such as initramfs-2.6.40.4-5.fc15.x86_64.img.good) and add a new entry to /etc/grub.conf that lists this backup file. Then, on the running system, remake your original initramfs using the dracut on the system and attempt to reboot into the new initramfs (with the old initramfs as a backup). If it doesn't boot, go back to using the backup initramfs; if it does boot, then something odd is happening that causes dracut to build a bad initramfs only on upgrade rather than all the time. Regardless, I'm pretty sure this is a dracut issue at this point, so I'm going to reassign the bug. However, I *think* the dracut owner is out traveling at the moment, so I don't know how quickly this will get looked at.

I finally got back to this and started off by booting 2.6.40-4.fc15.x86_64 with the initramfs I made as described in the bug from which this one was cloned. I was rather surprised to see the system boot successfully, since I hadn't changed anything. I rebooted a few times and noticed that when the Intel firmware reported the RAID volume status as Normal, the system would boot, and when it reported the status as Verify, it would not. I'm currently running 2.6.40-5.fc15.x86_64. It needed no modifications to boot with a Normal volume, though it seems to have the same trouble with a RAID volume in Verify status. Rodney

Sorry, I should've said 2.6.40.4-5.fc15.x86_64 for the version of the kernel I'm now running. Rodney

Hmmm, OK, so when the array is in state VERIFY it doesn't boot, and when it's in a clean state, it does. Can you boot the machine up with the array in VERIFY state, wait until dracut drops you to a debug shell, then get me the output of mdadm -E /dev/sda (assuming your array is on /dev/sda; if not, any one of the disks that makes up your array)? I need to see why mdadm is failing to start your array when it's in VERIFY state.
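At the dracut emergency shell, gathering that information amounts to roughly the following (a sketch under the same assumptions Doug states; /dev/sdc1 is a hypothetical USB device name):

```sh
# In the dracut debug shell: dump the IMSM metadata from a member
# disk and show the kernel's view of the partially assembled arrays.
mdadm -E /dev/sda
cat /proc/mdstat

# If the output can't be copied by hand, mount a USB stick and save
# it to a file there (/dev/sdc1 is a hypothetical device name).
mkdir -p /mnt/usb
mount /dev/sdc1 /mnt/usb
mdadm -E /dev/sda > /mnt/usb/mdadm-E-sda.txt
```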
I get approximately the following from /sbin/mdadm -E /dev/sda:

```
/dev/sda:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.1.00
    Orig Family : 00000000
         Family : 47b4aff4
     Generation : 00041d10
     Attributes : All supported
           UUID : ...:...:...:...
       Checksum : 905161a2 correct
    MPB Sectors : 2
          Disks : 2
   RAID Devices : 1

  Disk00 Serial : WD-...
          State : active
             Id : 00000000
    Usable Size : ... 250.06 GB)

[Volume_0000]:
           UUID : ...:...:...:...
     RAID Level : 1 <-- 1
        Members : 2 <-- 2
          Slots : [UU] <-- [UU]
    Failed disk : none
      This Slot : 0
     Array Size : ... 250.06 GB)
   Per Dev Size : ... 250.06 GB)
  Sector Offset : 0
    Num Stripes : 1907704
     Chunk Size : 64 KiB <-- 64 KiB
  Migrate State : repair
      Map State : normal <-- normal
     Checkpoint : 0 (512)
    Dirty State : dirty

  Disk01 Serial : WD-...
          State : active
             Id : 00010000
    Usable Size : ... 250.06 GB)
```
Rodney
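The "Migrate State : repair" and "Dirty State : dirty" fields above are consistent with a verify/repair pass still being in flight when the metadata was read. As a hedged sketch, checking for (or deliberately reproducing) that condition before a reboot would use the standard md sysfs interface, roughly as follows (md126 is the data volume on this particular machine):

```sh
# See whether any md array is currently mid-resync/verify.
cat /proc/mdstat

# Kick off a verify ("check") pass on the RAID1 volume to put the
# array back into the state that fails to boot, then watch progress.
echo check > /sys/block/md126/md/sync_action
watch -n 5 cat /proc/mdstat
```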
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

I haven't followed the other clones of the original bug (729205) very carefully, but I noticed that mdadm-3.2.2-10.fc15 was indicated as a fix for one of those other clones, so I tried it out with a rebuilt initramfs. My system will now boot even when the Intel firmware reports the RAID volume status as Verify. Thanks to all who worked on this. Rodney
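For anyone landing on this bug with the same symptom, the eventual fix condenses to roughly the following (a sketch; the package version is the one reported here):

```sh
# Pull in the fixed mdadm (mdadm-3.2.2-10.fc15 at the time of this
# report) and rebuild the initramfs so the fix takes effect at boot.
yum update mdadm
dracut --force /boot/initramfs-$(uname -r).img $(uname -r)
```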