Bug 1016750
Summary: | Server would no longer boot, dracut can not find volume groups
---|---
Product: | [Fedora] Fedora
Component: | lvm2
Version: | 18
Hardware: | x86_64
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | urgent
Priority: | unspecified
Reporter: | Michael Mussulis <michael>
Assignee: | Peter Rajnoha <prajnoha>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | agk, bmarzins, bmr, dracut-maint, dwysocha, harald, heinzm, husung, jonathan, jtt77777, lvm-team, michael, msnitzer, prajnoha, prockai, zkabelac
Doc Type: | Bug Fix
Type: | Bug
Last Closed: | 2014-02-05 23:14:43 UTC

The boot problem described by Michael Mussulis also occurs for me with kernel-3.10.14-100.fc18.i686. Kernel 3.10.12-100.fc18.i686 was the last one that booted without any problems. The problem occurs only on my systems with software RAID.

sosreport.txt:
:
[ 1.838322] localhost kernel: scsi2 : ioc0: LSI53C1030 B2, FwRev=01032571h, Ports=1, MaxQ=222, IRQ=18
[ 3.184348] localhost kernel: scsi 2:0:0:0: Direct-Access COMPAQ BD07285A25 HPB4 PQ: 0 ANSI: 3
[ 3.184367] localhost kernel: scsi target2:0:0: Beginning Domain Validation
[ 3.198503] localhost kernel: scsi target2:0:0: Ending Domain Validation
[ 3.198568] localhost kernel: scsi target2:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
[ 3.202057] localhost kernel: scsi 2:0:1:0: Direct-Access COMPAQ BD07286224 HPB6 PQ: 0 ANSI: 3
[ 3.202070] localhost kernel: scsi target2:0:1: Beginning Domain Validation
[ 3.224502] localhost kernel: scsi target2:0:1: Ending Domain Validation
[ 3.224565] localhost kernel: scsi target2:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI PCOMP (6.25 ns, offset 127)
[ 4.476446] localhost kernel: scsi 2:0:8:0: Processor SDR GEM318 0 PQ: 0 ANSI: 2
[ 4.476461] localhost kernel: scsi target2:0:8: Beginning Domain Validation
[ 4.477711] localhost kernel: scsi target2:0:8: Ending Domain Validation
[ 4.477775] localhost kernel: scsi target2:0:8: asynchronous
[ 6.230915] localhost kernel: sd 2:0:0:0: Attached scsi generic sg1 type 0
[ 6.231311] localhost kernel: sd 2:0:0:0: [sda] 142264000 512-byte logical blocks: (72.8 GB/67.8 GiB)
[ 6.231659] localhost kernel: sd 2:0:1:0: Attached scsi generic sg2 type 0
[ 6.232164] localhost kernel: sd 2:0:1:0: [sdb] 142264000 512-byte logical blocks: (72.8 GB/67.8 GiB)
[ 6.232603] localhost kernel: sd 2:0:0:0: [sda] Write Protect is off
[ 6.232612] localhost kernel: sd 2:0:0:0: [sda] Mode Sense: d3 00 10 08
[ 6.232636] localhost kernel: scsi 2:0:8:0: Attached scsi generic sg3 type 3
[ 6.234148] localhost kernel: sd 2:0:1:0: [sdb] Write Protect is off
[ 6.234155] localhost kernel: sd 2:0:1:0: [sdb] Mode Sense: cf 00 10 08
[ 6.234206] localhost kernel: sd 2:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 6.235430] localhost kernel: sd 2:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 6.244123] localhost kernel: sdb: sdb1 sdb2 sdb3
[ 6.247466] localhost kernel: sda: sda1 sda2 sda3
[ 6.248936] localhost kernel: sd 2:0:1:0: [sdb] Attached SCSI disk
[ 6.251679] localhost kernel: sd 2:0:0:0: [sda] Attached SCSI disk
[ 6.481820] localhost kernel: md: bind<sda1>
[ 6.559244] localhost kernel: md: bind<sdb3>
[ 6.573242] localhost kernel: md: bind<sdb2>
[ 6.577967] localhost kernel: md: bind<sda3>
[ 6.586118] localhost kernel: md: raid1 personality registered for level 1
[ 6.586823] localhost kernel: md/raid1:md126: active with 2 out of 2 mirrors
[ 6.586862] localhost kernel: md126: detected capacity change from 0 to 67134619648
[ 6.589022] localhost kernel: RAID1 conf printout:
[ 6.589031] localhost kernel: --- wd:2 rd:2
[ 6.589036] localhost kernel: disk 0, wo:0, o:1, dev:sdb3
[ 6.589041] localhost kernel: disk 1, wo:0, o:1, dev:sda3
[ 6.589470] localhost kernel: md126: unknown partition table
[ 6.599233] localhost kernel: md: bind<sdb1>
[ 6.602384] localhost kernel: md/raid1:md127: active with 2 out of 2 mirrors
[ 6.602423] localhost kernel: md127: detected capacity change from 0 to 592117760
[ 6.602611] localhost kernel: RAID1 conf printout:
[ 6.602617] localhost kernel: --- wd:2 rd:2
[ 6.602623] localhost kernel: disk 0, wo:0, o:1, dev:sdb1
[ 6.602627] localhost kernel: disk 1, wo:0, o:1, dev:sda1
[ 6.608480] localhost kernel: md: bind<sda2>
[ 6.611333] localhost kernel: md/raid1:md125: active with 2 out of 2 mirrors
[ 6.611374] localhost kernel: md125: detected capacity change from 0 to 2212495360
[ 6.611554] localhost kernel: RAID1 conf printout:
[ 6.611560] localhost kernel: --- wd:2 rd:2
[ 6.611565] localhost kernel: disk 0, wo:0, o:1, dev:sdb2
[ 6.611570] localhost kernel: disk 1, wo:0, o:1, dev:sda2
[ 6.617519] localhost kernel: md127: unknown partition table
[ 6.620090] localhost kernel: md125: unknown partition table
[ 7.118725] localhost systemd[1]: Started Show Plymouth Boot Screen.
[ 7.119897] localhost systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[ 7.121080] localhost systemd[1]: Starting Paths.
[ 7.121881] localhost systemd[1]: Reached target Paths.
[ 7.122578] localhost systemd[1]: Starting Forward Password Requests to Plymouth Directory Watch.
[ 7.123255] localhost systemd[1]: Started Forward Password Requests to Plymouth Directory Watch.
[ 7.123952] localhost systemd[1]: Starting Basic System.
[ 7.124701] localhost systemd[1]: Reached target Basic System.
[ 192.250278] localhost dracut-initqueue[152]: Warning: Could not boot.
[ 192.256402] localhost dracut-initqueue[152]: Warning: /dev/md1 does not exist
[ 192.263734] localhost systemd[1]: Starting Setup Virtual Console...
[ 192.269814] localhost systemd[1]: Started Setup Virtual Console.
[ 192.271125] localhost systemd[1]: Starting Dracut Emergency Shell...
- end of file -

Any ideas?

In my case /etc/mdadm.conf was missing from the newly created initial ramdisk. With a fixed initial ramdisk (only mdadm.conf copied in), kernel 3.10.14-100.fc18.i686 now boots fine on my systems.

(In reply to Michael Mussulis from comment #0)
> Created attachment 809371 [details]
> System report
>
> Description of problem: After a power cut, the server would not boot
> anymore. After "Reached target System Initialization" it would not display
> anything for a while, then throws a message about not being able to boot,
> drops into dracut shell and advises to send you the sosreport.txt. We
> initially assumed the disks became corrupt but eventually after many tries,
> we discovered it was the latest kernel at fault.
> Several days ago I
> installed kernel-3.10.13-101.fc18.x86_64 but never rebooted. Today, the
> power loss forced us to reboot the server and this problem showed up.
> Booting with kernel-3.10.12-100.fc18.x86_64 works just fine.
>
> Version-Release number of selected component (if applicable):
>
> How reproducible:
>
> Steps to Reproduce:
> 1. Update with yum to latest kernel
> 2. Make sure you have /dev/root as an LVM in raid (hardware)
> 3.
>
> Actual results:
> [ 123.784336] localhost dracut-initqueue[163]: Scanning devices for LVM logical volumes fedora_xvdev/swap fedora_xvdev/root
> [ 123.787181] localhost dracut-initqueue[163]: No volume groups found
> [ 123.789270] localhost dracut-initqueue[163]: PARTIAL MODE. Incomplete logical volumes will be processed.
> [ 123.790497] localhost dracut-initqueue[163]: Volume group "fedora_xvdev" not found
> [ 123.790669] localhost dracut-initqueue[163]: Skipping volume group fedora_xvdev
> [ 183.950324] localhost dracut-initqueue[163]: Warning: Could not boot.
> [ 183.965223] localhost dracut-initqueue[163]: Warning: /dev/fedora_xvdev/root does not exist
> [ 183.965476] localhost dracut-initqueue[163]: Warning: /dev/fedora_xvdev/swap does not exist
> [ 183.965695] localhost dracut-initqueue[163]: Warning: /dev/mapper/fedora_xvdev-root does not exist
>
> Expected results:
> Boot as normal.
>
> Additional info:
> It would seem the latest kernel has a problem with /dev/root + LVM + RAID.
> We are using an HP SmartArray P400 with 4 SAS 146Gb drives in RAID 1. See
> attached sosreport.txt.

In raid 1?

On your kernel command line, you have turned _off_ raid with "rd.md=0 rd.dm=0":

BOOT_IMAGE=/vmlinuz-3.10.13-101.fc18.x86_64 root=/dev/mapper/fedora_xvdev-root ro rd.lvm.lv=fedora_xvdev/swap rd.md=0 rd.dm=0 rd.lvm.lv=fedora_xvdev/root rd.luks=0 vconsole.keymap=us rhgb quiet biosdevname=0 LANG=en_US.UTF-8

blkid only sees these partitions:

blkid
/dev/cciss/c0d0: PTTYPE="dos"
/dev/cciss/c0d0p1: UUID="b7722444-956b-4b7f-96c1-5664e756913b" TYPE="ext4"
/dev/cciss/c0d0p2: UUID="ly36me-EIYq-AYPx-lee0-Tetd-6MtF-xeWRCm" TYPE="LVM2_member"
/dev/cciss/c0d1: UUID="47a37751-7231-470f-b609-84573b53c4aa" TYPE="ext4"

and only /dev/cciss/c0d0p2 has an LVM member.

+ lvm pvdisplay
--- Physical volume ---
PV Name /dev/cciss/c0d0p2
VG Name fedora_xvdev

+ lvm lvdisplay
--- Logical volume ---
LV Path /dev/fedora_xvdev/root
LV Name root
VG Name fedora_xvdev
LV UUID qE45lq-QF5C-VLfs-lwfu-AIq5-sdZo-TB7DMZ
LV Write Access read/write
LV Creation host, time localhost, 2013-03-07 12:23:51 +0000
LV Status NOT available
LV Size 50.00 GiB
Current LE 12800
Segments 1
Allocation inherit
Read ahead sectors auto

LV Status: NOT available...

Does it work if you replace "rd.lvm.lv=fedora_xvdev/swap rd.lvm.lv=fedora_xvdev/root" with "rd.lvm.vg=fedora_xvdev"?

Hi Harald,

I will schedule some tests a little later on, when the server is less busy.

Thanks,
Michael.

Michael, are you still hitting this problem?

Hi,

Sorry, we've been so busy we've really not had time to test with the above suggestions. I've purchased a proper server, an ML370 G5 tower, and have migrated everything across, so I will be able to do the tests without affecting our day-to-day operations. I hope to have a few minutes tomorrow to look at this and report back.

Cheers,
Michael.

Hi All,

I've also run into this problem on an older HP DL360 G5 with the embedded P400i controller.

I tried the suggestion above of changing rd.lvm.lv to the volume-group-only rd.lvm.vg, with no success. Comparing the output of lsinitrd with the previous initramfs*, I noticed that /etc/lvm/lvm.conf wasn't included in the 3.11.10, so I tried regenerating the initramfs with

dracut --force --lvmconf

and it now finds the logical volumes on boot.

-John

(In reply to John Taylor from comment #9)
> Hi All,
>
> I've also run into this problem on an older hp dl360 g5 with the embedded
> p400i controller.
>
> I tried the suggestion above of changing the rd.lvm.lv to the volume group
> only rd.lvm.vg with no success. I noticed in comparing the output of
> lsinitrd with the previous initramfs* that /etc/lvm/lvm.conf wasn't included
> in the 3.11.10, so I tried regenerating the initramfs with
> dracut --force --lvmconf
>
> and it now finds the logical volumes on boot.
>

Can you attach your lvm.conf here? I'd like to see how it differs from the defaults. Thanks.

Created attachment 838981 [details]
lvm.conf
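
For anyone landing here with the same symptom: two comments in this thread traced the failure to a config file (/etc/mdadm.conf in one case, /etc/lvm/lvm.conf in the other) missing from the regenerated initramfs. The following is only a rough sketch of how that check could be scripted, not a tested fix. The helper name `missing_configs` is made up for illustration, and it assumes the listing printed by `lsinitrd` shows one file per line with the path (without a leading slash) as the last field.

```shell
#!/bin/sh
# Hypothetical helper: read an initramfs file listing on stdin (for example
# the output of `lsinitrd <image>`) and print every path given as an
# argument that does NOT appear in the listing, one per line.
missing_configs() {
    listing=$(cat)
    for path in "$@"; do
        # Compare the last field of each listing line with the wanted path,
        # tolerating an optional leading "/" or "./" in the listing.
        if ! printf '%s\n' "$listing" | awk -v p="$path" '
                { f = $NF; sub(/^\.?\//, "", f); if (f == p) found = 1 }
                END { exit !found }'; then
            printf '%s\n' "$path"
        fi
    done
}

# Example usage on a real system (needs root; the image path is an
# assumption, and kver should be set to the kernel that fails to boot):
#   kver=$(uname -r)
#   img=/boot/initramfs-$kver.img
#   lsinitrd "$img" | missing_configs etc/lvm/lvm.conf etc/mdadm.conf
#   # If something is reported missing, regenerate with the local configs:
#   dracut --force --lvmconf --mdadmconf "$img" "$kver"
```

Running `lsinitrd` against both the working and the failing image and comparing what `missing_configs` reports is essentially the comparison described in comment #9. Whether including the local config is the right fix on a given machine is a separate question; it worked for the two reporters above.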
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Created attachment 809371 [details]
System report

Description of problem:
After a power cut, the server would not boot anymore. After "Reached target System Initialization" it would not display anything for a while, then throws a message about not being able to boot, drops into dracut shell and advises to send you the sosreport.txt. We initially assumed the disks became corrupt but eventually, after many tries, we discovered it was the latest kernel at fault. Several days ago I installed kernel-3.10.13-101.fc18.x86_64 but never rebooted. Today, the power loss forced us to reboot the server and this problem showed up. Booting with kernel-3.10.12-100.fc18.x86_64 works just fine.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Update with yum to latest kernel
2. Make sure you have /dev/root as an LVM in raid (hardware)
3.

Actual results:
[ 123.784336] localhost dracut-initqueue[163]: Scanning devices for LVM logical volumes fedora_xvdev/swap fedora_xvdev/root
[ 123.787181] localhost dracut-initqueue[163]: No volume groups found
[ 123.789270] localhost dracut-initqueue[163]: PARTIAL MODE. Incomplete logical volumes will be processed.
[ 123.790497] localhost dracut-initqueue[163]: Volume group "fedora_xvdev" not found
[ 123.790669] localhost dracut-initqueue[163]: Skipping volume group fedora_xvdev
[ 183.950324] localhost dracut-initqueue[163]: Warning: Could not boot.
[ 183.965223] localhost dracut-initqueue[163]: Warning: /dev/fedora_xvdev/root does not exist
[ 183.965476] localhost dracut-initqueue[163]: Warning: /dev/fedora_xvdev/swap does not exist
[ 183.965695] localhost dracut-initqueue[163]: Warning: /dev/mapper/fedora_xvdev-root does not exist

Expected results:
Boot as normal.

Additional info:
It would seem the latest kernel has a problem with /dev/root + LVM + RAID. We are using an HP SmartArray P400 with 4 SAS 146Gb drives in RAID 1. See attached sosreport.txt.
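
When triaging a report like this one, it can help to pull the dracut `rd.*` options out of the kernel command line before reading the rest of the sosreport, since the command line quoted in this bug carried `rd.md=0 rd.dm=0`. The sketch below is only an illustration: `rd_options` and `is_disabled` are made-up helper names, and whether a disabled option matters depends on the setup (with a hardware RAID controller such as the one reported here, `rd.md=0` alone is not necessarily a problem).

```shell
#!/bin/sh
# Hypothetical helper: print every "rd.*" option found in a kernel command
# line (passed as one string), one option per line, in original order.
rd_options() {
    printf '%s\n' "$1" | tr -s ' \t' '\n' | grep '^rd\.'
}

# Hypothetical helper: succeed if the command line in $1 explicitly sets
# the option named in $2 to 0 (for example "rd.md").
is_disabled() {
    rd_options "$1" | grep -q -x -F "$2=0"
}

# Example usage on a running system:
#   cmdline=$(cat /proc/cmdline)
#   rd_options "$cmdline"
#   is_disabled "$cmdline" rd.md && echo "MD RAID assembly is turned off in the initramfs"
```

The helpers only inspect text, so they can also be run against a command line copied out of someone else's sosreport.txt.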