Bug 998709 - No-boot regression F18->F19
Status: CLOSED EOL
Product: Fedora
Classification: Fedora
Component: dracut
Version: 19
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Assigned To: dracut-maint
QA Contact: Fedora Extras Quality Assurance
Keywords: Regression
Reported: 2013-08-19 16:22 EDT by Jan Kratochvil
Modified: 2015-02-18 09:05 EST (History)
Doc Type: Bug Fix
Last Closed: 2015-02-18 09:05:32 EST
Type: Bug


Attachments
sosreport.txt.gz (143.84 KB, application/octet-stream)
2013-08-20 03:35 EDT, Jan Kratochvil
debug.txt.gz (312.61 KB, application/octet-stream)
2013-08-20 04:59 EDT, Jan Kratochvil
sosreport2.txt.gz with swap mountpoint swap (it was none before) (143.42 KB, application/octet-stream)
2013-08-20 07:53 EDT, Jan Kratochvil
debug2.txt.gz (312.38 KB, application/octet-stream)
2013-08-20 08:34 EDT, Jan Kratochvil

Description Jan Kratochvil 2013-08-19 16:22:48 EDT
Description of problem:
I cannot boot my server with the .fc19 kernel when its initramfs is built with F-19 dracut.
Using:
/boot = RAID1 on /dev/sd{a,b,c}, not encrypted
/     = RAID5 on /dev/sd{a,b,c}, LUKS encrypted
I cannot supply a reproducer with qcow2 images, as Anaconda just locks up when trying to set up even such a simple disk layout.  (I have not filed that against Anaconda.)

Version-Release number of selected component (if applicable):
PASS:   dracut-024-25.git20130205.fc18.x86_64
PASS:   dracut-025-1.fc19.x86_64
noboot: dracut-026-1.fc19.x86_64
 - Kernel locks up before it prompts for LUKS password.
PASS:   dracut-026-54.git20130316.fc19.x86_64
PASS:   dracut-026-56.git20130318.fc20.x86_64
FAIL:   dracut-026-62.git20130319.fc19.x86_64
FAIL:   dracut-026-72.git20130320.fc19.x86_64
FAIL:   dracut-027-1.fc19.x86_64
FAIL:   dracut-031-29.git20130812.fc20.x86_64

How reproducible:
Always.

Steps to Reproduce:
rpm -U --oldpackage --nodeps dracut-026-62.git20130319.fc19.x86_64.rpm
dracut -f --kver 3.10.7-200.fc19.x86_64
sync;hdparm -f /dev/sda;hdparm -f /dev/sdb;hdparm -f /dev/sdc;qemu-kvm -snapshot -hda /dev/sda -hdb /dev/sdb -hdc /dev/sdc -serial stdio -net none -m 1024

Actual results:
No boot in KVM.

Expected results:
Boot in KVM.

Additional info:
PASS:   dracut-026-56.git20130318.fc20.x86_64
->
FAIL:   dracut-026-62.git20130319.fc19.x86_64
=
* Tue Mar 19 2013 Harald Hoyer <harald@redhat.com> 026-62.git20130319
- fix dracut service ordering
Resolves: rhbz#922991
* Mon Mar 18 2013 Harald Hoyer <harald@redhat.com> 026-56.git20130318

Please enter passphrase for disk host1root!:****
[    6.967881] bio: create slab <bio-1> at 1
[    7.088740] bio: create slab <bio-1> at 1
[  OK  ] Found device /dev/mapper/host1root.
[  OK  ] Found device /dev/disk/by-uuid/393b7ea1-33fa-4d85-a05b-9ade4f2b8798.
[  OK  ] Started Cryptography Setup for host1root.

<long hang - this is the bug, boot fails here>

[   90.689386] systemd[1]: Job dev-mapper-luks\x2d28e0d1d7\x2dc2b6\x2d4c67\x2d8b32\x2d2efa2824011c.device/start timed out.
[   90.693896] systemd[1]: Starting Emergency Shell...
[   90.697178] systemd[1]: Stopped dracut initqueue hook.
[   90.699108] systemd-journald[73]: Received SIGTERM
[   90.699533] systemd[1]: Stopped Forward Password Requests to Plymouth.
[   90.699640] systemd[1]: Stopping Forward Password Requests to Plymouth Directory Watch.
[   90.699686] systemd[1]: Stopped Forward Password Requests to Plymouth Directory Watch.
[   90.699699] systemd[1]: Stopping Show Plymouth Boot Screen...
[   90.699787] systemd[1]: Stopped Show Plymouth Boot Screen.
[   90.699874] systemd[1]: Stopping udev Coldplug all Devices...
[   90.699965] systemd[1]: Stopped udev Coldplug all Devices.
[   90.700043] systemd[1]: Stopping dracut pre-trigger hook...
[   90.700094] systemd[1]: Stopped dracut pre-trigger hook.
[...]
:/# ls -l /dev/mapper
total 0
crw------- 1 root 0 10, 236 Aug 19 19:55 control
lrwxrwxrwx 1 root 0       7 Aug 19 19:55 host1root -> ../dm-0
:/# mknod /dev/dm-0 b 253 0
:/# cat /dev/mapper/host1root
cat: /dev/mapper/host1root: No such device or address


menuentry 'Fedora (3.10.7-200.fc19.x86_64) 19 (Schrödinger’s Cat)' --class fedora --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-393b7ea1-33fa-4d85-a05b-9ade4f2b8798' {
        savedefault
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_gpt
        insmod part_gpt
        insmod part_gpt
        insmod diskfilter
        insmod mdraid1x
        insmod ext2
        set root='mduuid/fe2b3fb7c9a3c37c527dbaa9d3314202'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='mduuid/fe2b3fb7c9a3c37c527dbaa9d3314202'  e1a0220b-ce08-469a-8ea6-a9ded0a501cb
        else
          search --no-floppy --fs-uuid --set=root e1a0220b-ce08-469a-8ea6-a9ded0a501cb
        fi
        echo 'Loading Fedora (3.10.7-200.fc19.x86_64) 19 (Schrödinger’s Cat)'
        linux   /vmlinuz-3.10.7-200.fc19.x86_64 root=UUID=393b7ea1-33fa-4d85-a05b-9ade4f2b8798 ro rd.luks.uuid=luks-28e0d1d7-c2b6-4c67-8b32-2efa2824011c rd.luks.uuid=luks-0fc6128b-b90f-477a-825a-e51da8919949 systemd.log_level=debug LANG=en_US.UTF-8
        echo 'Loading initial ramdisk ...'
        initrd /initramfs-3.10.7-200.fc19.x86_64.img
}


md125 : active raid1 sdc2[3] sdb2[1] sda2[2]
      1023988 blocks super 1.2 [3/3] [UUU]
 = host1boot = /boot

md126 : active raid5 sdc4[3] sdb4[1] sda4[2]
      1937113088 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
 = host1root = /

md127 : active (auto-read-only) raid5 sdc3[3] sdb3[1] sda3[2]
      14286848 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
 = host1swap

28e0d1d7-c2b6-4c67-8b32-2efa2824011c = cryptsetup luksUUID /dev/md/host1root_0
0fc6128b-b90f-477a-825a-e51da8919949 = cryptsetup luksUUID /dev/md/host1swap_0
e1a0220b-ce08-469a-8ea6-a9ded0a501cb = tune2fs -l /dev/md/host1boot_0
393b7ea1-33fa-4d85-a05b-9ade4f2b8798 = tune2fs -l /dev/mapper/host1root
Comment 1 Jan Kratochvil 2013-08-19 16:34:05 EDT
The test via KVM was somehow wrong.  On the real hardware the results are:

PASS: dracut-024-25.git20130205.fc18.x86_64 (on F-19 system)
FAIL: dracut-026-56.git20130318.fc20.x86_64

A test cycle on the real hardware is rather difficult here, so I am not giving a more exact regression point.
Comment 2 Harald Hoyer 2013-08-20 02:53:29 EDT
Btw, F19 version of dracut is dracut-029-2.

http://download.fedoraproject.org/pub/fedora/linux/updates/19/x86_64/dracut-029-2.fc19.x86_64.rpm

Most helpful would be booting with "rd.debug systemd.log_level=debug" on the kernel command line and attaching /run/initramfs/sosreport.txt.

Just mount a USB stick or /boot in the dracut emergency shell and copy /run/initramfs/sosreport.txt to it.

For versions of dracut that do not create /run/initramfs/sosreport.txt,
run:

# cat /proc/mdstat >> /run/initramfs/sosreport.txt
# cat /etc/crypttab >> /run/initramfs/sosreport.txt
# dmsetup ls --tree >> /run/initramfs/sosreport.txt
# journalctl -ab -o short-monotonic >> /run/initramfs/sosreport.txt
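
The manual collection steps above can be wrapped in a small helper. This is only a sketch: the function name collect_initramfs_report, the section markers and the "(not available)" placeholder are made up for illustration and are not part of dracut; the four commands are the ones listed above.

```shell
# Hypothetical helper (not part of dracut): append the data requested above
# to a report file. Defaults to /run/initramfs/sosreport.txt.
collect_initramfs_report() {
    report="${1:-/run/initramfs/sosreport.txt}"
    {
        echo "### /proc/mdstat"
        cat /proc/mdstat 2>&1 || echo "(not available)"
        echo "### /etc/crypttab"
        cat /etc/crypttab 2>&1 || echo "(not available)"
        echo "### dmsetup ls --tree"
        dmsetup ls --tree 2>&1 || echo "(not available)"
        echo "### journalctl -ab -o short-monotonic"
        journalctl -ab -o short-monotonic 2>&1 || echo "(not available)"
    } >> "$report"
}
```

With /boot mounted in the emergency shell, the report could be written straight to it by passing a path under /boot as the first argument.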
Comment 3 Jan Kratochvil 2013-08-20 03:35:30 EDT
Created attachment 788344 [details]
sosreport.txt.gz

This is with the official F-19 version: dracut-029-2.fc19.x86_64

BTW, one cannot use a USB disk there, as the kernel module for USB disks is missing from the default set of initramfs modules.
Comment 4 Harald Hoyer 2013-08-20 04:15:22 EDT
Please add "rd.md.uuid=21be7f70-532d-d32c-1321-24ad41de0132" to your kernel command line as a workaround, until I find the bug.

Please attach the output "debug.txt" with dracut-029-2.fc19.x86_64:

$ su -
# dracut --debug test.img &>debug.txt
# rm test.img
Comment 5 Jan Kratochvil 2013-08-20 04:59:49 EDT
Created attachment 788386 [details]
debug.txt.gz

I did not test the "rd.md.uuid" workaround (I have older initramfs images).
Comment 6 Harald Hoyer 2013-08-20 05:55:03 EDT
Can you attach your /etc/fstab?
Comment 7 Harald Hoyer 2013-08-20 06:03:28 EDT
Or just change the mountpoint of your swap from "none" to "swap".

Then after regenerating your initramfs, it should be able to boot.

Meanwhile, I relaxed the check in dracut to also accept "none" as a mountpoint.
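
To see whether a system has the fstab form discussed here, one can list the swap entries whose mount point field is "none". A minimal sketch, assuming the usual whitespace-separated fstab layout; the function name swap_entries_with_none is hypothetical and not part of dracut.

```shell
# Hypothetical helper: print the device field of every non-comment fstab
# entry of type "swap" whose mount point field is "none".
swap_entries_with_none() {
    awk '$1 !~ /^#/ && $3 == "swap" && $2 == "none" { print $1 }' "${1:-/etc/fstab}"
}
```

After changing the field to "swap", the initramfs still has to be regenerated, for example with "dracut -f --kver 3.10.7-200.fc19.x86_64" as in the reproduction steps.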
Comment 8 Jan Kratochvil 2013-08-20 07:53:12 EDT
Created attachment 788459 [details]
sosreport2.txt.gz with swap mountpoint swap (it was none before)

(In reply to Harald Hoyer from comment #7)
> Or just change the mountpoint of your swap from "none" to "swap".

It did not help, sosreport2.txt attached.

/etc/fstab:

UUID=393b7ea1-33fa-4d85-a05b-9ade4f2b8798 /       ext4 defaults 1 1
LABEL=host1swap                           swap    swap defaults 0 0
/dev/md/host1boot_0                       /boot   ext4 defaults 1 2
/dev/mapper/host1unsafe                   /unsafe ext4 defaults 1 2
Comment 9 Harald Hoyer 2013-08-20 08:28:30 EDT
The "--debug" output would help more than the sosreport.

Thanks for testing.
Comment 10 Jan Kratochvil 2013-08-20 08:34:38 EDT
Created attachment 788464 [details]
debug2.txt.gz
Comment 11 Harald Hoyer 2013-08-20 10:05:35 EDT
Does it fix the issue, if you change line 856 of /usr/bin/dracut from:

            [[ "$_d" == LABEL\=* ]] && _d="/dev/disk/by-label/$_d#LABEL=}"

to

            [[ "$_d" == LABEL\=* ]] && _d="/dev/disk/by-label/${_d#LABEL=}"

?
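
The difference between the two lines is a single missing "{". A short illustration of what each form expands to, using the swap label from this report as sample input (the variable names broken and fixed are only for this demonstration):

```shell
# Sample input, in the form it appears in /etc/fstab.
_d="LABEL=host1swap"

# Without the opening brace, "$_d" is expanded as is and the rest is literal text.
broken="/dev/disk/by-label/$_d#LABEL=}"

# With "${_d#LABEL=}", the shell strips the leading "LABEL=" from the value.
fixed="/dev/disk/by-label/${_d#LABEL=}"

printf '%s\n' "$broken" "$fixed"
# prints:
#   /dev/disk/by-label/LABEL=host1swap#LABEL=}
#   /dev/disk/by-label/host1swap
```

The first path cannot exist under /dev/disk/by-label, so a device given by LABEL= would never be found with the unbraced form.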
Comment 12 Jan Kratochvil 2013-08-20 10:20:23 EDT
It still does not boot.

I think it wrote more "Started Cryptography Setup" lines (even for the swap now) but I am not sure.

BTW, why can the boot fail if anything goes wrong with swap?  Swap issues should be non-fatal.

This style of debugging is too expensive. I will try later to set up a similar disk configuration in KVM, to see whether the problem is reproducible there.  I just have to configure the disks by hand there, as Anaconda crashes.
Comment 13 Harald Hoyer 2013-08-20 10:23:12 EDT
(In reply to Jan Kratochvil from comment #12)
> It still does not boot.
> 
> I think it wrote more "Started Cryptography Setup" lines (even for the swap
> now) but I am not sure.
> 
> BTW why the boot can fail if anything happens with swap?  swap issues should
> be non-fatal.

well, then remove from the kernel command line:
rd.luks.uuid=luks-0fc6128b-b90f-477a-825a-e51da8919949 

Then you will no longer be bothered by your swap partition.
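
As an illustration of that edit, here is a sketch that prints a kernel command line with one exact argument removed. The function name drop_karg is hypothetical; it only transforms a string and does not touch the bootloader configuration.

```shell
# Hypothetical helper: print the command line given as $1 without the
# argument given as $2 (exact match). Assumes the arguments contain no
# whitespace or shell glob characters.
drop_karg() {
    cmdline=$1
    unwanted=$2
    out=
    for arg in $cmdline; do
        [ "$arg" = "$unwanted" ] || out="${out:+$out }$arg"
    done
    printf '%s\n' "$out"
}
```

For example, passing the command line from the description together with "rd.luks.uuid=luks-0fc6128b-b90f-477a-825a-e51da8919949" yields the same line without the swap container.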
Comment 14 Jan Kratochvil 2013-08-20 11:46:48 EDT
(In reply to Harald Hoyer from comment #13)
> well, then remove from the kernel command line:
> rd.luks.uuid=luks-0fc6128b-b90f-477a-825a-e51da8919949 

Done but it still does not boot.

BTW, I have now installed F-17 with the same setup in KVM, upgraded it to F-19 and modified the configs to look like mine, but it still boots; the problem is not reproducible in KVM.
Comment 15 Jan Kratochvil 2013-08-20 14:08:12 EDT
I found that it prints a message that it cannot find host1swap before it fails.
So I can get it working either by
(1) Commenting out the swap line from /etc/fstab.
    Then initramfs does not try to setup the swap device at all.
or
(2) Using grub2 commandline properly mentioning all IDs of the swap device:
    linux   /vmlinuz-3.10.7-200.fc19.x86_64 root=/dev/mapper/luks-28e0d1d7-c2b6-4c67-8b32-2efa2824011c ro rd.lvm=0 rd.dm=0 rd.luks.uuid=luks-28e0d1d7-c2b6-4c67-8b32-2efa2824011c rd.luks.uuid=0fc6128b-b90f-477a-825a-e51da8919949 rd.md.uuid=53a6863e:a17b6de3:9fa6dc60:0cff25a5 rd.md.uuid=21be7f70:532dd32c:132124ad:41de0132 systemd.log_level=debug LANG=en_US.UTF-8

So older dracut probably did not insist on getting the swap device working?  I think that if it fails, one can still set it up later.
It is true that after some boots I previously had to type "swapon -a" by hand.

FYI, I no longer mind about this bug, but insisting on swap in order to boot at all seems needlessly fragile to me.
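
Workaround (2) depends on every array and LUKS container having a matching argument on the kernel command line. A small sketch for listing what a given command line names, so that it can be compared by hand against the UUIDs of the actual devices; the function name list_uuid_kargs is hypothetical.

```shell
# Hypothetical helper: print the rd.luks.uuid= and rd.md.uuid= arguments of
# the command line given as $1, or of /proc/cmdline when no argument is given.
list_uuid_kargs() {
    for arg in ${1:-$(cat /proc/cmdline)}; do
        case $arg in
            rd.luks.uuid=*|rd.md.uuid=*) printf '%s\n' "$arg" ;;
        esac
    done
}
```

Run against the original grub entry from the description, this would show two rd.luks.uuid arguments and no rd.md.uuid argument at all.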
Comment 16 Harald Hoyer 2013-08-21 04:27:17 EDT
(In reply to Jan Kratochvil from comment #15)
> I found it prints a message it cannot find host1swap before it fails.
> So I can get it working either by
> (1) Commenting out the swap line from /etc/fstab.
>     Then initramfs does not try to setup the swap device at all.
> or
> (2) Using grub2 commandline properly mentioning all IDs of the swap device:
>     linux   /vmlinuz-3.10.7-200.fc19.x86_64
> root=/dev/mapper/luks-28e0d1d7-c2b6-4c67-8b32-2efa2824011c ro rd.lvm=0
> rd.dm=0 rd.luks.uuid=luks-28e0d1d7-c2b6-4c67-8b32-2efa2824011c
> rd.luks.uuid=0fc6128b-b90f-477a-825a-e51da8919949
> rd.md.uuid=53a6863e:a17b6de3:9fa6dc60:0cff25a5
> rd.md.uuid=21be7f70:532dd32c:132124ad:41de0132 systemd.log_level=debug
> LANG=en_US.UTF-8
> 
> Older dracut therefore probably did not insist on getting the swap device
> working?  I think if it fails one still may set it up later.
> It is true after some boots I had to type "swapon -a" by hand before.
> 
> FYI, I do not mind about this Bug anymore but insisting on the swap to boot
> at all seems needlessly fragile to me.

The problem is that, for hibernation and resume to work, we have to wait for the swap devices before we can mount the root filesystem.

In newer versions of dracut, I relaxed the boot procedure so that it does not fail if the swap device is not found.
Comment 17 Jan Kratochvil 2013-08-21 07:21:09 EDT
Resume from hibernation makes sense, true - just not on this server.

FYI, dracut-032-1.fc20.x86_64 still does not boot with my old/broken grub config.
(I do not know whether by "newer versions" you mean post-032-1 or 032-1 itself.)
Comment 19 Jan Kratochvil 2013-08-24 14:36:37 EDT
The dracut-functions.sh part does not apply easily, so I did not hack on it further.

I still think this patch will not handle the case that older dracut could cope with:
the rd.md.uuid=swap:partition:uuid parameter was missing from my config (while rd.md.uuid=root:partition:uuid was present).

Hopefully newly installed systems will never face it.
Comment 20 Jan Kratochvil 2013-09-09 04:41:49 EDT
FYI, I have now had problems booting my other LUKS-encrypted machine (a Lenovo X220 notebook).
Something repeatedly corrupts my (plain, non-md) LUKS swap partition there.

Therefore I removed rd.luks.uuid=XXX for the swap and the machine boots again.
Comment 21 Fedora End Of Life 2015-01-09 17:15:42 EST
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 22 Fedora End Of Life 2015-02-18 09:05:32 EST
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
