Red Hat Bugzilla – Bug 501198
boot hangs with encrypted lvm pv
Last modified: 2010-01-12 11:22:26 EST
Description of problem:
With the latest mkinitrd the system fails to boot: I get a blank screen and a stuck system at the point where plymouth would usually start.
The difference is due to this incantation which has changed in the init script:
@@ -66,8 +67,10 @@
-echo Setting up disk encryption: /dev/sda2
-plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-3f98a609-85a7-4c62-8692-5100ad7a9f8e"
+echo Setting up disk encryption: $LUKSUUID
+buildEnv LUKSUUID cryptsetup luksOpen $LUKSUUID luks-3f98a609-85a7-4c62-8692-5100ad7a9f8e
+plymouth ask-for-password --command $LUKSUUID
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
If I edit the initrd and revert that change back to the two-line version, the system boots as normal.
Version-Release number of selected component (if applicable):
Can you show us what /etc/crypttab says?
Created attachment 344449 [details]
Patch to work around empty/malformed crypttab.
Also, can you test this patch to see if it correctly generates a working initrd?
=== /etc/crypttab dated 2008-04-26 ===
luks-sda2 /dev/sda2 none
The machine was installed using crypted pv pretty much as soon as anaconda supported that, back in F9 or maybe F8 timeframe. Would have used manual partitioning in anaconda.
I'll try the patch out shortly.
Confirmed patch in comment #2 is good.
mkinitrd-6.0.85-1.fc11 is still broken for me.
[bruno@cerberus ~]$ more /etc/crypttab
luks-0459b95f-cc7b-4229-8f07-c3012582c726 /dev/md3 none
luks-b93e9fce-0ef0-4d55-a5c8-56deabaf2f61 /dev/md4 none
[bruno@cerberus ~]$ df
Filesystem 1K-blocks Used Available Use% Mounted on
41283648 35398808 5465412 87% /
147153724 91442492 54216232 63% /home
41283648 8196692 32667528 21% /play
/dev/md0 256586 28478 225459 12% /boot
tmpfs 1018580 112 1018468 1% /dev/shm
I confirmed that rebuilding with mkinitrd-6.0.83-1.fc11.x86_64 got the kernel entry I started having problems with (after rebuilding with mkinitrd-6.0.84-1.fc11.x86_64 and then mkinitrd-6.0.85-1.fc11.x86_64) to work again. This suggests that my problem is caused or triggered by the mkinitrd change (and not, say, by the recent plymouth changes).
Are you sure the broken initrd was built with mkinitrd-6.0.85? Can you compare the working init script with the one generated by mkinitrd-6.0.85 and post the results here?
I believe so, but I want to retest to make sure. I want to do a kernel update tonight and I can combine a retest with -85 at the same time. I'll report back what I find.
I tried it on my home machine and -85 worked. But the format of my /etc/crypttab at home is noticeably different. I'll retest it at work tomorrow. It may be that almost no one will have that format. I went through some intermediate periods of the initial encrypted root setup and some more when plymouth came to be. And adjusting things for changes may have left it in a state that worked, but which is very unlikely for other systems to be in.
Just to compare for my home system:
bash-4.0$ cat /etc/crypttab
luks-585ccbdd-26aa-4d06-ac88-e412c7dc6135 UUID=585ccbdd-26aa-4d06-ac88-e412c7dc6135 none
luks-f022434a-2aef-438a-836d-109e7b4ce931 UUID=f022434a-2aef-438a-836d-109e7b4ce931 none
luks-58aa4879-4d9f-4074-ac4b-173be649c36d UUID=58aa4879-4d9f-4074-ac4b-173be649c36d none
luks-bb224e36-976e-49fc-86a6-ee8c23b0694f UUID=bb224e36-976e-49fc-86a6-ee8c23b0694f none
Filesystem 1K-blocks Used Available Use% Mounted on
41291328 38952824 1919212 96% /
214509116 104551932 99060732 52% /home
41291328 33636412 7235624 83% /otheros
/dev/md4 256586 43302 200036 18% /boot
tmpfs 1031548 172 1031376 1% /dev/shm
Created attachment 344958 [details]
It turns out it isn't as bad as I thought. I was repeatedly offered a password prompt, but because there was a message after the first one and not after the second, I kept going. After entering it four times (once for each encrypted file system, each having the same passphrase) it booted successfully.
In the past I only needed to enter the password once and it would try it on each file system in turn.
I didn't have to do this on my home machine, which has a very similar setup. So I think the format of the crypttab file is influencing things. If you want, I can try changing the format to see if that rectifies the issue. I can try adding an entry for all 4 file systems and/or using UUID= instead of a path.
I have been seeing memory allocations fail this morning. I suspect that swap might not have been set up correctly. So for now my plan is to switch the format of crypttab to match my home system, run mkinitrd and reboot. I'll report back how that works. If you want some other configuration tested let me know.
I confirmed that the swap fs was not luksopened. I changed my crypttab format to use UUID specifications and to include entries for the swap and root devices, enabled swap and rebuilt the initrd image. However I am still seeing the same symptoms. I need to enter a password for all 4 devices and the swap device is not opened after the reboot.
This machine has a very similar setup to the machine at home and things are working as expected there. The work machine is x86_64 and the home machine is i386.
I retested with mkinitrd-6.0.86-1.fc11.x86_64 and am still seeing the same symptoms.
Created attachment 345512 [details]
init script built with -83 (works as expected)
I am going to include the two extracted init scripts from the initrd file to make it easier to compare the one that works as expected and the one that doesn't.
I am starting with the one that works.
Created attachment 345513 [details]
init script made by -86 (does not work properly)
Created attachment 345552 [details]
init made with -86 on home machine that works properly
initrd files built with the -86 version of mkinitrd on my home machine work properly.
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.
More information and reason for this action is here:
Created attachment 347218 [details]
patch against mkinitrd-6.0.86-2.fc11.i586
Argh, took a couple of hours to track the problem I had after an upgrade from F10 to F11 down to this bug...
The initrd created on my machine with mkinitrd-6.0.86-2.fc11.i586 won't boot, the breakage is caused by the change in mkinitrd's emitcrypto() as Martin Ebourne pointed out.
The cause is pretty stupid, and so is the fix: my /etc/crypttab has its fields separated by TABs instead of spaces, hence `grep "^$2 " /etc/crypttab` will always return nothing. I created my /etc/crypttab by hand and I did not use UUIDs (I know, I should, I'll change this...) so the workaround proposed by Peter Jones doesn't work either.
Plymouth ask-for-password ends up trying to execute "cryptsetup luksOpen <device> <luksname>" with <device> being an empty string, which cannot work.
Attached patch fixes the problem.
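For illustration only (this is not the attached patch): a minimal sketch of a crypttab lookup that accepts TABs as well as spaces between fields. The function name crypttab_device and the temporary demo file are hypothetical. It relies on awk's default field splitting, which treats any run of blanks as a separator, instead of a grep pattern with a literal space.

```shell
#!/bin/sh
# crypttab_device NAME [FILE]   (hypothetical helper, not part of mkinitrd)
# Print the device field (2nd column) of the crypttab entry whose mapping
# name (1st column) equals NAME. Fields may be separated by spaces or TABs.
crypttab_device() {
    name=$1
    file=${2:-/etc/crypttab}
    awk -v name="$name" '
        NF == 0 || $1 ~ /^#/ { next }     # skip blank lines and comments
        $1 == name { print $2; exit }     # default splitting handles TABs
    ' "$file"
}

# Demo against a throwaway file holding one TAB-separated entry.
demo=$(mktemp)
printf 'luks-sda2\t/dev/sda2\tnone\n' > "$demo"
crypttab_device luks-sda2 "$demo"
rm -f "$demo"
```

With a lookup like this, an entry written by hand with TABs yields a non-empty device, so the generated luksOpen command would not end up with an empty <device> argument.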
I have this same issue since getting F10 upgraded to F11.
F10 was the original Fedora installed.
I got a line like:
luks-4a65c764-b0b1-4b1f-94fb-c76d1bc3e287 UUID=4a65c764-b0b1-4b1f-94fb-c76d1bc3e287 none
in crypttab while /dev/md1 is the encrypted device.
How is that supposed to work?!
Not even the recent patch, thanks for that, can fix this.
In other words:
How can I make this work again?
This issue is blocking any kernel upgrades and could render a system 99.8% inoperable. So priority and severity are severely understated.
Please do not forget to have the installer (anaconda) verified and possibly updated for this issue.
/dev/md1 is not the encrypted device, but md1 carries the encrypted device.
The name of the encrypted device on md1 is what needs to go in place of luks-blabla in the previous comment.
Then rebuild the ramdisk and stuff will work.
Thanks redhat for regression-testing this feature. (yes I know a tiny bit about testing)
For some additional information, I am seeing blkid differences between my system that works properly and the one that doesn't. In particular /dev/md1 is identified as a swap device in one case (when it shouldn't be, as it is encrypted and should show as an encrypted device) and not the other.
For the system that works:
bash-4.0$ blkid -s TYPE
For the system that doesn't work properly:
[bruno@cerberus ~]$ blkid -s TYPE
Since a previous comment mentioned tabs, I checked and found that I had tabs in the /etc/crypttab file on the problem machine. However replacing them with spaces didn't fix the problem.
Since the blkid output suggested a swap signature was being picked up on /dev/md1 and since it was just a swap area, I wiped the /dev/md1 block device and used cryptsetup to create a new luks device on it. And then made that device a swap device. I adjusted /etc/fstab and /etc/crypttab to use the new luks UUID. And now things work normally.
So it looks like the swap signature wasn't a problem before, but something changed to make it a problem.
(And I rebuilt the initrd image.)
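A rough sketch of how a mismatch like this could be spotted before rebooting. It reads `blkid -s TYPE` style lines on stdin and reports any listed device whose type is not LUKS. The function name non_luks_devices is hypothetical, the sample lines are illustrative (the real blkid output is not shown in this report), and the type string crypto_LUKS is an assumption based on what current util-linux blkid prints.

```shell
#!/bin/sh
# non_luks_devices DEVICE...   (hypothetical helper)
# Reads lines such as
#   /dev/md1: TYPE="swap"
# on stdin and prints "DEVICE TYPE" for each listed DEVICE whose reported
# type is not crypto_LUKS (assumed LUKS type string).
non_luks_devices() {
    awk -v want="$*" '
        BEGIN {
            n = split(want, d, " ")
            for (i = 1; i <= n; i++) expect[d[i]] = 1
        }
        {
            dev = $1; sub(/:$/, "", dev)      # strip trailing colon
            if (!(dev in expect)) next        # not a device we care about
            type = $2                         # e.g. TYPE="swap"
            gsub(/^TYPE="|"$/, "", type)      # keep only the value
            if (type != "crypto_LUKS") print dev, type
        }
    '
}

# Demo with illustrative input; on a real system one might pipe in
# the output of: blkid -s TYPE
printf '%s\n' '/dev/md1: TYPE="swap"' '/dev/md3: TYPE="crypto_LUKS"' |
    non_luks_devices /dev/md1 /dev/md3
```

Any device printed here is one that crypttab expects to be encrypted but that is being identified as something else, which matches the stale swap signature described above.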
*** Bug 518551 has been marked as a duplicate of this bug. ***
(In reply to comment #2)
> Created an attachment (id=344449) [details]
> Patch to work around empty/malformed crypttab.
> Also, can you test this patch to see if it correctly generates a working
Peter, would it be possible to emit some warning during the initrd generation when no matching entry is found in crypttab? Or at least when $2 does not look like an anaconda-generated name (i.e. luks-UUID, so no luks- prefix)?
I had a problem with setup as: encrypted / using luks device named "root" on top of /dev/vg0/root LV. But I had no entry for root in /etc/crypttab (yes, I know that's probably a misconfiguration on my side, crypttab was hand-made), as F10 mkinitrd had no problems with it and was able to figure out correct "cryptsetup luksOpen" arguments for / even without an entry in crypttab. F11 mkinitrd generated initrd that tried to luksOpen UUID=root and that failed.
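A minimal sketch of the kind of check being asked for, assuming a hypothetical helper warn_if_unmapped run at initrd generation time. It is not mkinitrd code. It warns on stderr when a mapping name has no crypttab entry and adds a note when the name does not follow the luks-<UUID> pattern.

```shell
#!/bin/sh
# warn_if_unmapped NAME [FILE]   (hypothetical helper, not mkinitrd code)
# Prints the device field for NAME on stdout. If no entry exists, prints a
# warning on stderr and returns 1.
warn_if_unmapped() {
    name=$1
    file=${2:-/etc/crypttab}
    dev=$(awk -v name="$name" '$1 == name { print $2; exit }' "$file" 2>/dev/null) || dev=
    if [ -z "$dev" ]; then
        echo "warning: no crypttab entry for '$name' in $file" >&2
        return 1
    fi
    case $name in
        luks-*) ;;                            # anaconda-style name
        *) echo "note: '$name' is not a luks-<UUID> style name" >&2 ;;
    esac
    printf '%s\n' "$dev"
}

# Demo: a crypttab without an entry for "root" triggers the warning.
demo=$(mktemp)
printf 'luks-abc UUID=abc none\n' > "$demo"
warn_if_unmapped root "$demo" || true
rm -f "$demo"
```

Failing loudly at generation time would surface a missing entry before the reboot, rather than as a luksOpen failure inside the initrd.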
This is a mass edit of all mkinitrd bugs.
Thanks for taking the time to file this bug report (and/or commenting on it).
As you may have heard, in Fedora 12 mkinitrd has been replaced by dracut. In Fedora 12 the mkinitrd package is still around, as some programs depend on certain libraries it provides, but mkinitrd itself is no longer used.
In Fedora 13 mkinitrd will be removed completely. This means that all work on initrd has stopped.
Rather than keeping mkinitrd bugs open and giving false hope they might get fixed, we are mass closing them, so as to clearly communicate that no more work will be done on mkinitrd. We apologize for any inconvenience this may cause.
If you are using Fedora 11 and are experiencing a mkinitrd bug you cannot work around, please upgrade to Fedora 12. If you experience problems with the initrd in Fedora 12, please file a bug against dracut.
Are we sure that the core issue that causes this bug does not cause harm with
Please explain why the root cause is not a problem for dracut.
(In reply to comment #61)
> Are we sure that the core issue that causes this bug does not cause harm with
> Just closing a few bugs because a package disappears does not show a dedication
> for quality.
> Please explain why the root cause is not a problem for dracut.
dracut is a complete rewrite that does not reuse any code and uses a completely different principle to find the rootfs to boot. And the initial comment points out that the problem is in the mkinitrd-generated script, which is gone now.