Bug 464580 - cannot access LUKS encrypted disk
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: mkinitrd
Version: 10
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Peter Jones
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-09-29 12:36 EDT by Jeff Bastian
Modified: 2008-12-03 16:49 EST (History)
Last Closed: 2008-12-03 16:49:05 EST
Description Jeff Bastian 2008-09-29 12:36:26 EDT
Description of problem:
Following the YumUpgradeFaq on a test system, 
  http://fedoraproject.org/wiki/YumUpgradeFaq
I upgraded from Fedora 9 to Fedora 10 Alpha over the weekend.

I used LUKS disk encryption on Fedora 9.  After the upgrade, booting
with kernel-2.6.27-0.352.rc7.git1.fc10, it just keeps asking me for
my password over and over again.  I rebooted and removed 'rhgb' and
'quiet' from the kernel command line options, and now I see:
   Setting up disk encryption: /dev/sda2
   Password: ...
   Command failed: Can not access device

Fortunately, I still have kernel-2.6.26.3-29.fc9 installed, and that
boots fine and is able to decrypt the drive.

I used 'cryptsetup luksAddKey' to add a new password, but that didn't help.


Version-Release number of selected component (if applicable):
  kernel-2.6.27-0.352.rc7.git1.fc10.i686
  cryptsetup-luks-1.0.6-4.fc10.i386

How reproducible:
not sure

Steps to Reproduce:
1. Build a Fedora 9 i386 system, apply all updates
2. Follow the YumUpgradeFaq instructions to upgrade to Fedora 10 Alpha
3. Reboot
  
Actual results:
New kernel cannot access LUKS encrypted drive

Expected results:
Normal boot

Additional info:
This is a virtual machine running under VMware Fusion 2.0 with VMI enabled.
Comment 1 Jeff Bastian 2008-09-29 12:49:45 EDT
I just upgraded to
  kernel-2.6.27-0.354.rc7.git3.fc10.i686
  cryptsetup-luks-1.0.6-5.fc10
but that did not help.
Comment 2 Jeff Bastian 2008-09-30 12:55:40 EDT
Today I downloaded the Fedora 10 Beta Live CD and installed it under VMware Fusion 2.0.  It does the same thing after the installation finishes and I reboot: it asks for the password over and over again.

And if I remove 'rhgb quiet', it shows
  Command failed: can not access device
after entering the password.
Comment 3 Jeff Bastian 2008-09-30 13:18:16 EDT
I booted from the Live CD again and I was able to access the drive manually:
  [root@localhost ~]# cryptsetup luksOpen /dev/sda2 vg0
  Enter LUKS passphrase for /dev/sda2: 
  key slot 0 unlocked.
  Command successful.

vgscan, pvscan, etc. all worked fine after that.
Comment 4 Jeff Bastian 2008-09-30 13:30:17 EDT
I poked inside the initrd.img and found that the init script was basically doing what I did in comment #3:

...
echo Setting up disk encryption: /dev/sda2
plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-d8f4aad6-8e10-4189-9a10-64d55060947b"
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
...
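
For anyone wanting to repeat this inspection, here is a minimal sketch of pulling the init script out of an initrd image. It assumes the image is a gzip-compressed cpio archive and that gzip and GNU cpio are installed. The function name unpack_initrd and the paths in the usage line are illustrative only, not part of mkinitrd.

```shell
#!/bin/sh
# unpack_initrd IMAGE DIR: extract a gzip-compressed cpio initrd into DIR.
# Hypothetical helper for illustration; assumes gzip and GNU cpio.
unpack_initrd() {
    img=$1
    dir=$2
    mkdir -p "$dir" || return 1
    # gzip runs in the caller's directory, so a relative IMAGE path works.
    # cpio flags: -i extract, -d create leading directories, -m keep mtimes.
    gzip -dc "$img" | (cd "$dir" && cpio -idm --quiet)
}
```

Typical use would be something like: unpack_initrd /boot/initrd-$(uname -r).img /tmp/initrd, followed by reading /tmp/initrd/init.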
Comment 5 Ray Strode [halfline] 2008-09-30 13:45:00 EDT
what's the output of:

rpm -qa |grep plymouth
ls /sbin/plymouth* 
/usr/sbin/plymouth-set-default-plugin

?
Comment 6 Jeff Bastian 2008-09-30 21:45:43 EDT
From the Fedora 10 Beta Live CD released this morning:

$ rpm -qa |grep plymouth
plymouth-gdm-hooks-0.6.0-0.2008.09.10.1.fc10.i386
plymouth-plugin-spinfinity-0.6.0-0.2008.09.10.1.fc10.i386
plymouth-libs-0.6.0-0.2008.09.10.1.fc10.i386
plymouth-plugin-label-0.6.0-0.2008.09.10.1.fc10.i386
plymouth-utils-0.6.0-0.2008.09.10.1.fc10.i386
plymouth-0.6.0-0.2008.09.10.1.fc10.i386


$ ls /sbin/plymouth* 
ls: cannot access /sbin/plymouth*: No such file or directory


$ /usr/sbin/plymouth-set-default-plugin
spinfinity


I'm trying a 'yum update' now on the installed system.
Comment 7 Ray Strode [halfline] 2008-09-30 21:53:08 EDT
if you edit /init and change

plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-d8f4aad6

to

plymouth --hide-splash
cryptsetup luksOpen /dev/sda2 luks-d8f4aad6-8e10-4189-9a10-64d55060947b
plymouth --show-splash

and then repack it with

find | cpio -o -c | gzip -9 > initrd.img

does it work?
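
A minimal sketch of the repack step, assuming GNU cpio and a gzip-compressed cpio image in the "newc" format (which, as far as I know, is what -c selects in GNU cpio). The name repack_initrd is illustrative. The image is written outside the unpacked tree because the one-liner above, if run from inside that tree, may cause find to pick up the half-written initrd.img.

```shell
#!/bin/sh
# repack_initrd DIR IMAGE: pack the contents of DIR into a gzip-compressed
# newc cpio archive at IMAGE. Keep IMAGE outside DIR.
# Hypothetical helper for illustration; assumes gzip and GNU cpio.
repack_initrd() {
    dir=$1
    out=$2
    # The redirect sits outside the subshell, so a relative IMAGE path is
    # resolved against the caller's directory, not against DIR.
    (cd "$dir" && find . | cpio -o -H newc --quiet | gzip -9) > "$out"
}
```

For example: repack_initrd /tmp/initrd /tmp/initrd-test.img, then copy the result into /boot under a new name so the original image stays available as a fallback.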
Comment 8 Jeff Bastian 2008-09-30 22:25:34 EDT
It never asked for the password.  The splash screen just kept right on moving along, and when the progress bar was completely white, it was stuck.

I rebooted without 'rhgb quiet' and then I saw
   Setting up disk encryption: /dev/sda2
   Command failed: can not access device
and then it tried to continue running, but of course failed because it couldn't access the drive.

I'll do some more tests in the morning.
Comment 9 Jeff Bastian 2008-10-02 13:32:49 EDT
Update: I upgraded to
  kernel-2.6.27-0.377.rc8.git1.fc10
  plymouth-0.6.0-0.2008.09.25.2.fc10
and switched to the 'details' plymouth plugin, then built a new initrd.  It's still giving me the same
  Command failed: can not access device
error.


And editing the init script as described in comment #7 is still skipping right over asking for the password.


It "feels" like there's either a SCSI driver not loaded, or /dev/sda2 doesn't exist...  From the Live CD, I just tried
  cryptsetup luksOpen /dev/sdd1 luks-sdd1
and it gave the same error message
  Command failed: can not access device
since there is no /dev/sdd1.  It worked on /dev/sda2, though.

I wonder if node /dev/sda2 isn't getting created by mkblkdevs...?
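
One way to check this hypothesis from a rescue shell is to compare what the kernel exports under /sys/block with what exists in /dev. Each "dev" file in sysfs holds a "major:minor" pair. The sketch below prints a mknod command for every disk or partition that sysfs lists but that has no node. The function name missing_nodes and its two directory arguments are made up for illustration, and this is not a description of how mkblkdevs works internally.

```shell
#!/bin/sh
# missing_nodes SYSBLOCK DEVDIR: print a mknod command for every disk or
# partition listed under SYSBLOCK that has no entry in DEVDIR.
# Hypothetical helper; it only prints commands and creates nothing.
missing_nodes() {
    sysblock=$1
    devdir=$2
    # Whole disks:  SYSBLOCK/<disk>/dev
    # Partitions:   SYSBLOCK/<disk>/<partition>/dev
    for f in "$sysblock"/*/dev "$sysblock"/*/*/dev; do
        [ -r "$f" ] || continue
        name=$(basename "$(dirname "$f")")
        [ -e "$devdir/$name" ] && continue
        # The file content looks like "8:2".
        IFS=: read -r major minor < "$f"
        echo "mknod $devdir/$name b $major $minor"
    done
}
```

Running missing_nodes /sys/block /dev on the affected system should print nothing if all nodes are present. Any output would point at nodes that were never created.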
Comment 10 Ray Strode [halfline] 2008-10-02 13:57:04 EDT
Any ideas, Peter?
Comment 11 Jeff Bastian 2008-10-02 14:07:46 EDT
(In reply to comment #9)
> I wonder if node /dev/sda2 isn't getting created by mkblkdevs...?


That looks like the problem!  I modified the 'init' script to create
/dev/sda{,1,2} with mknod and now my system boots!

--- init.ORIG   2008-10-02 13:03:57.000000000 -0500
+++ init        2008-10-02 13:04:23.000000000 -0500
@@ -69,6 +69,9 @@
 mkblkdevs
 echo Loading keymap.
 loadkeys -u us.map
+mknod /dev/sda b 8 0
+mknod /dev/sda1 b 8 1
+mknod /dev/sda2 b 8 2
 echo Setting up disk encryption: /dev/sda2
 plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-sda2"
 echo Scanning logical volumes


So, does this mean the root of the problem lies with nash or mkinitrd?
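
For reference, the numbers in the mknod lines follow the usual Linux scheme for sd disks: block major 8, with 16 minors per disk, so the minor is 16 times the disk index plus the partition number. A small sketch of that arithmetic, covering only sda through sdp (later disks use other majors); the function name sd_minor is hypothetical.

```shell
#!/bin/sh
# sd_minor NAME: print the minor number for sda..sdp and their partitions
# (block major 8, 16 minors per disk). Returns non-zero for other names.
sd_minor() {
    name=$1
    case $name in
        sd[a-p]|sd[a-p][0-9]|sd[a-p]1[0-5]) ;;
        *) return 1 ;;
    esac
    letter=$(printf '%s' "$name" | cut -c3)
    part=$(printf '%s' "$name" | cut -c4-)
    # The position of the drive letter in a..p is the disk index (a = 0).
    letters=abcdefghijklmnop
    prefix=${letters%%"$letter"*}
    echo $(( ${#prefix} * 16 + ${part:-0} ))
}
```

So "mknod /dev/sda2 b 8 $(sd_minor sda2)" reproduces the third line added in the diff above.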
Comment 12 Ray Strode [halfline] 2008-10-02 14:20:37 EDT
yea, nash (or maybe the kernel?)

Moving to nash. Thanks for doing the detective work on this.
Comment 13 Jared Smith 2008-10-06 18:38:26 EDT
I ran into the same problem described above.

It turns out that when Fedora was installing, the hard drive in question showed up as /dev/sdf.  When I rebooted to the newly installed OS, the hard drive showed up as /dev/sda, but the plymouth command in initrd.img was still hard-coded to point at /dev/sdf.  I booted off a Live CD, unlocked the volume, and ran lsinitrd on the initrd.img file, and it contained the line:

plymouth-ask-for-password --command "cryptsetup luksOpen /dev/sdf2 luks-<UUID>"

Is there any reason we have to hard-code the device there instead of using a UUID?  Does anyone know if the cryptsetup command will take a UUID instead of a device node?

Hopefully that'll save someone else some problems, and we can get this fixed before F10 final.
Comment 14 Jared Smith 2008-10-09 10:08:12 EDT
It appears to me that we can use UUID=<uuid> or LABEL=<label> in the cryptsetup command, or worse comes to worse we could simply use /dev/disk/by-uuid/<uuid> or /dev/disk/by-label/<label>.  Either of these would be preferable to hard-coding the device node in the initrd image.
Comment 15 Paul W. Frields 2008-10-09 11:36:58 EDT
Triaged.
Comment 16 Jeff Bastian 2008-10-10 12:02:15 EDT
Using UUID did NOT fix the problem in my case.

That is, I modified the 'init' script in the initrd to
  plymouth ask-for-password --command "cryptsetup luksOpen UUID=___ luks-sda2"
and I filled in the UUID value with the output of
  cryptsetup luksUUID /dev/sda2

The system failed to boot with this.  It gave me the same error, "Can not access device".

I then added the
  mknod /dev/sda b 8 0
  mknod /dev/sda1 b 8 1
  mknod /dev/sda2 b 8 2
lines to the init script and tried again, but it still failed.

So, keeping the 'mknod' lines, I changed back to
  plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-sda2"
and I can boot again.
Comment 17 Will Woods 2008-10-30 17:56:11 EDT
(In reply to comment #14)
> It appears to me that we can use UUID=<uuid> or LABEL=<label> in the cryptsetup
> command

What makes you say that? It doesn't seem to work here. Tracing through the cryptsetup code, I see nothing that converts the given device name from UUID to block device. AFAICT the error message mentioned in comment #16 comes from LUKS_device_ready(), which does:
  int devfd = open(device, mode | O_DIRECT | O_SYNC);
where "device" is the devicename passed in argv[2]. 

> worse comes to worse we could simply use /dev/disk/by-uuid/<uuid>
> or /dev/disk/by-label/<label>.  

I don't think so - these are set up by udev, which isn't running yet.

> Either of these would be preferable to
> hard-coding the device node in the initrd image.

Fully agree. But I don't see an obvious way to make it work.
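
One conceivable workaround, sketched here purely as an illustration and not as what mkinitrd does, would be for the init script to probe candidate partitions itself with "cryptsetup luksUUID" (the subcommand used in comment #16) and pass the matching node to luksOpen. The function name find_luks_dev and the LUKS_UUID_CMD override are made up. The override only exists so the logic can be exercised without real LUKS devices. This still depends on the device nodes existing, which is the precondition that failed in comment #11.

```shell
#!/bin/sh
# find_luks_dev UUID DEVICE...: print the first DEVICE whose LUKS header
# carries UUID. Returns non-zero if none matches.
# Hypothetical helper for illustration only.
find_luks_dev() {
    want=$1
    shift
    for dev in "$@"; do
        # Default probe is "cryptsetup luksUUID DEVICE"; LUKS_UUID_CMD is a
        # made-up hook for substituting another probe command in tests.
        got=$(${LUKS_UUID_CMD:-cryptsetup luksUUID} "$dev" 2>/dev/null) || continue
        if [ "$got" = "$want" ]; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}
```

In an init script this might be used as: dev=$(find_luks_dev "$uuid" /dev/sd*[0-9]) && cryptsetup luksOpen "$dev" "luks-$uuid".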
Comment 18 Jeff Bastian 2008-10-30 18:15:22 EDT
I updated to the latest rawhide today including
  kernel-2.6.27.4-58.fc10
  mkinitrd-6.0.68-1.fc10
  plymouth-0.6.0-0.2008.10.27.7.fc10
and I went to modify the init script inside the initrd (to add the 'mknod
/dev/sda' lines) and noticed a large chunk of the script was missing.

Comparing the init script from kernel -51.fc10 and -58.fc10:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# diff -u initrd-2.6.27.4-5{1,8}.fc10.i686/init
--- initrd-2.6.27.4-51.fc10.i686/init   2008-10-27 12:27:16.000000000 -0500
+++ initrd-2.6.27.4-58.fc10.i686/init   2008-10-30 15:30:48.000000000 -0500
@@ -42,32 +42,9 @@
 hotplug
 echo Creating block device nodes.
 mkblkdevs
-echo "Loading dm-crypt module"
-modprobe -q dm-crypt
-echo "Loading aes module"
-modprobe -q aes
-echo "Loading cbc module"
-modprobe -q cbc
-echo "Loading sha256 module"
-modprobe -q sha256
-echo "Loading scsi_transport_spi module"
-modprobe -q scsi_transport_spi
-echo "Loading mptbase module"
-modprobe -q mptbase
-echo "Loading mptscsih module"
-modprobe -q mptscsih
-echo "Loading mptspi module"
-modprobe -q mptspi
 echo Making device-mapper control node
 mkdmnod
 mkblkdevs
-mknod /dev/sda b 8 0
-mknod /dev/sda1 b 8 1
-mknod /dev/sda2 b 8 2
-echo Loading keymap.
-loadkeys -u us.map
-echo Setting up disk encryption: /dev/sda2
-plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-sda2"
 echo Scanning logical volumes
 lvm vgscan --ignorelockingfailure
 echo Activating logical volumes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Out of curiosity, I tried booting anyway, thinking/hoping this was on purpose,
but it did not ask me for my LUKS password.  After removing 'rhgb' and 'quiet'
from the kernel command line options, I saw:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Activating logical volumes
  Volume group "VolGroup00" not found
...
switchroot: mount failed: No such file or directory
Booting has failed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Not surprising since it didn't ask for the LUKS password...


I then tried rebuilding the initrd with all the modules:
   mkinitrd -f --with=dm-crypt --with=aes --with=cbc --with=sha256 \
     --with=scsi_transport_spi --with=mptbase --with=mptscsih --with=mptspi \
     /boot/initrd-2.6.27.4-58.fc10.i686.img 2.6.27.4-58.fc10.i686
and then added the missing lines to the init script
   mknod /dev/sda b 8 0
   mknod /dev/sda1 b 8 1
   mknod /dev/sda2 b 8 2
   echo Loading keymap.
   loadkeys -u us.map
   echo Setting up disk encryption: /dev/sda2
   plymouth ask-for-password --command "cryptsetup luksOpen /dev/sda2 luks-sda2"
but it still fails to boot.
Comment 19 Jeff Bastian 2008-10-30 18:31:01 EDT
(In reply to comment #18)
> I then tried rebuilding the initrd with all the modules:
...
> but it still fails to boot.


I also had to manually add
  /bin/cryptsetup
  /bin/loadkeys
  /lib/libcryptsetup.so.0
  /lib/libgcrypt.so.11
  /lib/libgpg-error.so.0
to the initrd and now I can boot again.
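
A quick check after rebuilding can catch this kind of omission before rebooting. The sketch below lists the archive and reports which of the expected files are absent. It assumes a gzip-compressed cpio image and GNU cpio; the function name initrd_missing is hypothetical.

```shell
#!/bin/sh
# initrd_missing IMAGE PATH...: print each PATH that is not in IMAGE.
# Returns non-zero if anything is missing.
# Hypothetical helper; assumes gzip and GNU cpio.
initrd_missing() {
    img=$1
    shift
    # cpio -t lists member names; strip a leading "./" or "/" so that
    # names compare equal however the archive was created.
    listing=$(gzip -dc "$img" | cpio -it --quiet | sed -e 's|^\./||' -e 's|^/||')
    rc=0
    for want in "$@"; do
        want=${want#/}
        if ! printf '%s\n' "$listing" | grep -qxF "$want"; then
            echo "missing: /$want"
            rc=1
        fi
    done
    return $rc
}
```

With the files from this comment, that would be: initrd_missing /boot/initrd-2.6.27.4-58.fc10.i686.img /bin/cryptsetup /bin/loadkeys /lib/libcryptsetup.so.0 /lib/libgcrypt.so.11 /lib/libgpg-error.so.0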
Comment 20 Charles R. Anderson 2008-10-30 19:50:03 EDT
Re Comment #18, 19: you need a new mkinitrd, see bug #468856.
Comment 21 Jeff Bastian 2008-10-31 10:46:27 EDT
(In reply to comment #20)
> Re Comment #18, 19: you need a new mkinitrd, see bug #468856.

I just updated to
   nash-6.0.69-1.fc10
   mkinitrd-6.0.69-1.fc10
and not only did it fix the new problem in comment #18 and comment #19, it also fixed the original problem from comment #11.  That is, I did NOT need to add the
   mknod /dev/sda b 8 0
   mknod /dev/sda1 b 8 1
   mknod /dev/sda2 b 8 2
lines to the init script like I've been doing.

I just generated a new initrd with
  mkinitrd /boot/initrd-$(uname -r).img $(uname -r)
and rebooted and it worked!
Comment 22 Bill Nottingham 2008-11-04 14:13:38 EST
Jeff - I think part of your issue was fixed with initscripts-8.85-1 (see bug 462371).

Jared - are you still seeing issues?
Comment 23 Tom "spot" Callaway 2008-11-10 15:21:17 EST
Lifting F10Blocker, this seems resolved to me (and I can now access my LUKS encrypted disk).
Comment 24 Jeff Bastian 2008-11-10 16:45:39 EST
Agreed, this seems resolved to me too.  I've upgraded the kernel a few times lately and the initrds have all been fine.
Comment 25 Bug Zapper 2008-11-25 22:16:29 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
