Bug 432813

Summary: 2.6.25 kernel can't decrypt encrypted root partition
Product: [Fedora] Fedora
Reporter: Jeffrey C. Ollie <jeff>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: cra, eric, katzj, mauricio.teixeira, mbroz, raytodd, selinux, wwoods
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard: HotIssue
Fixed In Version:
Doc Type: Bug Fix
Last Closed: 2008-02-25 17:58:45 UTC
Bug Blocks: 430962    
Attachments:
Description: Verbose log from mkinitrd; Flags: none
Description: Listing of the files in /boot/initrd-2.6.25-0.40.rc1.git2.fc9.img; Flags: none

Description Jeffrey C. Ollie 2008-02-14 15:57:01 UTC
Description of problem:

When booting rawhide with the 2.6.25 kernel, the system is unable to decrypt my
encrypted root partition, which contains my LVM volumes.

Version-Release number of selected component (if applicable):

2.6.25-0.35.rc1.fc9.i686

How reproducible:

Always

Steps to Reproduce:
1. Boot into 2.6.25-0.35.rc1.fc9.i686
2. When prompted, enter password to decrypt partition.
  
Actual results:

Can't decrypt partition and boot fails.

Expected results:

Partition is decrypted and system continues to boot up.  Works fine in
2.6.24.1-26.fc9.i686.

Additional info:

Screen shots of the failed boot can be found here:

http://jcollie.fedorapeople.org/2625-luks-1.jpg
http://jcollie.fedorapeople.org/2625-luks-2.jpg

Comment 1 Jeffrey C. Ollie 2008-02-14 23:58:14 UTC
# mkinitrd -f /boot/initrd-2.6.25-0.40.rc1.git2.fc9.img 2.6.25-0.40.rc1.git2.fc9
no path for essiv

# dmsetup table
luks-sda2: 0 77753568 crypt aes-cbc-essiv:sha256 00000000000000000000000000000000 0 8:2 1032
pc21225-swap: 0 8388608 linear 253:0 69271936
pc21225-root: 0 69271552 linear 253:0 384
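
For readers unfamiliar with the table format: in a crypt target line, the field after the target name is the cipher spec, followed by the key, the IV offset, the backing device, and the start offset. The sketch below pulls the spec out of `dmsetup table` output and splits it into its parts. The function names `crypt_cipher_specs` and `split_cipher_spec` are made up for this illustration, and the splitting assumes the cipher-mode-ivmode:ivopts shape seen in the output above.

```shell
# Print "<mapping name> <cipher spec>" for every crypt target in
# `dmsetup table` output read from stdin.
# Line layout: name: start length crypt cipher key iv_offset device offset
crypt_cipher_specs() {
    awk '$4 == "crypt" { sub(/:$/, "", $1); print $1, $5 }'
}

# Split a spec of the form cipher-mode-ivmode:ivopts into its parts.
# Hypothetical helper; specs without an ":ivopts" suffix are not handled.
split_cipher_spec() (
    spec=$1
    cipher=${spec%%-*}      # text before the first "-"
    rest=${spec#*-}         # everything after the first "-"
    mode=${rest%%-*}
    iv=${rest#*-}
    printf 'cipher=%s mode=%s ivmode=%s ivopts=%s\n' \
        "$cipher" "$mode" "${iv%%:*}" "${iv#*:}"
)
```

Usage would look like `dmsetup table | crypt_cipher_specs`, then `split_cipher_spec aes-cbc-essiv:sha256` to see which cipher, chaining mode, and IV generator the kernel has to provide.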


Comment 2 Jeffrey C. Ollie 2008-02-15 00:00:39 UTC
Created attachment 294957
Verbose log from mkinitrd

Comment 3 Will Woods 2008-02-18 22:29:14 UTC
See bug 433078, which is the root cause of this failure.

Comment 4 Jeffrey C. Ollie 2008-02-18 22:37:27 UTC
http://jcollie.fedorapeople.org/2625-luks-3.jpg

Definitely appears that #433078 is what's wrong... going to close as a duplicate.

*** This bug has been marked as a duplicate of 433078 ***

Comment 5 Will Woods 2008-02-18 22:54:20 UTC
Blarrrg! I was wrong in bug 433078 - it's not modprobe that's the problem. 

Comment 6 Will Woods 2008-02-18 22:54:54 UTC
*** Bug 433078 has been marked as a duplicate of this bug. ***

Comment 7 Will Woods 2008-02-19 19:21:27 UTC
So, yes, something is wrong in kernel-land. The key messages are:

device-mapper: table: 253:0: crypt: Error allocating crypto tfm
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping. 
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify
that /dev/sda2 contains at least 133 sectors.

I guess that means either:

1) we're not loading the AES modules
2) they're not providing needed bits (aes-cbc-essiv:sha256 cipher spec), or 
3) cryptsetup is failing to check for those bits properly.

It's probably *not* modprobe's fault, as determined in bug 433078. So that
leaves 2) or 3).
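
Where a shell is available on an affected system, one rough way to test hypothesis 1) is to compare the algorithm names the kernel lists in /proc/crypto against the ones the cipher spec needs. The sketch below is only an illustration: the function names `registered_algs` and `missing_algs` are hypothetical, and it checks just the base algorithms (aes, sha256), since template instances such as cbc(aes) may only be listed once something has instantiated them.

```shell
# List the algorithm names registered with the kernel crypto API.
# Reads /proc/crypto-style text on stdin ("name         : aes").
registered_algs() {
    awk -F' *: *' '$1 == "name" { print $2 }' | sort -u
}

# Print each algorithm named on the command line that is absent
# from the /proc/crypto-style text on stdin.
missing_algs() (
    have=$(registered_algs)
    for alg in "$@"; do
        printf '%s\n' "$have" | grep -qxF -- "$alg" || printf '%s\n' "$alg"
    done
)
```

For example, `missing_algs aes sha256 < /proc/crypto` prints nothing when both algorithms are registered. An empty result does not rule out a missing helper module, so it narrows the search rather than settling it.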

Comment 8 Will Woods 2008-02-19 22:00:46 UTC
*** Bug 433463 has been marked as a duplicate of this bug. ***

Comment 9 Ray Todd Stevens 2008-02-19 22:02:42 UTC
I also tried to load the disk in rescue mode to see if I could see anything. I
find it very interesting that there is apparently not even an attempt to load
the encrypted volumes or to request a password for them. This is very "not good".

Should I file an RFE for this one, or is that already in progress?

Comment 10 Jeremy Katz 2008-02-19 22:07:12 UTC
Rescue mode pieces are in progress

Comment 11 Ray Todd Stevens 2008-02-19 22:08:57 UTC
If it helps any, this system was working fine until I ran the automatic update,
which broke it. This is also an experimental system. I am pretty sure I could
wipe and reload to get a working system, then separately apply whatever updates
you want me to try and see which one breaks it.

Comment 12 Will Woods 2008-02-19 22:10:52 UTC
Encryption support for rescue mode is bug 430786.

Comment 13 Jeremy Katz 2008-02-19 22:11:14 UTC
My laptop is running rawhide and with an encrypted root now, so I'll be able to
dig a little more once I get home and have the USB keys I foolishly left sitting
on the coffee table :)

Comment 14 Ray Todd Stevens 2008-02-19 22:11:49 UTC
For both this and the rescue mode pieces, I have taken the fc8 "why don't you
guys test the alphas" comments to heart. I have 4-8 hours a week which they
have set aside for me to test things like this.

So let me know what you need.



Comment 15 Ray Todd Stevens 2008-02-19 22:13:35 UTC
If you want a guess: I did notice that the update claimed to be updating some
kernel files. Could this be the problem?

Comment 16 Will Woods 2008-02-19 22:16:43 UTC
Yes. It's something to do with kernel 2.6.25. We're working on it.

Comment 17 Jay Fenlason 2008-02-19 22:35:43 UTC
*** Bug 433022 has been marked as a duplicate of this bug. ***

Comment 18 Milan Broz 2008-02-20 10:50:05 UTC
There is a change in 2.6.25 in dm-crypt to use async crypto (if available).

Which crypto modules are loaded in the ramdisk?
Is there a crypto_blkcipher module in the ramdisk?


Comment 19 Jeffrey C. Ollie 2008-02-20 13:33:07 UTC
Created attachment 295417
Listing of the files in /boot/initrd-2.6.25-0.40.rc1.git2.fc9.img

I've attached a list of the files in /boot/initrd-2.6.25-0.40.rc1.git2.fc9.img

Comment 20 Ray Todd Stevens 2008-02-20 14:42:45 UTC
I have the machine this failed on "frozen". From what I see here, it sounds
like even once this is fixed, the way to apply the fix may be to simply
reload. So are you going to need victims to try your fix on, or should I just
reload and continue testing without doing the update that breaks things?

Comment 21 Will Woods 2008-02-20 15:16:04 UTC
You should be able to boot the previously working kernel (probably
2.6.24.1-31.fc9) from the GRUB menu, since yum is set up to save the previous
kernel by default.

So you can get the system working again just by booting the older kernel. When
we have a fixed kernel package, you'll just need to update the kernel and
everything should be OK.

Unless you did something strange like removing the older kernel, there should be
no need to reinstall.

Comment 22 Ray Todd Stevens 2008-02-20 17:07:38 UTC
OK, this works. On to trying to break other things. ;-) Basically I am going
through all my old bug reports and seeing if I can reproduce the same problems.
So you will probably see more reports, but hopefully not too many.

Comment 23 Milan Broz 2008-02-20 17:43:36 UTC
Seems that there is missing crypto module "chainiv.ko" in ramdisk for 2.6.25 kernel.

Please could anyone add it to ramdisk and test it again?
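
A quick way to confirm which modules an initrd lacks is to scan its file listing for the expected .ko names. The sketch below is a hedged illustration: `modules_missing_from_listing` is a hypothetical helper name, and the usage line assumes the initrd image is a gzip-compressed cpio archive, which should be verified for the image at hand.

```shell
# Report which of the named modules have no NAME.ko entry in a file
# listing read from stdin (one path per line, as printed by `cpio -it`).
modules_missing_from_listing() (
    listing=$(cat)
    for mod in "$@"; do
        printf '%s\n' "$listing" \
            | grep -Eq -- "(^|/)$mod\.ko\$" || printf '%s\n' "$mod"
    done
)
```

Assuming a gzip-compressed cpio image, usage would be `zcat /boot/initrd-2.6.25-0.40.rc1.git2.fc9.img | cpio -it | modules_missing_from_listing crypto_blkcipher chainiv`. Any module name it prints is absent from the image.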

Comment 24 Milan Broz 2008-02-20 18:01:32 UTC
See http://lkml.org/lkml/2008/2/20/403

Comment 25 Jeffrey C. Ollie 2008-02-20 18:03:32 UTC
(In reply to comment #23)
> Seems that there is missing crypto module "chainiv.ko" in ramdisk for 2.6.25 kernel.
> 
> Please could anyone add it to ramdisk and test it again?

Woohoo! After rebuilding my initrd with --with=chainiv, 2.6.25 is able to
decrypt my partition now...


Comment 26 Will Woods 2008-02-20 18:06:07 UTC
Works for me too. I rebuilt my initrd like so:

mkinitrd -f --with=chainiv /boot/initrd-2.6.25-0.40.rc1.git2.fc9.img 2.6.25-0.40.rc1.git2.fc9

and now the system boots properly.
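
The workaround can be wrapped in a small helper so it is easy to repeat for other kernel versions. This is a sketch only: `rebuild_cmd` is a hypothetical name, the /boot/initrd-VERSION.img path simply follows the naming used in this report, and the argument order (image first, then kernel version) follows the mkinitrd invocation in comment 1. The helper prints the command instead of running it, so it can be reviewed before being executed as root.

```shell
# Print (do not run) the mkinitrd command that rebuilds the initrd for
# the given kernel version with the chainiv module forced in.
# Defaults to the running kernel when no version is given.
rebuild_cmd() (
    kver=${1:-$(uname -r)}
    printf 'mkinitrd -f --with=chainiv /boot/initrd-%s.img %s\n' \
        "$kver" "$kver"
)
```

For example, `rebuild_cmd 2.6.25-0.40.rc1.git2.fc9` prints the command for the kernel discussed here. The workaround should only be needed on kernels that lack the packaging fix.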

So - how long until this is fixed in the upstream kernel? Should we make
crypto_blkcipher depend on chainiv until then?


Comment 27 Milan Broz 2008-02-20 18:23:09 UTC
No idea about upstream, but for Fedora there should be some workaround asap.

Note that fresh network installation of rawhide (with disk encryption) now
produces completely unbootable system (there is no old kernel in repository,
just 2.6.25).


Comment 28 Charles R. Anderson 2008-02-22 16:34:30 UTC
*** Bug 433834 has been marked as a duplicate of this bug. ***

Comment 29 Charles R. Anderson 2008-02-22 16:35:59 UTC
According to bug 433834, this may be fixed:

Comment #8 from Bob Agel (cragel.com) on 2008-02-21 14:34 EST:

Today's update to 2.6.25-0.54.rc2.fc9 fixes the problem - booting normally now.

Comment 30 Ray Todd Stevens 2008-02-22 18:52:42 UTC
Interesting: booting this kernel has not fixed the problem here.


Comment 31 Will Woods 2008-02-22 20:15:10 UTC
There's basically no way that 2.6.25-0.54.rc2.fc9 could have fixed the problem,
since it made no changes to the crypt code. 

On the other hand, kernel-2.6.25-0.64.rc2.git5.fc9 has:

* Thu Feb 21 2008 Kyle McMartin <kmcmartin>
- crypto_blkcipher: big hack caused module dep loop, try another fix

I can confirm that it fixes the problem for me. It should appear with the next
rawhide update - or fetch it from koji in the meantime:

http://koji.fedoraproject.org/koji/buildinfo?buildID=39187

Comment 32 Charles R. Anderson 2008-02-23 04:24:34 UTC
2.6.25-0.64.rc2.git5.fc9 fixes it for me too.

Comment 33 eric 2008-02-23 18:53:24 UTC
*** Bug 434636 has been marked as a duplicate of this bug. ***

Comment 34 Ray Todd Stevens 2008-02-26 03:55:27 UTC
Confirming this is fixed in the latest release. But that did break some things,
so let me get on to creating some new bug reports.