Bug 1208214 - cannot upgrade with fedup on a system with luks encryption f21 -> f22
Summary: cannot upgrade with fedup on a system with luks encryption f21 -> f22
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: lorax
Version: 22
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Dennis Gilmore
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Depends On:
Blocks: F22BetaBlocker
Reported: 2015-04-01 16:38 UTC by James Hogarth
Modified: 2015-04-14 21:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-14 21:07:39 UTC
Type: Bug
Embargoed:


Attachments
screen shot of VM with error (28.30 KB, image/png)
2015-04-01 16:39 UTC, James Hogarth
rdosreport from failed fedup boot (856.55 KB, text/plain)
2015-04-01 16:50 UTC, James Hogarth
a Beta TC1 system that does have a password prompt (31.63 KB, image/png)
2015-04-02 13:33 UTC, James Hogarth
a Beta TC2 system that does not have a password prompt (31.45 KB, image/png)
2015-04-02 13:36 UTC, James Hogarth

Description James Hogarth 2015-04-01 16:38:11 UTC
Description of problem:
No password prompt is shown to get credentials on booting to fedup.
After a period of time the attempt to find the root filesystem times out with the error "cannot find <root_filesystem>" and a recovery prompt is presented within the fedup initrd environment.

Version-Release number of selected component (if applicable):


How reproducible:
Every time, from a clean install of F21 Workstation updated with current packages (and the systemd from updates-testing).

Steps to Reproduce:
1. Install the system, choosing to encrypt the disk when configuring the disk layout in anaconda, then update it (including systemd from updates-testing).
2. Install fedup
3. fedup --network 22 --nogpgcheck --instrepo https://dl.fedoraproject.org/pub/alt/stage/22_Beta_TC6/Workstation/x86_64/os/

Actual results:
Hangs at "reached target basic system" for a few minutes, then fails with the error that it cannot find the root partition.

Expected results:
Be prompted for the LUKS password and then carry out the upgrade.

Additional info:

Comment 1 James Hogarth 2015-04-01 16:39:03 UTC
Created attachment 1009757
screen shot of VM with error

Comment 2 James Hogarth 2015-04-01 16:46:17 UTC
proposed as blocker for being unable to update a F21 system to F22 if encryption is used

Comment 3 James Hogarth 2015-04-01 16:50:50 UTC
Created attachment 1009759
rdosreport from failed fedup boot

Comment 4 Fedora Blocker Bugs Application 2015-04-01 16:57:21 UTC
Proposed as a Blocker for 22-beta by Fedora user jhogarth using the blocker tracking app because:

This prevents someone who has an F21 system with LUKS encryption from using fedup to upgrade to F22. Since encrypted systems are fairly common, this is covered by the criterion:

"For each one of the release-blocking package sets, it must be possible to successfully complete an upgrade from a fully updated installation of the previous stable Fedora release with that package set installed. The user must be made to specify which Product (or none) they wish to have running when upgrade is complete."

Comment 5 Adam Williamson 2015-04-01 22:29:41 UTC
I can reproduce this when I use the TC6 instrepo (as James documents), but not when just doing '--network 22'.

fedora-install-22 is redirecting to Alpha at present:

https://mirrors.fedoraproject.org/mirrorlist?repo=fedora-install-22&arch=x86_64

so that means this broke somewhere between Alpha and Beta TC6. I'll try a few earlier TCs to see if we can nail it down somewhat.

+1 Beta blocker.

Comment 6 Adam Williamson 2015-04-01 23:01:46 UTC
Works with Beta TC1, broken with Beta TC2. Beta TC1 had systemd-219-6, Beta TC2 has systemd-219-8. dracut is the same in both.

systemd's changelog doesn't indicate much difference between -6 and -8, but the actual difference is rather larger than the changelog shows. http://pkgs.fedoraproject.org/cgit/systemd.git/commit/?id=e4a83a82af2b62b230df00829322bfc1e8028436 added 26 patches.

fedup hasn't changed at all throughout this, so I'm re-assigning to systemd for a start.

Comment 7 James Hogarth 2015-04-02 13:33:53 UTC
Created attachment 1010189
a Beta TC1 system that does have a password prompt

Following what adamw said earlier, I took a more detailed look at the behavioural differences between Beta TC1 and TC2.

In TC1 cryptsetup is present and systemd has a cryptsetup generator in place.

Comment 8 James Hogarth 2015-04-02 13:36:22 UTC
Created attachment 1010191
a Beta TC2 system that does not have a password prompt

Looking at the content of the fedup initrd on a TC2 attempt, cryptsetup is missing, along with the systemd cryptsetup generator.
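To make this check repeatable, one option is to test a saved `lsinitrd` listing for the pieces needed to unlock a LUKS root. The sketch below is illustrative only: the function name and the `REQUIRED_PATHS` tuple are assumptions, with the paths copied from the TC1 listing quoted in comment 9. It reads plain text and does not call `lsinitrd` itself.

```python
# Paths seen in the working TC1 upgrade.img (see comment 9). Treating these
# three as "required" is an assumption made for illustration.
REQUIRED_PATHS = (
    "usr/sbin/cryptsetup",
    "usr/lib/systemd/systemd-cryptsetup",
    "usr/lib/systemd/system-generators/systemd-cryptsetup-generator",
)


def missing_crypt_components(listing, required=REQUIRED_PATHS):
    """Return the required paths that never appear in saved `lsinitrd` output."""
    # Every path in the listing is a whitespace-separated token, so a plain
    # token set is enough to test for presence.
    tokens = set(listing.split())
    return [path for path in required if path not in tokens]


if __name__ == "__main__":
    sample = (
        "-rw-r--r--   1 root     root          366 Mar 10 13:58 "
        "usr/lib/systemd/system/cryptsetup.target\n"
    )
    for path in missing_crypt_components(sample):
        print("missing:", path)
```

An empty result means all three paths are present in the listing. It does not prove that the image can actually unlock the disk.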

Comment 9 James Hogarth 2015-04-02 13:47:30 UTC
Doing an lsinitrd against the upgrade.img from both TC1 and TC2 shows that the crypt stuff just appears to have been removed/missed in TC2+ ...

[hogarthj@hoglaptop Downloads]$ lsinitrd upgrade-beta-tc2.img  | grep crypt | grep -v kernel
-rwxr-xr-x   1 root     root        36576 Feb 23 15:36 usr/lib64/libcrypt-2.21.so
-rwxr-xr-x   1 root     root      2013384 Jan 16 15:24 usr/lib64/libcrypto.so.1.0.1k
lrwxrwxrwx   1 root     root           19 Mar 16 09:38 usr/lib64/libcrypto.so.10 -> libcrypto.so.1.0.1k
lrwxrwxrwx   1 root     root           16 Mar 16 09:38 usr/lib64/libcrypt.so.1 -> libcrypt-2.21.so
-rwxr-xr-x   1 root     root       947256 Mar 13 14:57 usr/lib64/libgcrypt.so.20.0.3
lrwxrwxrwx   1 root     root           19 Mar 16 09:38 usr/lib64/libgcrypt.so.20 -> libgcrypt.so.20.0.3
-rwxr-xr-x   1 root     root       206344 Feb 13 21:19 usr/lib64/libk5crypto.so.3.1
lrwxrwxrwx   1 root     root           18 Mar 16 09:38 usr/lib64/libk5crypto.so.3 -> libk5crypto.so.3.1
-rw-r--r--   1 root     root          366 Mar 10 13:58 usr/lib/systemd/system/cryptsetup.target
[hogarthj@hoglaptop Downloads]$ lsinitrd upgrade-beta-tc1.img  | grep crypt | grep -v kernel
crypt
-rwxr-xr-x   1 root     root        36576 Feb 23 15:36 usr/lib64/libcrypt-2.21.so
-rwxr-xr-x   1 root     root      2013384 Jan 16 15:24 usr/lib64/libcrypto.so.1.0.1k
lrwxrwxrwx   1 root     root           19 Mar 11 10:13 usr/lib64/libcrypto.so.10 -> libcrypto.so.1.0.1k
-rwxr-xr-x   1 root     root       166504 Aug 16  2014 usr/lib64/libcryptsetup.so.4.6.0
lrwxrwxrwx   1 root     root           22 Mar 11 10:13 usr/lib64/libcryptsetup.so.4 -> libcryptsetup.so.4.6.0
lrwxrwxrwx   1 root     root           16 Mar 11 10:13 usr/lib64/libcrypt.so.1 -> libcrypt-2.21.so
-rwxr-xr-x   1 root     root       943920 Jan 14 16:07 usr/lib64/libgcrypt.so.20.0.2
lrwxrwxrwx   1 root     root           19 Mar 11 10:13 usr/lib64/libgcrypt.so.20 -> libgcrypt.so.20.0.2
-rwxr-xr-x   1 root     root       206344 Feb 13 21:19 usr/lib64/libk5crypto.so.3.1
lrwxrwxrwx   1 root     root           18 Mar 11 10:13 usr/lib64/libk5crypto.so.3 -> libk5crypto.so.3.1
-rwxr-xr-x   1 root     root         6788 Jan 31 11:54 usr/lib/dracut-crypt-lib.sh
-rwxr-xr-x   1 root     root         3166 Jan 31 11:54 usr/lib/dracut/hooks/cmdline/30-parse-crypt.sh
-rw-r--r--   1 root     root          366 Mar  4 01:29 usr/lib/systemd/system/cryptsetup.target
-rwxr-xr-x   1 root     root        74400 Mar  4 01:30 usr/lib/systemd/systemd-cryptsetup
-rwxr-xr-x   1 root     root        70480 Mar  4 01:30 usr/lib/systemd/system-generators/systemd-cryptsetup-generator
lrwxrwxrwx   1 root     root           20 Mar 11 10:12 usr/lib/systemd/system/sysinit.target.wants/cryptsetup.target -> ../cryptsetup.target
-rwxr-xr-x   1 root     root         4226 Jan 31 11:54 usr/sbin/cryptroot-ask
-rwxr-xr-x   1 root     root          777 Jan 31 11:54 usr/sbin/crypt-run-generator
-rwxr-xr-x   1 root     root        61080 Aug 16  2014 usr/sbin/cryptsetup

Are there logs from the upgrade.img build process that can be checked to try to see why there is this difference?

The diff of the systemd release wouldn't appear to have accounted for this behaviour.
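The manual comparison above can be automated. The following is a minimal sketch, assuming the two `lsinitrd` outputs have been saved as text and that entries use the long-listing layout shown above (nine or more whitespace-separated fields, with the path in the ninth). The function names are hypothetical.

```python
def crypt_paths(listing):
    """Collect crypt-related paths from `lsinitrd` output, skipping kernel modules."""
    paths = set()
    for line in listing.splitlines():
        # Same filter as `grep crypt | grep -v kernel`.
        if "crypt" not in line or "kernel" in line:
            continue
        fields = line.split()
        # Long-format entries: perms links owner group size month day time path [-> target]
        if len(fields) < 9 or fields[0][0] not in "-ld":
            continue
        paths.add(fields[8])
    return paths


def missing_in_new(old_listing, new_listing):
    """Return crypt-related paths present in the old image but not in the new one."""
    return sorted(crypt_paths(old_listing) - crypt_paths(new_listing))


if __name__ == "__main__":
    tc1 = (
        "crypt\n"
        "-rwxr-xr-x   1 root     root        61080 Aug 16  2014 usr/sbin/cryptsetup\n"
        "-rw-r--r--   1 root     root          366 Mar  4 01:29 "
        "usr/lib/systemd/system/cryptsetup.target\n"
    )
    tc2 = (
        "-rw-r--r--   1 root     root          366 Mar 10 13:58 "
        "usr/lib/systemd/system/cryptsetup.target\n"
    )
    for path in missing_in_new(tc1, tc2):
        print(path)
```

Because paths are compared verbatim, a library version bump (for example libgcrypt.so.20.0.2 becoming libgcrypt.so.20.0.3) also shows up as a difference, so the result still needs a human read. Paths containing spaces are not handled.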

Comment 10 Zbigniew Jędrzejewski-Szmek 2015-04-02 14:29:08 UTC
(In reply to James Hogarth from comment #9)
> The diff of the systemd release wouldn't appear to have accounted for this
> behaviour.
Just to confirm: there should be no cryptsetup related changes in systemd (apart from a cosmetic warning removal, but I don't see how this could be related).

Comment 11 Adam Williamson 2015-04-02 15:55:53 UTC
That's pretty odd, as the things involved in upgrade.img generation are dracut and fedup-dracut, neither of which changed between TC1 and TC2.

New working theory: something changed in the packaging which results in the bits not being present for dracut to pull in when building the upgrade.img , or it not being able to find them, or something along those lines. I'll look into it today.

Comment 12 James Hogarth 2015-04-02 17:17:28 UTC
(In reply to awilliam from comment #11)
> That's pretty odd, as the things involved in upgrade.img generation are
> dracut and fedup-dracut, neither of which changed between TC1 and TC2.
> 
> New working theory: something changed in the packaging which results in the
> bits not being present for dracut to pull in when building the upgrade.img ,
> or it not being able to find them, or something along those lines. I'll look
> into it today.

The behaviour looks like dracut didn't use/run the 90crypt module for some reason when upgrade.img was created...

I'm not sure where to look for the logs of the process that generates upgrade.img (if they are publicly visible), though, to check what dracut did at the time.

Comment 13 Adam Williamson 2015-04-02 17:30:29 UTC
dracut's basically terrible at logging anyway, so if they are available (I don't know, dgilmore would), they probably won't have anything useful in them. I'll just have to poke through the code and the changes to the relevant packages and figure it out from there.

Comment 14 Adam Williamson 2015-04-02 17:48:30 UTC
Ah - from nightly pungify logs:

https://kojipkgs.fedoraproject.org/mash/branched-20150320/logs/pungify-x86_64.log
https://kojipkgs.fedoraproject.org/mash/branched-20150402/logs/pungify-x86_64.log

it looks like cryptsetup might have stopped being pulled into the generation environment at some point. The 03-20 and 04-02 nightly upgrade.img files have the same difference between them as TC1 and TC2, as James showed in #c9: the 03-20 nightly image has all the cryptsetup bits, the 04-02 nightly image does not. And if you look at the logs linked, the 'cryptsetup' package is included in the installed packages for the 03-20 nightly, but not for the 04-02 nightly.

Comment 15 James Hogarth 2015-04-02 18:08:14 UTC
(In reply to awilliam from comment #14)
> Ah - from nightly pungify logs:
> 
> https://kojipkgs.fedoraproject.org/mash/branched-20150320/logs/pungify-x86_64.log
> https://kojipkgs.fedoraproject.org/mash/branched-20150402/logs/pungify-x86_64.log
> 
> it looks like cryptsetup might have stopped being pulled into the generation
> environment at some point. The 03-20 and 04-02 nightly upgrade.img files
> have the same difference between them as TC1 and TC2, as James showed in
> #c9: the 03-20 nightly image has all the cryptsetup bits, the 04-02 nightly
> image does not. And if you look at the logs linked, the 'cryptsetup' package
> is included in the installed packages for the 03-20 nightly, but not for the
> 04-02 nightly.

Has mock changed from using yum to dnf yet as a result of the dnf policy change around the date of these logs?

The timing feels suspicious if it ends up being a dependency behavior alteration.

Comment 16 Adam Williamson 2015-04-02 18:18:23 UTC
ze smoking gun:

http://pkgs.fedoraproject.org/cgit/python-blivet.git/commit/?h=f22&id=39e7752068dfb0d4de27521992db164cdd959513

Here are the TC1 and TC2 logs:

https://ausil.fedorapeople.org/22_Beta_TC1/logs/
https://ausil.fedorapeople.org/22_Beta_TC2/logs/

if you compare a log for TC1 vs. a log for TC2 - e.g. https://ausil.fedorapeople.org/22_Beta_TC1/logs/Workstation.x86_64.log and https://ausil.fedorapeople.org/22_Beta_TC2/logs/Workstation.x86_64.log - you see that 'cryptsetup' is installed in TC1, but not in TC2. You can also see what pulled it in for TC1:

yum.verbose.YumBase.DEBUG: TSINFO: Marking cryptsetup-1.6.6-1.fc22.x86_64 as install for 1:python-blivet-1.0-1.fc22.noarch

so then we looked at python-blivet, and the above commit is when it dropped its dep on cryptsetup-python (which requires cryptsetup), between 1.0-1 and 1.0.1-1 - which is the TC1 vs. TC2 diff.
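The log comparison described here can be scripted. This is a rough sketch, assuming every dependency-driven install in the compose log uses the same "TSINFO: Marking ... as install for ..." wording as the single line quoted above. The helper names are hypothetical, and the name parsing assumes the usual name-version-release.arch layout with an optional epoch prefix.

```python
import re

# Pattern taken from the one log line quoted in this comment.
MARK_RE = re.compile(r"TSINFO: Marking (\S+) as install for (\S+)")


def install_reasons(log_text):
    """Map each dependency-installed package to the packages that required it."""
    reasons = {}
    for match in MARK_RE.finditer(log_text):
        package, requirer = match.groups()
        reasons.setdefault(package, []).append(requirer)
    return reasons


def package_name(nevra):
    """'1:python-blivet-1.0-1.fc22.noarch' -> 'python-blivet'."""
    without_epoch = nevra.split(":", 1)[-1]
    without_arch = without_epoch.rsplit(".", 1)[0]
    return without_arch.rsplit("-", 2)[0]  # drop version and release


def dropped_dependencies(old_log, new_log):
    """Packages pulled in as dependencies in the old log but not in the new one."""
    old = {package_name(p): reqs for p, reqs in install_reasons(old_log).items()}
    new = {package_name(p) for p in install_reasons(new_log)}
    return {name: reqs for name, reqs in old.items() if name not in new}


if __name__ == "__main__":
    tc1_log = (
        "yum.verbose.YumBase.DEBUG: TSINFO: Marking "
        "cryptsetup-1.6.6-1.fc22.x86_64 as install for "
        "1:python-blivet-1.0-1.fc22.noarch\n"
    )
    print(dropped_dependencies(tc1_log, ""))
```

A package reported here may still be installed in the newer compose if it was requested explicitly rather than pulled in as a dependency, so the output is a list of candidates to check, not a verdict.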

So that all adds up nicely. The only question now is, what's the correct way to get it back into the upgrade.img generation environment.

Comment 17 Brian Lane 2015-04-02 18:56:30 UTC
https://github.com/rhinstaller/lorax/pull/14

cryptsetup is useful in rescue mode too, so just add it to lorax's install template.

Comment 18 Fedora Update System 2015-04-02 20:59:19 UTC
lorax-22.9-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/lorax-22.9-1.fc22

Comment 19 Tim Flink 2015-04-03 01:45:58 UTC
+1 blocker - encrypted systems are a valid upgrade platform and should be upgrade-able.

Comment 20 Dennis Gilmore 2015-04-03 01:50:25 UTC
+1 blocker

Comment 21 Adam Williamson 2015-04-03 01:56:11 UTC
+3 blocker, accepting.

Comment 22 Lukas Brabec 2015-04-03 13:34:39 UTC
I was able to upgrade 21 to 22 using Beta TC7. However, when the upgrade was done, the machine didn't reboot; it hung after unmounting (failed to umount /sysroot/proc).
Apart from this problem, the F22 system booted normally (after a forced reset).

Comment 23 Fedora Update System 2015-04-04 16:31:34 UTC
Package lorax-22.9-1.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing lorax-22.9-1.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-5520/lorax-22.9-1.fc22
then log in and leave karma (feedback).

Comment 24 James Hogarth 2015-04-09 05:43:21 UTC
Using the most recent TC I was able to fedup my encrypted system okay.

Comment 25 Kamil Páral 2015-04-09 16:03:13 UTC
I have verified this in bug 1207251 comment 15.

Comment 26 Adam Williamson 2015-04-14 21:07:39 UTC
Update has gone stable, closing.

