Bug 979723 - Boot failure after F18->F19 upgrade due to lvm.conf changes
Status: CLOSED EOL
Product: Fedora
Classification: Fedora
Component: fedup-dracut
Version: 19
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Will Woods
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-06-29 17:27 EDT by Eugene Mah
Modified: 2015-02-17 10:45 EST (History)
Doc Type: Bug Fix
Last Closed: 2015-02-17 10:45:12 EST
Type: Bug

Attachments
Log from /var/log/boot.log (13.73 KB, text/x-log)
2013-06-29 17:27 EDT, Eugene Mah
Log from /var/log/dracut.log (36.50 KB, text/plain)
2013-06-29 17:28 EDT, Eugene Mah
Output from journalctl -xb (164.86 KB, text/x-log)
2013-06-29 17:28 EDT, Eugene Mah
bzip2 compressed fedup.log (111.55 KB, application/x-bzip)
2013-06-29 17:36 EDT, Eugene Mah
diff -c lvm.conf lvm.conf.rpmnew (37.79 KB, text/plain)
2013-07-09 18:12 EDT, Nickolay Bunev

Description Eugene Mah 2013-06-29 17:27:05 EDT
Created attachment 766945
Log from /var/log/boot.log

Description of problem:
Wanted to update my system to F19, so I used the fedup procedure documented in the Fedora Project wiki. After rebooting following the install of the F19 packages, the system dumped me into the emergency mode prompt with the following errors:

* Timed out waiting for device dev-mapper-vg_hadron\x2dlv_home.device.
* Dependency failed for /home.
* Dependency failed for Local File Systems.
* Dependency failed for Relabel all filesystems, if necessary.
* Dependency failed for Mark the need to relabel after reboot.

The problem encountered seems to be similar to that described in bug 958586 (https://bugzilla.redhat.com/show_bug.cgi?id=958586)

For some reason, the /home logical volume (vg_hadron/lv_home) is not being activated at boot time, so its device is never found, resulting in the timeout error.

At the emergency prompt, I can enter 'lvchange -aay vg_hadron/lv_home' which activates and mounts the /home volume. Hitting CTRL-D to resume the boot process allows the machine to boot normally. However, at the next reboot the same problem happens.

Version-Release number of selected component (if applicable):
Fedora 18/19

How reproducible:
Unsure if it can be reproduced.

Steps to Reproduce:
1. At the command line, 'fedup-cli --network 19'
2. Reboot after packages are downloaded
3. Select the upgrade option at the GRUB menu
4. Reboot after packages have been installed

Actual results:
Boot process times out and puts me at the emergency mode prompt

Expected results:
System should boot up normally

Additional info:
System consists of several hard drives, only one of which is LVM formatted, with two volumes.

lvs output before executing the lvchange command
  LV      VG        Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  lv_home vg_hadron -wi------ 411.25g                                           
  lv_root vg_hadron -wi-ao---  54.00g                                           

lvs output after executing the lvchange command
  LV      VG        Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  lv_home vg_hadron -wi-ao--- 411.25g                                           
  lv_root vg_hadron -wi-ao---  54.00g                                           
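
The difference between the two listings is the fifth character of the Attr column, which lvs uses as the state flag ("a" means active). A minimal sketch for picking out inactive volumes from lvs output follows; the function name is made up for illustration, and it assumes the default column order (LV, VG, Attr) shown above.

```shell
# List logical volumes whose lv_attr state flag (5th character) is not "a".
# Reads `lvs` output on stdin and prints one VG/LV name per line.
# Assumes the default column order: LV, VG, Attr, ...
inactive_lvs() {
    awk '$1 != "LV" && NF >= 3 && substr($3, 5, 1) != "a" { print $2 "/" $1 }'
}
```

Run as root, `lvs | inactive_lvs` would print vg_hadron/lv_home for the first listing above; each name it prints can then be passed to `lvchange -aay` as described in the report.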

System was previously upgraded from F17->F18 via preupgrade

Another system with just one LVM volume upgraded just fine, although that one was an upgrade from F17->F18.
Comment 1 Eugene Mah 2013-06-29 17:28:00 EDT
Created attachment 766946
Log from /var/log/dracut.log
Comment 2 Eugene Mah 2013-06-29 17:28:48 EDT
Created attachment 766947
Output from journalctl -xb
Comment 3 Eugene Mah 2013-06-29 17:36:47 EDT
Created attachment 766948
bzip2 compressed fedup.log
Comment 4 Will Woods 2013-07-01 14:44:16 EDT
(In reply to Eugene Mah from comment #0)
> System was previously upgraded from F17->F18 via preupgrade

..do you mean fedup? Because F18 doesn't support preupgrade.

Anyway, from the logs:

  20:57:19 hadron kernel: nvidia: module license 'NVIDIA' taints kernel.
  20:57:19 hadron kernel: Disabling lock debugging due to kernel taint

:(

  20:57:34 hadron lvm[357]: /dev/sdf: open failed: No medium found
  20:57:45 hadron lvm[357]: /dev/sdg: open failed: No medium found
  20:57:51 hadron systemd-udevd[383]: worker [393] [..]/sda2 timeout; kill it
  20:57:51 hadron systemd-udevd[383]: seq 2321 '[..]/sda2' killed
  20:57:51 hadron systemd-udevd[383]: worker [393] terminated by signal 9
  20:57:52 hadron lvm[357]: /dev/sdh: open failed: No medium found
  20:57:52 hadron lvm[357]: /dev/sdi: open failed: No medium found
  20:57:52 hadron lvm[357]: 1 logical volume(s) in volume group "vg_hadron" monitored
  20:58:44 hadron systemd[1]: Job dev-mapper-vg_hadron\x2dlv_home.device/start timed out.

So: it looks like a bunch of your disks are missing/unreadable, so /home can't be mounted, so the upgrade can't start?
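
For anyone wanting to pull the same kind of lines out of their own boot log, here is a small sketch. The function name is hypothetical, and the patterns are simply the phrases that appear in the excerpts quoted in this report, not a complete list of storage errors.

```shell
# Keep only log lines that hint at device or mount trouble during boot.
# Reads log text on stdin, for example the output of `journalctl -xb`.
# The patterns are taken from the messages quoted in this bug report.
boot_storage_errors() {
    grep -E 'open failed|timeout|timed out|Dependency failed'
}
```

Usage would be along the lines of `journalctl -xb | boot_storage_errors`.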

What disks is /home using? Any ideas why /dev/sd[fghi] and/or /dev/sda2 might not be readable?
Comment 5 Eugene Mah 2013-07-01 15:26:46 EDT
(In reply to Will Woods from comment #4)
> (In reply to Eugene Mah from comment #0)
> > System was previously upgraded from F17->F18 via preupgrade
> 
> ..do you mean fedup? Because F18 doesn't support preupgrade.

Hmm, I'm pretty sure I used preupgrade to go from F17 to F18. That, or I booted from a DVD to upgrade the system. There were several systems I was upgrading at the time, so I may be mixing them up.
The F18 to F19 upgrade was done using fedup.

> 
> Anyway, from the logs:
> 
>   20:57:19 hadron kernel: nvidia: module license 'NVIDIA' taints kernel.
>   20:57:19 hadron kernel: Disabling lock debugging due to kernel taint
> 
> :(
> 
>   20:57:34 hadron lvm[357]: /dev/sdf: open failed: No medium found
>   20:57:45 hadron lvm[357]: /dev/sdg: open failed: No medium found
>   20:57:51 hadron systemd-udevd[383]: worker [393] [..]/sda2 timeout; kill it
>   20:57:51 hadron systemd-udevd[383]: seq 2321 '[..]/sda2' killed
>   20:57:51 hadron systemd-udevd[383]: worker [393] terminated by signal 9
>   20:57:52 hadron lvm[357]: /dev/sdh: open failed: No medium found
>   20:57:52 hadron lvm[357]: /dev/sdi: open failed: No medium found
>   20:57:52 hadron lvm[357]: 1 logical volume(s) in volume group "vg_hadron"
> monitored
>   20:58:44 hadron systemd[1]: Job
> dev-mapper-vg_hadron\x2dlv_home.device/start timed out.
> 
> So: it looks like a bunch of your disks are missing/unreadable, so /home
> can't be mounted, so the upgrade can't start?
> 
> What disks is /home using? Any ideas why /dev/sd[fghi] and/or /dev/sda2
> might not be readable?

The / and /home volumes are on /dev/sda2. I'm not sure why the system was looking for drives on /dev/sd[fghi]. There are drives at /dev/sd[abcd], and an external drive that was assigned /dev/sdk. I'm guessing that lvm thought there should be devices on /dev/sd[fghi]?

The upgrade to F19 had completed (I believe). Ran 'fedup-cli --network 19', watched it download a few thousand packages and then rebooted. Selected the upgrade option at the GRUB menu and watched it start installing the packages previously downloaded. I left it alone to finish installing everything and when I came back, the computer appeared to have rebooted but stopped when the vg_hadron/lv_home volume couldn't be found.

Checking with the lvm utilities, vg_hadron/lv_home was in an inactive state which I think explains why it couldn't be found during the boot process. Reactivating it manually (using lvchange) lets the boot process continue. Why it ended up inactive in the first place is a mystery to me.

Prior to the upgrade, everything was working fine under F18 and I wasn't having any problems with the boot process.
Comment 6 Nickolay Bunev 2013-07-09 18:12:23 EDT
Created attachment 771282
diff -c lvm.conf lvm.conf.rpmnew

I am not sure whether I hit the same bug, but it is at least very similar to this one and to bug 958586. The only difference is that in my case the missing LVM volume groups were not /home or / but just my data partitions.
My system is pretty old; I don't even remember which version was originally installed back in 2007/2008. It has been upgraded to each new version ever since, either via yum or some other tool.
Last updates:
F15 to F17 via preupgrade, F17 to F18 via fedup (without any issues) and F18 to F19 via fedup (network).

After the update and reboot I was dumped to the emergency mode with the same "timed out waiting for device" error. The workaround was to comment the volumes in question in fstab and mount them manually after reboot (and after lvscan / lvchange / vgchange).
I lost 3 hours troubleshooting the problem, playing around with systemd (systemctl, lvm2-monitor.service) and the relevant services mentioned in bug 843587. I thought that it might be related to some problem caused by initramfs or dracut (there are similar older bug reports).
In the end I saw that there is lvm.conf.rpmnew in /etc/lvm dated May, 14. After applying / copying it over to my old lvm.conf my system booted without any problem. I rebooted once back with my old lvm.conf just to confirm that this fixed my issue.
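
As a rough sketch of the fix described above (not an official procedure), the following shell function keeps a copy of the old lvm.conf and then copies lvm.conf.rpmnew over it. The function name and the optional directory argument are made up for illustration; the lvm.conf.old backup name is just the name used in this comment.

```shell
# Replace lvm.conf with lvm.conf.rpmnew, keeping the old file as lvm.conf.old.
# $1: directory holding the files (defaults to /etc/lvm).
# Refuses to run if there is no .rpmnew file or if a backup already exists.
apply_lvm_rpmnew() {
    alr_dir=${1:-/etc/lvm}
    if [ ! -f "$alr_dir/lvm.conf.rpmnew" ]; then
        echo "no lvm.conf.rpmnew in $alr_dir" >&2
        return 1
    fi
    if [ -e "$alr_dir/lvm.conf.old" ]; then
        echo "$alr_dir/lvm.conf.old already exists, not overwriting it" >&2
        return 1
    fi
    # -p preserves mode and timestamps so the backup keeps its original date
    cp -p "$alr_dir/lvm.conf" "$alr_dir/lvm.conf.old" || return 1
    cp -p "$alr_dir/lvm.conf.rpmnew" "$alr_dir/lvm.conf"
}
```

Called as root with no argument it works on /etc/lvm; to go back, copy lvm.conf.old over lvm.conf again.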

I am attaching the diff between the lvm.conf.old and lvm.conf.rpmnew
Comment 7 Will Woods 2013-07-10 13:58:16 EDT
(In reply to Nickolay Bunev from comment #6)
> After the update and reboot I was dumped to the emergency mode with the same
> "timed out waiting for device" error. The workaround was to comment the
> volumes in question in fstab and mount them manually after reboot (and after
> lvscan / lvchange / vgchange).

Which reboot - the one to start the upgrade, or the post-upgrade reboot into F19?

> I lost 3 hours in order to troubleshoot the problem playing around with
> systemd (systemctl, lvm2-monitor.service) and the relevant services mentioned
> in bug 843587. I thought that it might be related to some problem caused by
> initramfs or dracut (there are similar older bug reports)

Sorry for the trouble, but thanks for spending the time to debug the problem!

> In the end I saw that there is lvm.conf.rpmnew in /etc/lvm dated May, 14.
> After applying / copying it over to my old lvm.conf my system booted without
> any problem. I rebooted once back with my old lvm.conf just to confirm that
> this fixed my issue.

Hmm! That's very interesting - good catch!

You said the file is dated May 14, which is the date of the most recent F19 build of lvm2 - so I'm guessing this means that your F18->F19 upgrade completed successfully but the post-upgrade reboot into F19 didn't work. Is that right?

> I am attaching the diff between the lvm.conf.old and lvm.conf.rpmnew

There's definitely some substantial changes here!
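
Much of a raw diff between two lvm.conf files can be comment churn. One way to see only the settings that actually differ is to strip comments and blank lines before diffing. This is a sketch with made-up function names; it treats only whole-line comments as comments, so a trailing "# ..." on a setting line is kept as part of the line.

```shell
# Print only the active (non-comment, non-blank) lines of a config file,
# with leading and trailing whitespace removed.
active_settings() {
    grep -v -E '^[[:space:]]*(#|$)' "$1" \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Show a unified diff of the active settings in two files.
# Returns diff's exit status: 0 if they match, 1 if they differ.
diff_active_settings() {
    das_old=$(mktemp) || return 2
    das_new=$(mktemp) || { rm -f "$das_old"; return 2; }
    active_settings "$1" > "$das_old"
    active_settings "$2" > "$das_new"
    diff -u "$das_old" "$das_new"
    das_status=$?
    rm -f "$das_old" "$das_new"
    return "$das_status"
}
```

For example: `diff_active_settings /etc/lvm/lvm.conf /etc/lvm/lvm.conf.rpmnew`.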

It might be the case that lvm2 needs to merge some of these new settings into your existing /etc/lvm/lvm.conf to ensure your system will still boot.

To the original reporter (Eugene Mah) - do you also have /etc/lvm/lvm.conf.rpmnew?
If you copy that over /etc/lvm/lvm.conf (save a copy of your original /etc/lvm/lvm.conf first!), does your system boot?
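
A quick way to check for leftover package config files after an upgrade is sketched below. The function name is hypothetical; it also lists *.rpmsave files, which RPM leaves behind in the opposite case (the old config was moved aside and the new one installed).

```shell
# List leftover RPM config files (*.rpmnew and *.rpmsave) under a directory.
# $1: directory to search (defaults to /etc).
list_rpm_leftovers() {
    find "${1:-/etc}" -type f \( -name '*.rpmnew' -o -name '*.rpmsave' \) \
        2>/dev/null | sort
}
```

`list_rpm_leftovers /etc/lvm` narrows the search to the LVM configuration directory.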
Comment 8 Eugene Mah 2013-07-10 14:59:17 EDT
(In reply to Will Woods from comment #7)
> 
> To the original reporter (Eugene Mah) - do you also have
> /etc/lvm/lvm.conf.rpmnew?
> If you copy that over /etc/lvm/lvm.conf (save a copy of your original
> /etc/lvm/lvm.conf first!), does your system boot?

unfortunately, during the course of tinkering on my system (unrelated to this bug), I kind of messed things up and had to reinstall so I won't be able to check the lvm.conf file.
Comment 9 Nickolay Bunev 2013-07-10 15:04:53 EDT
(In reply to Will Woods from comment #7)
> (In reply to Nickolay Bunev from comment #6)
> > After the update and reboot I was dumped to the emergency mode with the same
> > "timed out waiting for device" error. The workaround was to comment the
> > volumes in question in fstab and mount them manually after reboot (and after
> > lvscan / lvchange / vgchange).
> 
> Which reboot - the one to start the upgrade, or the post-upgrade reboot into
> F19?
> 

Post-upgrade. I started the upgrade and left it overnight, and it finished without problems. There is nothing in fedup.log or upgrade.log, and the system booted without any problems when I removed the LVM partitions in question from /etc/fstab.
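
The fstab workaround can be sketched as a small filter that comments out the entry for one device. The function name and the device path in the usage line are hypothetical; the function reads fstab text on stdin and writes the result to stdout, so it never edits /etc/fstab itself.

```shell
# Comment out fstab entries whose first field equals the given device.
# $1: device exactly as written in the first fstab column.
# Reads fstab text on stdin and writes the edited text to stdout.
comment_out_fstab_entry() {
    awk -v dev="$1" '$1 == dev { print "#" $0; next } { print }'
}
```

For example, `comment_out_fstab_entry /dev/mapper/vg_data-lv_data < /etc/fstab > /tmp/fstab.new` lets the result be reviewed before it is copied into place.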

> > I lost 3 hours in order to troubleshoot the problem playing around with
> > systemd (systemctl, lvm2-monitor.service) and the relevant services mentioned
> > in bug 843587. I thought that it might be related to some problem caused by
> > initramfs or dracut (there are similar older bug reports)
> 
> Sorry for the trouble, but thanks for spending the time to debug the problem!
> 

No worries. I was a little bit confused about whether I needed to start lvm2-lvmetad.service and lvm2-lvmetad.socket in order to mount my LVM partitions, and I lost some time in that direction.
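
Whether any lvmetad-related setting was among the differences between the two files is not confirmed anywhere in this report. As a sketch for checking a single setting in both files, the helper below prints the value assigned to a named key in an lvm.conf-style file; the function name is made up, and use_lvmetad in the usage line is only an example of a key one might look at.

```shell
# Print the value assigned to a setting in an lvm.conf-style file.
# $1: path to the file, $2: setting name (plain word, no regex characters).
# Commented-out assignments are ignored; a trailing "# ..." comment is dropped.
lvm_setting() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^#]*\).*/\1/p" "$1" \
        | sed 's/[[:space:]]*$//'
}
```

Comparing `lvm_setting /etc/lvm/lvm.conf use_lvmetad` with the same call on lvm.conf.rpmnew would show whether that particular value changed.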

> > In the end I saw that there is lvm.conf.rpmnew in /etc/lvm dated May, 14.
> > After applying / copying it over to my old lvm.conf my system booted without
> > any problem. I rebooted once back with my old lvm.conf just to confirm that
> > this fixed my issue.
> 
> Hmm! That's very interesting - good catch!
> 
> You said the file is dated May 14, which is the date of the most recent F19
> build of lvm2 - so I'm guessing this means that your F18->F19 upgrade
> completed successfully but the post-upgrade reboot into F19 didn't work. Is
> that right?
> 

Right. I have been upgrading this box for quite a long time and I know that my case might be quite different from others.
[root@Pegasus ~]# tune2fs -l /dev/sda1 | grep created
Filesystem created:       Sat Aug 18 05:47:59 2007

Sadly, at the moment I can't check the creation date of lvm.conf.old and I don't have any backup of /etc/lvm, but after applying the changes from the rpmnew file, the issue was resolved.

> > I am attaching the diff between the lvm.conf.old and lvm.conf.rpmnew
> 
> There's definitely some substantial changes here!
> 
> It might be the case that lvm2 needs to merge some of these new settings
> into your existing /etc/lvm/lvm.conf to ensure your system will still boot.
> 
> To the original reporter (Eugene Mah) - do you also have
> /etc/lvm/lvm.conf.rpmnew?
> If you copy that over /etc/lvm/lvm.conf (save a copy of your original
> /etc/lvm/lvm.conf first!), does your system boot?
Comment 10 Stuart Auchterlonie 2014-04-23 15:24:45 EDT
(In reply to Will Woods from comment #7)
> (In reply to Nickolay Bunev from comment #6)
> > In the end I saw that there is lvm.conf.rpmnew in /etc/lvm dated May, 14.
> > After applying / copying it over to my old lvm.conf my system booted without
> > any problem. I rebooted once back with my old lvm.conf just to confirm that
> > this fixed my issue.
> 
> Hmm! That's very interesting - good catch!
> 
> You said the file is dated May 14, which is the date of the most recent F19
> build of lvm2 - so I'm guessing this means that your F18->F19 upgrade
> completed successfully but the post-upgrade reboot into F19 didn't work. Is
> that right?
> 
> > I am attaching the diff between the lvm.conf.old and lvm.conf.rpmnew
> 
> There's definitely some substantial changes here!
> 
> It might be the case that lvm2 needs to merge some of these new settings
> into your existing /etc/lvm/lvm.conf to ensure your system will still boot.
> 
> To the original reporter (Eugene Mah) - do you also have
> /etc/lvm/lvm.conf.rpmnew?
> If you copy that over /etc/lvm/lvm.conf (save a copy of your original
> /etc/lvm/lvm.conf first!), does your system boot?

I've hit this after an F19->F20 upgrade. Replacing lvm.conf with lvm.conf.rpmnew and rebuilding the initrd with dracut sorted this issue. I have both the original and rpmnew lvm.conf files if you would like them :)
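
For the initrd rebuild step, a sketch of the dracut invocation is given below. The comment does not say why the rebuild was needed; presumably the initramfs carries its own copy of lvm.conf, but that is an assumption. The helper (a made-up name) only prints the command for a given kernel version so it can be reviewed before being run as root, and it assumes the usual Fedora image path /boot/initramfs-<version>.img.

```shell
# Print the dracut command that would rebuild the initramfs for a kernel.
# $1: kernel version (defaults to the running kernel, from `uname -r`).
# Assumes the usual Fedora image location /boot/initramfs-<version>.img.
rebuild_initramfs_cmd() {
    ric_kver=${1:-$(uname -r)}
    printf 'dracut --force /boot/initramfs-%s.img %s\n' "$ric_kver" "$ric_kver"
}
```

Running plain `dracut --force` as root does the same thing for the currently running kernel.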
Comment 11 Fedora End Of Life 2015-01-09 13:35:12 EST
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 12 Fedora End Of Life 2015-02-17 10:45:12 EST
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
