Bug 952782

Summary: encrypted /home not activated properly on boot
Product: [Fedora] Fedora Reporter: Kevin Fenzi <kevin>
Component: lvm2 Assignee: Peter Rajnoha <prajnoha>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: agk, bmarzins, bmr, bruno, bzf, dwysocha, heinzm, jonathan, lvm-team, msnitzer, prajnoha, prockai, zkabelac
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: lvm2-2.02.98-7.fc19 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 953867 Environment:
Last Closed: 2013-04-25 14:06:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 953867    
Attachments:
Description               Flags
journal output from boot  none
lvmdump                   none

Description Kevin Fenzi 2013-04-16 16:55:03 UTC
Created attachment 736456
journal output from boot

I have the following setup: 

NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                             8:0    0 119.2G  0 disk  
├─sda1                                          8:1    0   500M  0 part  /boot
└─sda2                                          8:2    0 118.8G  0 part  
  └─luks-8e6752fe-6111-4c54-8310-0c5d7ac1afe9 253:0    0 118.8G  0 crypt 
    ├─fedora-swap                             253:1    0   7.6G  0 lvm   [SWAP]
    ├─fedora-root                             253:2    0  19.5G  0 lvm   /
    └─fedora-home                             253:3    0  91.6G  0 lvm   /home
sr0                                            11:0    1  1024M  0 rom   

On boot I'm correctly prompted for my LUKS password, and the root and swap volumes seem to come up fine, but then the boot hangs trying to activate the /home volume. If I set a timeout, it times out, and I can then manually run vgchange -ay and mount /home fine.

I'll attach a journal from a boot where I let it time out, manually mounted /home, and continued the boot process.

Happy to gather more info or try things.

Comment 1 Peter Rajnoha 2013-04-17 08:55:54 UTC
Does it mount the "home" if you try setting the global/use_lvmetad=0 in /etc/lvm/lvm.conf?
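
For anyone repeating this test, a minimal sketch of flipping that setting follows. It assumes use_lvmetad sits on its own line inside the global section of lvm.conf; the function name disable_lvmetad is made up for illustration, and it only prints the modified file rather than editing anything in place.

```shell
# Print the given lvm.conf with use_lvmetad forced to 0.
# The input file itself is not modified.
# Commented-out lines (starting with '#') are left untouched because the
# pattern only matches lines that begin with optional whitespace.
disable_lvmetad() {
    sed -E 's/^([[:space:]]*)use_lvmetad[[:space:]]*=.*/\1use_lvmetad = 0/' "$1"
}

# Example (review the result before copying it into place as root):
#   disable_lvmetad /etc/lvm/lvm.conf > /tmp/lvm.conf.new
```

Writing to a separate file first keeps the original lvm.conf available if the test needs to be reverted.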

Comment 2 Bruno Wolff III 2013-04-17 14:25:19 UTC
This looks a bit different from what I saw, as my layout was one LUKS device per file system. But it is probably related to the host-only dracut feature and is likely a dracut bug rather than an lvm bug. Probably the lvm device /home is on isn't started because it isn't needed for early boot. But then it still tries to mount /home for some reason, which fails. Maybe /etc/lvm/lvm.conf needs to get filtered, similar to what was done with /etc/crypttab to resolve my issue. You might be able to work around this with the rd.lvm.lv= kernel parameter.
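
As an illustration of the rd.lvm.lv= workaround, the sketch below builds the arguments for the volume group in this report. dracut's documented form is rd.lvm.lv=<vg>/<lv>; the helper name rd_lvm_args is hypothetical, and the VG/LV names are taken from the lsblk output in the description. Because giving any rd.lvm.lv= restricts activation in the initramfs to the listed LVs, root and swap are listed as well.

```shell
# Build a space-separated list of rd.lvm.lv=<vg>/<lv> kernel arguments.
# Usage: rd_lvm_args VG LV [LV...]
rd_lvm_args() {
    vg=$1
    shift
    args=""
    for lv in "$@"; do
        # Append with a separating space, except for the first entry.
        args="${args:+$args }rd.lvm.lv=$vg/$lv"
    done
    printf '%s\n' "$args"
}

# One way to apply the result (as root), assuming grubby manages the
# boot entries on this system:
#   grubby --update-kernel=ALL --args="$(rd_lvm_args fedora root swap home)"
```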

Comment 3 Kevin Fenzi 2013-04-17 16:44:13 UTC
(In reply to comment #1)
> Does it mount the "home" if you try setting the global/use_lvmetad=0 in
> /etc/lvm/lvm.conf?

Yes, it works fine if I set that.

Comment 4 Peter Rajnoha 2013-04-18 12:49:59 UTC
Well, it seems the event-based autoactivation failed for some reason. The autoactivation takes place only if lvmetad is used.

When the timeout happens and you are dropped into a rescue shell, could you please gather the output of the following commands and attach it here:

  systemctl status lvm2-lvmetad.socket lvm2-lvmetad.service

and

  lvmdump -amul

(lvmdump gathers all the LVM info/context and packs it into a tgz)

Comment 5 Peter Rajnoha 2013-04-18 13:20:39 UTC
+ the content of /etc/fstab

Comment 6 Kevin Fenzi 2013-04-18 17:13:38 UTC
lvm2-lvmetad.socket - LVM2 metadata daemon socket
       Loaded: loaded (/usr/lib/systemd/system/lvm2-lvmetad.socket; enabled)
       Active: active (running) since Thu 2013-04-18 11:07:26 MDT; 38s ago
         Docs: man:lvmetad(8)
               man:lvmetad(8)
       Listen: /run/lvm/lvmetad.socket (Stream)


lvm2-lvmetad.service - LVM2 metadata daemon
   Loaded: loaded (/usr/lib/systemd/system/lvm2-lvmetad.service; disabled)
   Active: active (running) since Thu 2013-04-18 11:07:26 MDT; 38s ago
     Docs: man:lvmetad(8)
  Process: 460 ExecStart=/usr/sbin/lvmetad (code=exited, status=0/SUCCESS)
 Main PID: 462 (lvmetad)
   CGroup: name=systemd:/system/lvm2-lvmetad.service
           └─462 /usr/sbin/lvmetad 

Apr 18 11:07:26 jelerak.scrye.com systemd[1]: Started LVM2 metadata daemon.

#
# /etc/fstab
# Created by anaconda on Sun Feb 17 11:38:36 2013
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/fedora-root /                       ext4    defaults,x-systemd.device-timeout=0 1 1
UUID=8dd584df-fa57-4327-85b6-ef5fc931a454 /boot                   ext4    defaults        1 2
/dev/mapper/fedora-home /home                   ext4    defaults,x-systemd.device-timeout=10 1 2
/dev/mapper/fedora-swap swap                    swap    defaults,x-systemd.device-timeout=0 0 0

will attach lvmdump.

Comment 7 Kevin Fenzi 2013-04-18 17:14:27 UTC
Created attachment 737365
lvmdump

Comment 8 Peter Rajnoha 2013-04-19 10:51:26 UTC
OK, thanks for the logs. I've found the problem. It should be fixed with this patch:

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=764195207d4773cf6f1674a2fb16e9a0acda304a

The problem was (copying from the commit message):

Commit 756bcabbfe297688ba240a880bc2b55265ad33f0 fixed autoactivation to not trigger on each uevent for a PV that appeared in the system, most notably the events that are triggered artificially (udevadm trigger or as the result of the WATCH udev rule being applied that consequently generates CHANGE uevents). This fixed a situation in which VGs/LVs were activated when they should not have been.

BUT we still need to care about the coldplug used at boot to retrigger the ADD events - the "udevadm trigger --action=add"!

For non-DM-based PVs, this is already covered as for these we run the autoactivation on ADD event only.

However, for DM-based PVs, we still need to run the autoactivation even for the artificial ADD event, reusing the udev DB content from previous proper CHANGE event that came with the DM device activation.

Simply, this patch fixes a situation in which we run extra "udevadm trigger --action=add" (or echo add > /sys/block/<dev>/uevent) for DM-based PVs (cryptsetup devices, multipath devices, any other DM devices...).

Without this patch, while using lvmetad + autoactivation, any VG/LV that has a DM-based PV and for which we do not call the activation directly, the VG/LV is not activated.

For example a VG with an LV with root FS on it which is directly activated in initrd and then missing activation of the rest of the LVs in the VG because of unhandled uevent retrigger on boot after switching to root FS (the "coldplug").

(The problematic commit mentioned above was not yet released; F19/rawhide carried it as an extra patch on top of the 2.02.98 base.)
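
The rule described in the commit message can be summarized as a small decision function. This is a simplified model for illustration only, not the actual udev rule: the function name and its three arguments are hypothetical, and it assumes a DM device that receives an artificial ADD was already set up by an earlier proper CHANGE event.

```shell
# Simplified model of when autoactivation should run (not the real udev rule).
# Arguments (all hypothetical names):
#   $1  uevent action: add | change
#   $2  1 if the PV sits on a device-mapper device, 0 otherwise
#   $3  1 if this CHANGE uevent accompanies the DM device activation, else 0
# Returns 0 (true) when autoactivation should run.
should_autoactivate() {
    action=$1
    is_dm=$2
    dm_activation=$3
    if [ "$is_dm" = 0 ]; then
        # Non-DM PVs: autoactivate on the ADD event only.
        [ "$action" = add ]
        return
    fi
    case $action in
        # DM PVs: the proper CHANGE event that comes with device activation.
        change) [ "$dm_activation" = 1 ] ;;
        # DM PVs: the coldplug ADD ("udevadm trigger --action=add"),
        # which is the case the patch adds.
        add)    true ;;
        *)      false ;;
    esac
}
```

In this model, the behavior before the patch corresponds to the add branch for DM-based PVs returning false, which matches the missing activation of /home after the coldplug.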

Comment 9 Fedora Update System 2013-04-19 12:50:29 UTC
lvm2-2.02.98-7.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/lvm2-2.02.98-7.fc19

Comment 10 Fedora Update System 2013-04-19 16:51:44 UTC
Package lvm2-2.02.98-7.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing lvm2-2.02.98-7.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-6103/lvm2-2.02.98-7.fc19
then log in and leave karma (feedback).

Comment 11 Fedora Update System 2013-04-25 14:06:09 UTC
lvm2-2.02.98-7.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 12 Marco Nolden 2013-11-17 04:50:49 UTC
This has reappeared with lvm-2.02.103-3.fc20. The workaround in comment 3 does not work for me; however, a manual "pvscan --cache" works.
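
A minimal sketch of that manual recovery from the rescue shell follows. Combining pvscan --cache with the vgchange -ay and mount steps from the original description is an assumption; the helper names run and recover_home, and the DRY_RUN variable, are made up for illustration.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Rescan PVs into lvmetad, activate the LVs, then mount /home.
# Stops at the first step that fails.
recover_home() {
    run pvscan --cache &&   # refresh lvmetad's view of the PVs
    run vgchange -ay &&     # activate all LVs in all VGs
    run mount /home         # mount according to /etc/fstab
}

# Preview the commands without running them:
#   DRY_RUN=1 recover_home
```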