Bug 1026860 - lvm2: services specified with SYSTEMD_WANTS are not started for some LVM volumes
lvm2: services specified with SYSTEMD_WANTS are not started for some LVM volumes
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: lvm2 (Show other bugs)
20
All Linux
medium Severity medium
: ---
: ---
Assigned To: Peter Rajnoha
Fedora Extras Quality Assurance
: Reopened
: 1038917 (view as bug list)
Depends On:
Blocks: 1044543
  Show dependency treegraph
 
Reported: 2013-11-05 09:42 EST by Peter Rajnoha
Modified: 2014-01-31 15:41 EST (History)
34 users (show)

See Also:
Fixed In Version: lvm2-2.02.103-4.fc20
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1044543 (view as bug list)
Environment:
Last Closed: 2014-01-14 20:21:56 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
output of systemctl show dev-block-9:0 and 9:127.device (1.42 KB, text/plain)
2013-11-05 11:31 EST, Clyde E. Kunkel
no flags Details
output from journalctl -xb (180.94 KB, text/plain)
2013-11-05 20:31 EST, Clyde E. Kunkel
no flags Details
journalctl -xb output (334.02 KB, text/plain)
2013-11-07 14:52 EST, Clyde E. Kunkel
no flags Details
journalctl -xb of lvm2-2.02.104-1.fc21 (718.50 KB, text/plain)
2013-11-17 18:51 EST, Clyde E. Kunkel
no flags Details
journalctl -xb with lots of debugging on kernel cmd line (506.61 KB, text/plain)
2013-11-19 13:58 EST, Clyde E. Kunkel
no flags Details
systemd logs (41.21 KB, application/octet-stream)
2013-11-25 09:42 EST, Peter Rajnoha
no flags Details
lsblk per request in Comment 23 (7.44 KB, text/plain)
2013-11-25 13:23 EST, Clyde E. Kunkel
no flags Details
journalctl -xb per request in Comment 22 (445.98 KB, text/plain)
2013-11-25 13:25 EST, Clyde E. Kunkel
no flags Details
udevadm info /dev/md127 (1.48 KB, text/plain)
2013-11-26 14:56 EST, Clyde E. Kunkel
no flags Details
systemctl show /dev/md127 (821 bytes, text/plain)
2013-11-26 14:58 EST, Clyde E. Kunkel
no flags Details
journalctl -b (449.14 KB, text/plain)
2013-11-26 14:59 EST, Clyde E. Kunkel
no flags Details
patch for udev rules which fixes the issue w/o change to systemd (1.46 KB, patch)
2013-12-10 15:06 EST, Zbigniew Jędrzejewski-Szmek
no flags Details | Diff
journalctl -b output (215.45 KB, text/plain)
2014-01-08 09:29 EST, Fabrice Allibe
no flags Details
journalctl -b output with lvm2-2.02.103-5.1 installed (204.00 KB, text/plain)
2014-01-08 16:00 EST, Fabrice Allibe
no flags Details

  None (edit)
Description Peter Rajnoha 2013-11-05 09:42:56 EST
This started in https://bugzilla.redhat.com/show_bug.cgi?id=1023250#c28 and further, creating a new bz as this problem is actually different that the one originally reported in bz 1023250.

Continuing here...

...the actual thing here is that the lvm scan on that device never happens

I've finally managed to reproduce it, I tried to restart several times and it's not 100% reproducible - it seems it's some race. I've two md arrays in my test system, and sometimes one is not processed correctly, sometimes both and sometimes it's just OK, for example in a test run, I've hit this situation where one MD device is processed correctly (9:127) and the other one not (9:0):

[0] f20/~ # systemctl -a | grep lvm2-pvscan
lvm2-pvscan@9:0.service                                                                                        loaded inactive dead      LVM2 PV scan on device 9:0
lvm2-pvscan@9:127.service                                                                                      loaded active   exited    LVM2 PV scan on device 9:127

I've tried to enable systemd debug logging, and I can see:

Nov 04 12:55:33 f20.virt systemd[1]: dev-block-9:127.device changed dead -> plugged
Nov 04 12:55:33 f20.virt systemd[1]: sys-devices-virtual-block-md127.device changed dead -> plugged
Nov 04 12:55:33 f20.virt systemd[1]: Trying to enqueue job lvm2-pvscan@9:127.service/start/fail
Nov 04 12:55:33 f20.virt systemd[1]: Installed new job lvm2-pvscan@9:127.service/start as 351
Nov 04 12:55:33 f20.virt systemd[1]: Enqueued job lvm2-pvscan@9:127.service/start as 351
Nov 04 12:55:33 f20.virt systemd[1]: Starting LVM2 PV scan on device 9:127...
Nov 04 12:55:33 f20.virt systemd[1]: About to execute: /usr/sbin/pvscan --cache --activate ay /dev/block/9:127

Which is for the pvscan on 9:127 MD device and which for which the pvscan was 
triggered and everything worked fine.


However, for the other MD device (the 9:0 one), there's only this info in the systemd log:

Nov 04 12:55:33 f20.virt systemd[1]: dev-block-9:0.device changed dead -> plugged

And nothing else! Now, the udev information for that device is (selected relevant items only):

[0] f20/~ # udevadm info --name=md0
E: MAJOR=9
E: MINOR=0
E: SYSTEMD_ALIAS=/dev/block/9:0
E: SYSTEMD_WANTS=lvm2-pvscan@9:0.service
E: TAGS=:systemd:

So from this side, everything seems to be correct. The only thing is that the lvm2-pvscan@9:0.service was not started for some reason.

CC-ing Michal and Lennart - my question is, why the service is not started and it's kept inactive? Any idea what could be wrong here?

The lvm2-pvscan@.service is defined as:

[0] f20//lib/systemd/system # cat lvm2-pvscan@.service 
[Unit]
Description=LVM2 PV scan on device %i
Documentation=man:pvscan(8)
DefaultDependencies=no
BindsTo=dev-block-%i.device
After=lvm2-lvmetad.socket
Before=shutdown.target
Conflicts=shutdown.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/pvscan --cache --activate ay /dev/block/%i
ExecStop=/usr/sbin/pvscan --cache %i
Comment 1 Peter Rajnoha 2013-11-05 09:45:12 EST
Also, running systemctl start/restart on the instantiated inactive lvm2-pvscan@9:0.service starts it correctly and puts it into active state...
Comment 2 Michal Schmidt 2013-11-05 10:56:04 EST
Please attach the output of "systemctl show dev-block-9:0.device dev-block-9:127.device".
Comment 3 Clyde E. Kunkel 2013-11-05 11:31:42 EST
Created attachment 819884 [details]
output of systemctl show dev-block-9:0 and 9:127.device
Comment 4 Clyde E. Kunkel 2013-11-05 20:31:34 EST
Created attachment 820103 [details]
output from journalctl -xb

Issue still present in F20 beta RC3.  Attached is journalctl output before restarting pvscan service.
Comment 5 Clyde E. Kunkel 2013-11-07 14:52:43 EST
Created attachment 821290 [details]
journalctl -xb output

Issue still exists in F20 Beta RC5.  FWIW, attached complete journalctl log.  Attempts to enhance log information using   systemd.log_level=debug systemd.log_target=journal-or-kmsg from comment 35 in 1023250 causes system to hang at switch root.
Comment 6 Clyde E. Kunkel 2013-11-10 12:28:26 EST
All of the problems I have encountered with this version of LVM2 occur on an installation to single btrfs / subvol on a gpt drive.  The system boots normally until in the fstab, mounts to LVs on a software raid 10 PV are added. Reverting lvm2 to the previous version fixes the problem. 

If RC5 is installed instead to a new LV on the raid 10 PV, totally ignoring the drive with the btrfs filesystems, normal startup occurs.

Should this bz be a final blocker?
Comment 7 Mike Ruckman 2013-11-14 12:40:05 EST
Discussed in 2013-11-14 Blocker Review Meeting [1]. We will wait until the developers can provide more information about the cause and impact of this bug, so we can better decide whether this should block the release.

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2013-11-14/
Comment 8 Peter Rajnoha 2013-11-15 02:49:27 EST
So far, we've hit this problem in "LVM on top of MD" configuration. The outcome is that system could end up waiting for the device to be activated for it to be mounted if there's any mountpoint on it (most notably the separate "/home" mount point which is usually separate from "/" mount point). And this, ends up with unusable system then, unfortunately. The volumes can be activated manually, of course, but that's not acceptable in most cases...

On my test system, this is reproducible in 50% of the time, in original reporter's case, this seems 100% reproducible (bug #1023250 comment #28 and further).
Comment 9 Michal Schmidt 2013-11-15 05:43:48 EST
The udev rule file /usr/lib/udev/rules.d/69-dm-lvm-metad.rules contains the following rule:
    ENV{SYSTEMD_WANTS}="lvm2-pvscan@$major:$minor.service"
The rule may or may not be applied depending on conditions tested earlier in the rule file.
The udev rule file is present also in the initramfs.

The template unit file /usr/lib/systemd/system/lvm2-pvscan@.service is, however, not included in the initramfs.

Hypothesis: What if the device is process by udev and systemd early, while we're still in the initramfs? Obviously the service cannot be run then because the unit file is not known yet. The device unit will still become "plugged" at that point, because its dependency on the service is of type "Wants", so a failure to fulfil the dependency will not block it. Later when we switch to the root filesystem, there is no mechanism that would re-try the start of the service.
Comment 10 Peter Rajnoha 2013-11-15 06:16:10 EST
I've tried rd.break=pre-udev and then I modified the 69-dm-lvm-metad.rules in initramfs directly and removed these three lines:

 83 ENV{SYSTEMD_ALIAS}="/dev/block/$major:$minor"
 84 ENV{ID_MODEL}="LVM PV $env{ID_FS_UUID_ENC} on /dev/$name"
 85 ENV{SYSTEMD_WANTS}="lvm2-pvscan@$major:$minor.service"

But with the same result. So, unfortunately, this is not the problem.

(But yes, we could probably remove those three lines as part of dracut setup as they're not usable at all without lvmetad running - and it's not running in initramfs. So that would be a little cleanup, but it doesn't solve the original problem still...).
Comment 11 Peter Rajnoha 2013-11-15 06:17:56 EST
Interesting thing is that whenever I add "debug" to kernel cmd line, I can't reproduce - must be some race, but it's very hard to track down what's actually wrong.
Comment 12 Clyde E. Kunkel 2013-11-17 18:49:24 EST
From my admittedly non-professional view of this problem, it is caused by a change in LVM2.  Alpha and early beta version of F20, before the LVM2 upgrade worked fine.  No problems.

Just for fun, I installed lvm2-2.02.104-1.fc21 to see what happened.  It fails also.  Attaching the complete journalctl -xb output in case there is any useful information for you gurus.
Comment 13 Clyde E. Kunkel 2013-11-17 18:51:16 EST
Created attachment 825336 [details]
journalctl -xb of lvm2-2.02.104-1.fc21
Comment 14 Peter Rajnoha 2013-11-18 03:14:44 EST
(In reply to Clyde E. Kunkel from comment #12)
> From my admittedly non-professional view of this problem, it is caused by a
> change in LVM2.  Alpha and early beta version of F20, before the LVM2
> upgrade worked fine.  No problems.
> 

Sure, because we needed to change the way the lvm2 autoactivation is done - we need to do that as a service instantiated from udev as otherwise any process forked off the udevd directly could be killed prematurely - that was actually another bug #1011863 (and associated fix https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=546db1c4be55d47d41082a345e15a35055223154).

Thing here is that we can't see a reason why this new scheme fails sometimes in a racy way while it should work. The reason here is that the service is kept inactive - it's instantiated correctly, though not executed for some reason.
Comment 15 Peter Rajnoha 2013-11-18 03:31:05 EST
(In reply to Clyde E. Kunkel from comment #13)
> Created attachment 825336 [details]
> journalctl -xb of lvm2-2.02.104-1.fc21

That may help a bit, thanks! (Though I'm away at the moment, will be back on Thursday only to have a more closer look at the log).
Comment 16 Michal Sekletar 2013-11-18 09:52:36 EST
I am *not* an LVM expert, but from the discussion on IRC I understood that if auto activation of LVs is used, which is afaik our case, then lvmetad should be running even though it is socket activated. As mentioned previously this is not the case when service is triggered by udev rule before switchroot.  Please correct me if I am mistaken and move back to systemd, but in the meantime rerouting to the most appropriate component which is imho lvm2.
Comment 17 Clyde E. Kunkel 2013-11-19 13:58:08 EST
Created attachment 826242 [details]
journalctl -xb with lots of debugging on kernel cmd line

Saw this in the attached file:
Nov 19 13:46:47 P5K-EWIFI.home pvscan[663]: Update of lvmetad failed. This is a serious problem.
Nov 19 13:46:47 P5K-EWIFI.home pvscan[667]: Update of lvmetad failed. This is a serious problem.

The attached file is journalctl up to time of failure.

This version of LVM2 is the 103-3 version currently in F20.
Comment 18 Adam Williamson 2013-11-20 13:17:00 EST
Discussed at 2013-11-20 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-11-20/f20-blocker-review.2013-11-20-17.00.log.txt . This looks worrying, but at the same time, we know it doesn't hit all LVM-on-RAID installs, so we're not yet sure whether it's widespread enough to merit blocker status. Peter, LVM devs - it'd help a lot if we could get some information / educated guesses on how common this issue is likely to be.
Comment 19 Clyde E. Kunkel 2013-11-23 14:13:17 EST
Hi, any progress in fixing this?

TIA
Comment 20 Peter Rajnoha 2013-11-25 03:34:12 EST
(In reply to Clyde E. Kunkel from comment #19)
> Hi, any progress in fixing this?
> 
> TIA

...I've just returned back today, I'll have a more closer look at the log finally and I'll let you informed about any more findings/fixes. This bug is on my priority list at the moment.
Comment 21 Peter Rajnoha 2013-11-25 05:50:39 EST
(In reply to Michal Sekletar from comment #16)
> I am *not* an LVM expert, but from the discussion on IRC I understood that
> if auto activation of LVs is used, which is afaik our case, then lvmetad
> should be running even though it is socket activated. As mentioned
> previously this is not the case when service is triggered by udev rule
> before switchroot.  Please correct me if I am mistaken and move back to
> systemd, but in the meantime rerouting to the most appropriate component
> which is imho lvm2.

Well, based on the log:

2902 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: initrd-switch-root.service changed start -> dead
2903 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Failed to destroy cgroup /system.slice/initrd-switch-root.service: No such file or directory
2904 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Job initrd-switch-root.service/stop finished, result=done
2905 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopped Switch Root.
2906 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopping Switch Root.
2907 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: initrd-switch-root.target changed active -> dead
2908 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Job initrd-switch-root.target/stop finished, result=done
2909 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopped target Switch Root.
2910 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopping Initrd File Systems.
2911 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: initrd-fs.target changed active -> dead
2912 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Job initrd-fs.target/stop finished, result=done
2913 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopped target Initrd File Systems.
2914 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopping Initrd Root File System.
2915 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: initrd-root-fs.target changed active -> dead
2916 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Job initrd-root-fs.target/stop finished, result=done
2917 Nov 19 13:37:27 P5K-EWIFI.home systemd[1]: Stopped target Initrd Root File System.

...

3493 Nov 19 13:37:28 P5K-EWIFI.home systemd[1]: Trying to enqueue job lvm2-pvscan@8:34.service/start/fail
3494 Nov 19 13:37:28 P5K-EWIFI.home systemd[1]: Installed new job lvm2-pvscan@8:34.service/start as 256
3495 Nov 19 13:37:28 P5K-EWIFI.home systemd[1]: Enqueued job lvm2-pvscan@8:34.service/start as 256
3496 Nov 19 13:37:28 P5K-EWIFI.home kernel: Linux video capture interface: v2.00
3497 Nov 19 13:37:28 P5K-EWIFI.home systemd[1]: Starting LVM2 PV scan on device 8:34...
3498 Nov 19 13:37:28 P5K-EWIFI.home kernel: md: bind<sdc3>
3499 Nov 19 13:37:28 P5K-EWIFI.home systemd[1]: About to execute: /usr/sbin/pvscan --cache --activate ay /dev/block/8:34


The very first lvm2-pvscan service is run after switch-root.

Also, the MD device's device unit appears plugges *after* switch-root as well:

4307 Nov 19 13:37:35 P5K-EWIFI.home systemd[1]: dev-block-9:127.device changed dead -> plugged

Though, I can't see any "lvm2-pvscan@9:127" being instantiated - and no reason logged.
Comment 22 Peter Rajnoha 2013-11-25 05:57:44 EST
(In reply to Clyde E. Kunkel from comment #17)
> Created attachment 826242 [details]
> journalctl -xb with lots of debugging on kernel cmd line
> 
> Saw this in the attached file:
> Nov 19 13:46:47 P5K-EWIFI.home pvscan[663]: Update of lvmetad failed. This
> is a serious problem.
> Nov 19 13:46:47 P5K-EWIFI.home pvscan[667]: Update of lvmetad failed. This
> is a serious problem.
> 
> The attached file is journalctl up to time of failure.
> 
> This version of LVM2 is the 103-3 version currently in F20.

Could you please attach full log (including those failures mentioned above)? The attached log is only till:

Nov 19 13:39:26 P5K-EWIFI.home systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local

From the previous log from Nov 07, I can see that the lvm2-lvmetad.service is stopped which it shouldn't be!

Nov 07 14:26:16 P5K-EWIFI.home systemd[1]: Stopping LVM2 metadata daemon...

Which could end up with the pvscan errors found as it can't communicate with lvmetad anymore.

But let's have a fresh full log, please. Thanks!
Comment 23 Peter Rajnoha 2013-11-25 05:59:54 EST
Also, could you please attach the output of "lsblk" command to see the whole tree with base devices + MD.
Comment 24 Peter Rajnoha 2013-11-25 06:02:19 EST
(...do the lsblk call as well as journalctl -xb call at the same boot time so we can match them)
Comment 25 Peter Rajnoha 2013-11-25 09:42:50 EST
Created attachment 828712 [details]
systemd logs

So I've discussed this with systemd guys directly, showed them the failing system and it seems this is indeed a systemd issue, so changing the BZ back to systemd component. I'm attaching some logs to help them to debug. If there's anything needed, let me know, please.

I still need to see the logs from Clyde to investigate the "Update of lvmetad failed." thing seen in the logs from comment #17 - but that seems to be a different issue, don't know if it's related yet or not to the problem I can reproduce myself and which causes the lvm2-pvscan@.service not being run by systemd...

Anyway, my logs are attached for systemd guys to investigate, if there's anything else you need, just let me know...
Comment 26 Václav Pavlín 2013-11-25 09:58:19 EST
Kay, could you please take a look at this issue? We think it's some race in udev - all units are created, but lvm2-pvscan@$major:$minor is not started (sometimes).
Comment 27 Michal Sekletar 2013-11-25 12:18:58 EST
I'd like to do some debugging with modified version of systemd and I want to be sure I have exactly the same setup. Can you please provision some machine in beaker with the kickstart you have from your installation?
Comment 28 Clyde E. Kunkel 2013-11-25 13:23:07 EST
Created attachment 828793 [details]
lsblk per request in Comment 23

Taken after systemd entered emergency shell and before rescanning /dev/md127
Comment 29 Clyde E. Kunkel 2013-11-25 13:25:05 EST
Created attachment 828798 [details]
journalctl -xb per request in Comment 22

Taken after systemd entered emergency shell and before # systemctl start lvm2-pvscan@9:127
Comment 30 Peter Rajnoha 2013-11-26 03:11:51 EST
(In reply to Michal Sekletar from comment #27)
> I'd like to do some debugging with modified version of systemd and I want to
> be sure I have exactly the same setup. Can you please provision some machine
> in beaker with the kickstart you have from your installation?

...well, I don't have any special kickstart - it's just usual F20 installation, then after installation I added some disks and then I created LVM on MD RAID, for example:

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
  pvcreate /dev/md0
  vgcreate vg /dev/md0
  lvcreate -l1 vg
Comment 31 Peter Rajnoha 2013-11-26 04:22:47 EST
(In reply to Clyde E. Kunkel from comment #29)
> Created attachment 828798 [details]
> journalctl -xb per request in Comment 22
> 
> Taken after systemd entered emergency shell and before # systemctl start
> lvm2-pvscan@9:127

...this log does not exhibit the "Update of lvmetad failed." anymore. Then it's exactly the same problem I can reproduce and which is commented/logged in comment #25.

From your log:

Nov 25 08:13:27 P5K-EWIFI systemd[1]: dev-block-9:127.device changed dead -> plugged

But the lvm2-pvscan@9:127.service is instantiated, but not activated by systemd - there's no line like "Trying to enqueue job lvm2-pvscan@9:127.service" which is seen for other (non-MD) PVs. The same problem I can reproduce then.
Comment 32 Lennart Poettering 2013-11-26 11:24:32 EST
Please provide the following consistent data of a device that didn't get pvscan started like this:

udevadm info /dev/....
systemctl show /dev/...
journalctl -b

Note that we explicitly exclude /dev/block/xx:yy and /dev/char/xx:yy from getting picked up by systemd. It's wrong taping over that.
Comment 33 Clyde E. Kunkel 2013-11-26 14:56:40 EST
Created attachment 829446 [details]
udevadm info /dev/md127

udevadm info /dev/md127 taken at time of systemd dropping into shell and before systemctl restart lvm2-pvscan@9:127
Comment 34 Clyde E. Kunkel 2013-11-26 14:58:13 EST
Created attachment 829448 [details]
systemctl show /dev/md127

systemctl show /dev/md127 taken at time of systemd dropping into shell and before systemctl restart lvm2-pvscan@9:127
Comment 35 Clyde E. Kunkel 2013-11-26 14:59:32 EST
Created attachment 829449 [details]
journalctl -b

journalctl -b taken at time of systemd dropping into shell and before systemctl restart lvm2-pvscan@9:127
Comment 36 Lennart Poettering 2013-11-26 17:35:55 EST
Hmm, please also provide the "systemctl status" output of "sys-devices-virtual-block-md127.device", i.t. the device unit named after the sysfs path of the device.
Comment 37 Clyde E. Kunkel 2013-11-26 21:09:02 EST
(In reply to Lennart Poettering from comment #36)
> Hmm, please also provide the "systemctl status" output of
> "sys-devices-virtual-block-md127.device", i.t. the device unit named after
> the sysfs path of the device.

sys-devices-virtual-block-md127.device - LVM PV M1iC6T-Xyg3-rQ22-D0Md-EZ8W-5qWx-WBDUkE on /dev/md127
   Loaded: loaded
   Active: active (plugged) since Tue 2013-11-26 16:01:53 EST; 2min 47s ago
   Device: /sys/devices/virtual/block/md127
Comment 38 Peter Rajnoha 2013-11-27 04:11:06 EST
(In reply to Lennart Poettering from comment #32)
> Note that we explicitly exclude /dev/block/xx:yy and /dev/char/xx:yy from
> getting picked up by systemd. It's wrong taping over that.

Hmm, what's wrong with /dev/block/xx:yy?
Comment 39 Lennart Poettering 2013-11-27 08:51:21 EST
Clyde, sorry, I meant "systemctl show sys-devices-virtual-block-md127.device".
Comment 40 Lennart Poettering 2013-11-27 08:53:33 EST
(In reply to Peter Rajnoha from comment #38)
> (In reply to Lennart Poettering from comment #32)
> > Note that we explicitly exclude /dev/block/xx:yy and /dev/char/xx:yy from
> > getting picked up by systemd. It's wrong taping over that.
> 
> Hmm, what's wrong with /dev/block/xx:yy?

Major/minor suck as identifiers. Use the kernel device name+subsystem name or anything else. The major/minor links aren't even exposes in libudev as valid symlinks.
Comment 41 Peter Rajnoha 2013-11-27 09:01:36 EST
(In reply to Lennart Poettering from comment #40)
> (In reply to Peter Rajnoha from comment #38)
> > (In reply to Lennart Poettering from comment #32)
> > > Note that we explicitly exclude /dev/block/xx:yy and /dev/char/xx:yy from
> > > getting picked up by systemd. It's wrong taping over that.
> > 
> > Hmm, what's wrong with /dev/block/xx:yy?
> 
> Major/minor suck as identifiers. Use the kernel device name+subsystem name
> or anything else. The major/minor links aren't even exposes in libudev as
> valid symlinks.

Actually I used dm-X (the $devname - kernel name) at first, but then I hit the problem with the escaping :) See bug #1020887. So I can't use that unfortunately...

The only things I have in udev is $devname and $major/$minor. The symlinks in udev is stored as a list in $links which is unusable here, of course. So the only remaining thing to use was $major:$minor.

If /dev/block/major:minor is not valid, then let's not expose that in /dev at all...
Comment 42 Peter Rajnoha 2013-11-27 09:03:37 EST
(In reply to Lennart Poettering from comment #40)
> (In reply to Peter Rajnoha from comment #38)
> > (In reply to Lennart Poettering from comment #32)
> > > Note that we explicitly exclude /dev/block/xx:yy and /dev/char/xx:yy from
> > > getting picked up by systemd. It's wrong taping over that.
> > 
> > Hmm, what's wrong with /dev/block/xx:yy?
> 
> Major/minor suck as identifiers. Use the kernel device name+subsystem name
> or anything else. The major/minor links aren't even exposes in libudev as
> valid symlinks.

Where is this documented that /dev/block/xx:yy sucks?
Comment 43 Peter Rajnoha 2013-11-27 09:22:50 EST
Also, the interesting thing here is that it works sometimes - there's just a race of some kind there. So it's not that /dev/block/xx:yy would not work completely.
Comment 44 Clyde E. Kunkel 2013-11-27 09:53:08 EST
(In reply to Lennart Poettering from comment #39)
> Clyde, sorry, I meant "systemctl show
> sys-devices-virtual-block-md127.device".

Id=sys-devices-virtual-block-md127.device
Names=sys-devices-virtual-block-md127.device
Wants=lvm2-pvscan@9:127.service
Description=LVM PV M1iC6T-Xyg3-rQ22-D0Md-EZ8W-5qWx-WBDUkE on /dev/md127
LoadState=loaded
ActiveState=active
SubState=plugged
InactiveExitTimestamp=Wed 2013-11-27 04:42:14 EST
InactiveExitTimestampMonotonic=16828251
ActiveEnterTimestamp=Wed 2013-11-27 04:42:14 EST
ActiveEnterTimestampMonotonic=16828251
ActiveExitTimestampMonotonic=0
InactiveEnterTimestampMonotonic=0
CanStart=no
CanStop=no
CanReload=no
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureIsolate=no
IgnoreOnIsolate=yes
IgnoreOnSnapshot=yes
NeedDaemonReload=no
JobTimeoutUSec=1min 30s
ConditionTimestampMonotonic=0
ConditionResult=no
Transient=no
SysFSPath=/sys/devices/virtual/block/md127
Comment 45 Adam Williamson 2013-11-27 13:50:25 EST
Discussed at 2013-11-27 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-11-27/f20-blocker-review-3.2013-11-27-17.01.log.txt . It's still a bit difficult to tell how serious of an issue this is, but it looks like it's affecting a large proportion of LVM-on-RAID installs. That's a supported configuration (and indeed it's what anaconda will give you by default for an install to Intel firmware RAID), so this is accepted as a blocker per Alpha criterion "A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility." in the case where you do an LVM-on-RAID install.

If anyone has a reason this should not be considered a blocker, please advise and we can re-consider it.
Comment 46 Peter Rajnoha 2013-11-29 10:09:51 EST
(In reply to Lennart Poettering from comment #40)
> Major/minor suck as identifiers. Use the kernel device name+subsystem name
> or anything else. The major/minor links aren't even exposes in libudev as
> valid symlinks.

I've tried the $devnode from udev (MD doesn't have "-" in its name, so we don't run into a siatuation like described in bug #1020887). Unfortunately, the same result as while using $major:$minor - the service is instantiated, but stays inactive and no log entry around...
Comment 47 Clyde E. Kunkel 2013-12-03 16:34:44 EST
F20 TC4 still exhibits this issue.  Thx.
Comment 48 Peter Rajnoha 2013-12-04 06:32:26 EST
Lennart, Kay, Lukas, Michal or anyone else from systemd team - would it be possible to provide a temporary test build with more debug messages in for the code that handles the SYSTEMD_WANTS udev rule and then the way the systemd service is instantiated/run based on that - I think if we added more debugging info in these parts of the code, we should be able to track where the source of the problem is - just some printf of the state in various steps etc.

I don't see into systemd code in so much detail so it would be better if you could do the test build with those extra debug messages, thanks!
Comment 49 Peter Rajnoha 2013-12-04 06:50:04 EST
As for LVM, we can't go back - we can't disable lvmetad use+autoactivation and also we can't go to the older and inappropriate mechanism of running the PV scan where we run pvscan in background from udev processing as that could cause the pvscan to be killed when uevent processing finishes, also from udev man page:

RUN{type}
  ...the forked processes, detached or not, will be unconditionally killed after the event handling has finished.

IOW, we need to run a background job based on uevent from udev - the SYSTEMD_WANTS fits that perfectly with its service instantiation, but we need to track down why systemd does instantiate the service, but it does not execute it.

We need to debug the SYSTEMD_WANTS functionality!
Comment 50 Clyde E. Kunkel 2013-12-04 12:00:39 EST
FWIW, updated systemd to systemd-208-7.fc20.x86_64.  No luck.
Comment 51 Zbigniew Jędrzejewski-Szmek 2013-12-05 01:26:32 EST
Please test with this scratch build:

http://koji.fedoraproject.org/koji/taskinfo?taskID=6258581

It contains a crude fix, which generates some warning messages, but seems to fix the issue in my test rig. If this helps, then I'd prepare a proper update tomorrow (meaning still on Thursday :)).
Comment 52 Peter Rajnoha 2013-12-05 03:45:38 EST
(In reply to Zbigniew Jędrzejewski-Szmek from comment #51)
> Please test with this scratch build:
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=6258581
> 

OK, I've rebooted the machine about 20 times where this was clearly reproducible before and I haven't hit the issue yet - seems it's gone!
Comment 53 Peter Rajnoha 2013-12-05 03:49:56 EST
Clyde, can you confirm?
Comment 54 Jaroslav Reznik 2013-12-05 07:16:58 EST
Perfect! Would be really nice to hear from Clyde to confirm fix too but to speed up the process, I'd step forward and do the proper build asap.
Comment 55 Clyde E. Kunkel 2013-12-05 09:56:29 EST
Let me get some coffee in me.  Getting test rig setup now.  Standby.
Comment 56 Clyde E. Kunkel 2013-12-05 10:39:02 EST
OK, 3 successful reboots.  And considering I was hitting this 100% of the time I would say fixed.  

Thank you very much lvm and systemd folks.

BTW, would you mind telling me the fix?  Thanks.
Comment 57 Fedora Update System 2013-12-05 17:39:43 EST
systemd-208-9.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/FEDORA-2013-22704/systemd-208-9.fc20
Comment 58 Zbigniew Jędrzejewski-Szmek 2013-12-05 17:56:55 EST
(In reply to Clyde E. Kunkel from comment #56)
> OK, 3 successful reboots.  And considering I was hitting this 100% of the
> time I would say fixed.  
> 
> Thank you very much lvm and systemd folks.
> 
> BTW, would you mind telling me the fix?  Thanks.
It's http://kawka.in.waw.pl/git/systemd/commit/043a559ff37 and http://kawka.in.waw.pl/git/systemd/commit/bd819c6a6. Basically, the units specified in SYSTEMD_WANTS would only be started if they were loaded before. So in case of those pvscan@.service's, systemd would attempt to load some of them in the initramfs, and fail, because of https://#c9. Then the system instance would inherit the stub during switch-root and actually successfully load the unit from disk, and when it would detect the device, those units would be run successfully. OTOH, when the device wasn't detected in the initramfs, the error in https://#c9 wouldn't manifest, and those units would be loaded, but not started.

The second patch fixes the issue that now also systemd --user instances would try to load lvm2-pvscan@.service's, resulting in an error message in the logs. The fix in mainline git is a bit different, because Lennart reworked some of the code recently, and the second patch is not needed.

It'd would be great if you could test the new update and provide karma.
Comment 59 Zbigniew Jędrzejewski-Szmek 2013-12-05 18:16:06 EST
Comment #9 deserves further investigation.
Comment 60 Fedora Update System 2013-12-06 00:56:26 EST
systemd-208-9.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 61 Peter Rajnoha 2013-12-06 03:15:29 EST
(In reply to Zbigniew Jędrzejewski-Szmek from comment #59)
> Comment #9 deserves further investigation.

There's a new patch for dracut to exclude the lines in initramfs that set SYSTEMD_WANTS:

http://git.kernel.org/cgit/boot/dracut/dracut.git/commit/?id=297b8edc050715d563d54cfa34e42bf5909b9a7b

But when I tried this, the error was still reproducible!
Comment 62 Peter Rajnoha 2013-12-06 03:23:12 EST
(In reply to Peter Rajnoha from comment #61)
> (In reply to Zbigniew Jędrzejewski-Szmek from comment #59)
> > Comment #9 deserves further investigation.
> 
> There's a new patch for dracut to exclude the lines in initramfs that set
> SYSTEMD_WANTS:
> 
> http://git.kernel.org/cgit/boot/dracut/dracut.git/commit/
> ?id=297b8edc050715d563d54cfa34e42bf5909b9a7b
> 
> But when I tried this, the error was still reproducible!

Also, the MD devices were activate after pivot to root fs, so dracut did not even had those MD devices present...
Comment 63 Andy Wang 2013-12-06 10:02:55 EST
*** Bug 1038917 has been marked as a duplicate of this bug. ***
Comment 64 Adam Williamson 2013-12-09 14:46:17 EST
prajnoha: we're a little confused about exactly what the implications of c#62 are. Are you saying this bug is not actually fixed, or is that just some related issue in dracut? If there is a bug outstanding, is it one we need to consider blocking F20 on? Thanks!
Comment 65 Shawn Starr 2013-12-09 15:03:21 EST
I am using latest dracut and my issue is fixed:

http://koji.fedoraproject.org/koji/buildinfo?buildID=483057

Grab the respective dracut RPMs you have and dnf/yum update them.

Rebuild your initramfs with mkinitrd/dracut and try it.
Comment 66 Peter Rajnoha 2013-12-10 03:57:01 EST
(In reply to Adam Williamson from comment #64)
> prajnoha: we're a little confused about exactly what the implications of
> c#62 are. Are you saying this bug is not actually fixed, or is that just
> some related issue in dracut? If there is a bug outstanding, is it one we
> need to consider blocking F20 on? Thanks!

The thing with dracut is a different issue - bug #1038838. That bug was supposed to be an issue if there's an LVM VG on MD and this VG contains the LV with root fs on it besides any other LV with a mount point that is mounted after initramfs. But the problem was actually never reproduced. For example I tested this layout:

[0] f20/~ # lsblk /dev/vdb /dev/vdc
NAME                  MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
vdb                   252:16   0   11G  0 disk  
`-md0                   9:0    0   11G  0 raid1 
  |-fedora_md-root    253:1    0    9G  0 lvm   /
  |-fedora_md-swap    253:2    0  256M  0 lvm   
  `-fedora_md-test_lv 253:4    0  1.8G  0 lvm   /mnt/test
vdc                   252:32   0   11G  0 disk  
`-md0                   9:0    0   11G  0 raid1 
  |-fedora_md-root    253:1    0    9G  0 lvm   /
  |-fedora_md-swap    253:2    0  256M  0 lvm   
  `-fedora_md-test_lv 253:4    0  1.8G  0 lvm   /mnt/test

Which means that root fs is on fedora_md-root LV - the md0 as well as the root LV is activated in initramfs. The fedora_md-test_lv is activated later - this was supposed to make problems - though it *does not* actually. I've tested this with dracut-034-19.git20131021.fc20.x86_64 (which does not have the patch from bug #1038838 yet!) So the fix for bug #1038838 is for extra safety only, I'd say. I'm not sure if anyone hit that problem in real at all, I can't reproduce that myself. Comment #65 from Shawn proves that the latest dracut fixes the issue - but I haven't inspected his problem (this is his first post here) - so I'm not sure if Shawn is talking about the same problem as originally reported here in this bug or not.

Anyway, the patch for bug #1038838 is included within this update already:

https://admin.fedoraproject.org/updates/FEDORA-2013-22831/dracut-034-64.git20131205.fc20

(though it does not reference the bug #1038838 :))

So, in summary, we need to make sure we have these in f20 final:

  systemd >= 208-9.fc20
  dracut >= 034-62.git20131205.fc20
Comment 67 Lennart Poettering 2013-12-10 14:01:51 EST
I have reverted the upstream patch now. If the dependency logic doesn't work correctly we shouldn't overtape that by queuing independent jobs here. We really should fix the dependency logic here instead.

Hmm, I have the suspicion that the devices in question simply don't set the SYSTEMD_READY=0 property in the right cases.

Note that systemd will only act on SYSTEMD_WANTS when the device first shows up. It will not act on it if the device already exists and just gets a "changed" udev event. That's intended that way.

There are some devices which when first showing up are not fully set up. By setting SYSTEMD_READY=0 on them udev rules can tell systemd to not pick them up just yet. Or in other words: in systemd's eyes a device which as SYSTEMD_READY=0 set is treated as if it didn't exist at all. Only devices with either SYSTEMD_READY=1 or no SYSTEM_READY property at all are considered present.

For LVM this means that it should set SYSTEMD_READY=0 as long as the device is not fully set up. Only when that's done it should set SYSTEMD_READY=1 (or unset it entirely). Only when the transition from SYSTEMD_READY=0 → SYSTEMD_READY=1 happens systemd will look at SYSTEMD_WANTS. Hence make sure to set that during the same change event.
Comment 68 Lennart Poettering 2013-12-10 14:04:29 EST
So, the question here is, what does LVM do with SYSTEMD_WANTS and SYSTEMD_READY currently, and what udev events does it generate with which properties set in which order?
Comment 69 Zbigniew Jędrzejewski-Szmek 2013-12-10 15:06:41 EST
Created attachment 834908 [details]
patch for udev rules which fixes the issue w/o change to systemd

(In reply to Lennart Poettering from comment #67)
> I have reverted the upstream patch now.
Yes, I agree that this is the correct thing to do.

It seems that a simple change in LVM fixes the problem too. Attached patch makes things work for me. If we agree that this is the right thing to do, we'll have to release systemd and lvm in a coordinated manner.
Comment 70 Zbigniew Jędrzejewski-Szmek 2013-12-10 15:09:02 EST
I'm reassigning to LVM, and lowering the priority, since we do have a temporary fix in place for F20 release.
Comment 71 Jóhann B. Guðmundsson 2013-12-10 16:06:56 EST
(In reply to Zbigniew Jędrzejewski-Szmek from comment #70)
> I'm reassigning to LVM, and lowering the priority, since we do have a
> temporary fix in place for F20 release.

With my QA hat on and based on your assessment in c69 it would be better that we release with both the updates that contain the fix in lvm and systemd in place in GA since mirrors dont care if you "release systemd and lvm in a coordinated manner." and our end users base might end up only one of the components instead of both...
Comment 72 Adam Williamson 2013-12-10 16:59:48 EST
johann: if an update is submitted that contains both packages, they should go out together. we only get problems when updates for interdependent packages are submitted separately.

we *could* change the fix for final, though, we haven't built an RC yet. lennart, zbigniew, is there any reason to try and include the 'improved' fix rather than the systemd-side workaround in f20 final, or is it ok to send the improved fix out as an update?
Comment 73 Zbigniew Jędrzejewski-Szmek 2013-12-10 18:39:48 EST
(In reply to Adam Williamson from comment #72)
> we *could* change the fix for final, though, we haven't built an RC yet.
> lennart, zbigniew, is there any reason to try and include the 'improved' fix
> rather than the systemd-side workaround in f20 final, or is it ok to send
> the improved fix out as an update?
I think that the two approaches should be equivalent.
Comment 74 Jóhann B. Guðmundsson 2013-12-10 19:14:48 EST
(In reply to Adam Williamson from comment #72)
> johann: if an update is submitted that contains both packages, they should
> go out together. we only get problems when updates for interdependent
> packages are submitted separately.

Adam; You know as well as idea there is plethora of things that can go wrong in the update process and you do realize that the affected user base will be reduced to just those that upgrade and any reporters that might have installed any of the pre-releases if we releases GA with those fixes vs those + all that perform fresh install anyway let's just discuss this on the next blocker bug meeting.
Comment 75 Adam Williamson 2013-12-10 19:19:19 EST
I'm not aware of any 'plethora of things that can go wrong' with pushing a single update containing multiple packages stable, no. I'm aware of lots of problems with multiple inter-related updates, though, and the case where there's an update for foo-1.1-1 and bar-1.1-1, and separate updates for foo-1.0-2 and bar-1.0-2. all sorts of fun can happen there. but if you keep it to a single coherent update, no, I'm not aware of any big issues we could run into.
Comment 76 Peter Rajnoha 2013-12-11 06:37:20 EST
(In reply to Lennart Poettering from comment #67)
> Note that systemd will only act on SYSTEMD_WANTS when the device first shows
> up. It will not act on it if the device already exists and just gets a
> "changed" udev event. That's intended that way.
> 
> There are some devices which when first showing up are not fully set up. By
> setting SYSTEMD_READY=0 on them udev rules can tell systemd to not pick them
> up just yet. Or in other words: in systemd's eyes a device which as
> SYSTEMD_READY=0 set is treated as if it didn't exist at all. Only devices
> with either SYSTEMD_READY=1 or no SYSTEM_READY property at all are
> considered present.
> 
> For LVM this means that it should set SYSTEMD_READY=0 as long as the device
> is not fully set up. Only when that's done it should set SYSTEMD_READY=1 (or
> unset it entirely). Only when the transition from SYSTEMD_READY=0 →
> SYSTEMD_READY=1 happens systemd will look at SYSTEMD_WANTS. Hence make sure
> to set that during the same change event.

Aha! But in that case, this is mdadm area where the SYSTEMD_READY should be set appropriately (and exactly the same applies for loop devices). For MD, the device is ready after a CHANGE event that sets up the array and completes it, not the ADD event.

Looking at the 63-md-arrays.rules, I can see that SYSTEMD_READY is already set:
   ACTION!="add|change", GOTO="md_end"
   ATTR{md/array_state}=="|clear|inactive", ENV{SYSTEMD_READY}="0",GOTO="md_end"

BUT, this has one problem which is that the ENV{SYSTEMD_READY} is assigned no matter what the event is! To properly follow the scheme:
   - device created, but not yet set up - ADD event generated
   - device set up and ready to use - CHANGE event generate

...we need to also check for the event type when checking the ATTR{md/array_state}, otherwise we're getting into a race!

Normally, SYSTEMD_READY=0 should be set on ADD event. There's only one exception when the device should be cosidered ready and that is the artificial ADD event that is generated anytime later *after* the device is set up.

So briefly:

  - device created, not yet set up - ADD event generated, SYSTEMD_READY=0
  - device set up, ready, CHANGE event generated, SYSTEMD_READY=1
  ...
  - udevadm trigger --action=add, ADD event generated, SYSTEMD_READY=1

The race here is in step 1 exactly. Since the ATTR{md/array_state} test does not care about the type of the event, it's possible that at the time when the test is done, the device has just been set up and we detect that md/array_state tells us the device is ready. But we're just processing the first ADD event!

So we should avoid this:

  - device created, ADD event generated notifying about device created and not yet set up, SYSTEMD_READY=1 (just because the udev processing is slow and asynchronous and at the time of the test, the device is already set up - but for clean solution, udev should consider this device to be ready only on processing that CHANGE event otherwise we're mixing things in non-deterministic way then)

Exactly the same applies for loop devices where there's ADD event on loop dev creation and CHANGE event on binding loop with a concrete file.

And exactly the same applies for DM/LVM devices where there's ADD event on device creation (but no tables loaded yet and so the device is unusable) and CHANGE event when the tables are loaded and the device is made usable. As for LVM/DM - we've already solved this long time ago by providing DM_UDEV_DISABLE_OTHER_RULES_FLAG for other rules to check whether the DM/LVM device is ready and can be accessed (also used to avoid touching private/temporary devices). Systemd already checks that in 99-systemd.rules:

SUBSYSTEM=="block", KERNEL!="ram*", ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}=="1", ENV{SYSTEMD_READY}="0"

We also need something similar for MD and loop to avoid such races as described above...

Anyway, for now, I'll just take the patch from comment #69 as we don't have time to integrate a better solution (which is to resolve this in the rules directly processing md and loop devices...and whatever else is there that has more complex initialization scheme). We can add that later as separate updates for particular packages then.
Comment 78 Peter Rajnoha 2013-12-11 06:42:42 EST
As for the loop device, I don't see any specific rules to include a fix for it, but I guess it could be 60-persistent-storage.rules (which is systemd/udev :) ).
Comment 79 Peter Rajnoha 2013-12-11 06:49:47 EST
IOW, we need something like this:

https://git.fedorahosted.org/cgit/lvm2.git/tree/udev/69-dm-lvm-metad.rules.in

But placed in the rules where it actually belongs.

"...special device listed below, scan only if the device is
# properly activated. These devices are not usable after an ADD event,
# but they require an extra setup and they are ready after a CHANGE event.
# Also support coldplugging with ADD event but only if the device is already
# properly activated."

The exact problem that arised here in this bug is then caused by desynchronization of the assumption of when the device is considered *usable* during event processing!
Comment 82 Peter Rajnoha 2013-12-11 07:57:29 EST
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=4c267c7286145165dfe078f77d18d194a21a2e1c

(I've removed the SYSTEMD_READY="0" for DM devs, this is already covered by DM_UDEV_DISABLE_OTHER_RULES_FLAG and systemd rules assigning the proper SYSTEMD_READY variable...)
Comment 83 Peter Rajnoha 2013-12-11 08:13:26 EST
Here's the build - lvm2-2.02.103-4.fc20 - I haven't sumbitted an update. If you want to bundle it with systemd update, someone just add it together please:

  http://koji.fedoraproject.org/koji/taskinfo?taskID=6279700
Comment 84 Lennart Poettering 2013-12-11 12:40:25 EST
BTW, in systemd git I have now updated the systemd.device(5) man page to explain this particular issue a bit.
Comment 85 Peter Rajnoha 2013-12-12 01:56:34 EST
(In reply to Lennart Poettering from comment #84)
> BTW, in systemd git I have now updated the systemd.device(5) man page to
> explain this particular issue a bit.

Thanks! I was just about to suggest that :)
Comment 86 Kamil Páral 2013-12-12 13:28:47 EST
Discussed at Go/No-Go meeting on 2013-12-12 [1]. The older systemd fix is in RC1 and it was confirmed to work. Removing blocker nomination.

[1] http://meetbot.fedoraproject.org/fedora-meeting-2/2013-12-12/
Comment 87 Fabrice Allibe 2014-01-08 09:26:27 EST
Hello,

I am facing kind of similar issue with up2date F20 (upgraded from F19), with my /home LVM Raid1 logical Volume (no md device driver used, apart for /boot) which is not self-activating correctly at boot time.

The layout is quite simple :
[root@serveur ~]# lsblk /dev/sda /dev/sdb
NAME                         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                            8:0    0   1,8T  0 disk  
├─sda1                         8:1    0   500M  0 part  
│ └─md0                        9:0    0   500M  0 raid1 /boot
└─sda2                         8:2    0   1,8T  0 part  
  ├─vg_serveur-root_rmeta_0  253:0    0    32M  0 lvm   
  │ └─vg_serveur-root        253:4    0    50G  0 lvm   /
  ├─vg_serveur-root_rimage_0 253:1    0    50G  0 lvm   
  │ └─vg_serveur-root        253:4    0    50G  0 lvm   /
  ├─vg_serveur-swap_rmeta_0  253:5    0    32M  0 lvm   
  │ └─vg_serveur-swap        253:9    0   9,7G  0 lvm   [SWAP]
  ├─vg_serveur-swap_rimage_0 253:6    0   9,7G  0 lvm   
  │ └─vg_serveur-swap        253:9    0   9,7G  0 lvm   [SWAP]
  ├─vg_serveur-home_rmeta_0  253:10   0    32M  0 lvm   
  │ └─vg_serveur-home        253:14   0 405,4G  0 lvm   /home
  └─vg_serveur-home_rimage_0 253:11   0 405,4G  0 lvm   
    └─vg_serveur-home        253:14   0 405,4G  0 lvm   /home
sdb                            8:16   0   1,8T  0 disk  
├─sdb1                         8:17   0   500M  0 part  
│ └─md0                        9:0    0   500M  0 raid1 /boot
└─sdb2                         8:18   0   1,8T  0 part  
  ├─vg_serveur-root_rmeta_1  253:2    0    32M  0 lvm   
  │ └─vg_serveur-root        253:4    0    50G  0 lvm   /
  ├─vg_serveur-root_rimage_1 253:3    0    50G  0 lvm   
  │ └─vg_serveur-root        253:4    0    50G  0 lvm   /
  ├─vg_serveur-swap_rmeta_1  253:7    0    32M  0 lvm   
  │ └─vg_serveur-swap        253:9    0   9,7G  0 lvm   [SWAP]
  ├─vg_serveur-swap_rimage_1 253:8    0   9,7G  0 lvm   
  │ └─vg_serveur-swap        253:9    0   9,7G  0 lvm   [SWAP]
  ├─vg_serveur-home_rmeta_1  253:12   0    32M  0 lvm   
  │ └─vg_serveur-home        253:14   0 405,4G  0 lvm   /home
  └─vg_serveur-home_rimage_1 253:13   0 405,4G  0 lvm   
    └─vg_serveur-home        253:14   0 405,4G  0 lvm   /home


When booting, it ends up in emergency mode, and I just have to activate the 'vg_serveur-home' logical volume in order to unlock the situation :

Emergency mod, /home is not mounted :
[root@serveur ~]# df
Sys. de fichiers            blocks de 1K   Utilisé Disponible Uti% Monté sur
/dev/mapper/vg_serveur-root     51475068  35874936   12962308  74% /
devtmpfs                         4080264         0    4080264   0% /dev
tmpfs                            4088412         0    4088412   0% /dev/shm
tmpfs                            4088412       768    4087644   1% /run
tmpfs                            4088412         0    4088412   0% /sys/fs/cgroup
tmpfs                            4088412         0    4088412   0% /tmp
/dev/md0                          487640    142525     315420  32% /boot

Volume Group is active:
[root@serveur ~]# vgs -a
  VG         #PV #LV #SN Attr   VSize VFree
  vg_serveur   2   3   0 wz--n- 3,64t 2,73t

Logical volume root and swap are active, but not home :
[root@serveur ~]# lvs -a
  LV              VG         Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  home            vg_serveur rwi---r--- 405,44g                                             
  [home_rimage_0] vg_serveur Iwi---r--- 405,44g                                             
  [home_rimage_1] vg_serveur Iwi---r--- 405,44g                                             
  [home_rmeta_0]  vg_serveur ewi---r---  32,00m                                             
  [home_rmeta_1]  vg_serveur ewi---r---  32,00m                                             
  root            vg_serveur rwi-aor---  50,00g                               100,00        
  [root_rimage_0] vg_serveur iwi-aor---  50,00g                                             
  [root_rimage_1] vg_serveur iwi-aor---  50,00g                                             
  [root_rmeta_0]  vg_serveur ewi-aor---  32,00m                                             
  [root_rmeta_1]  vg_serveur ewi-aor---  32,00m                                             
  swap            vg_serveur rwi-aor---   9,69g                               100,00        
  [swap_rimage_0] vg_serveur iwi-aor---   9,69g                                             
  [swap_rimage_1] vg_serveur iwi-aor---   9,69g                                             
  [swap_rmeta_0]  vg_serveur ewi-aor---  32,00m                                             
  [swap_rmeta_1]  vg_serveur ewi-aor---  32,00m                                             

I activate the home LV:
[root@serveur ~]# lvchange -a y /dev/mapper/vg_serveur-home

It gets active:
[root@serveur ~]# lvs -a
  LV              VG         Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  home            vg_serveur rwi-aor--- 405,44g                               100,00        
  [home_rimage_0] vg_serveur iwi-aor--- 405,44g                                             
  [home_rimage_1] vg_serveur iwi-aor--- 405,44g                                             
  [home_rmeta_0]  vg_serveur ewi-aor---  32,00m                                             
  [home_rmeta_1]  vg_serveur ewi-aor---  32,00m                                             
  root            vg_serveur rwi-aor---  50,00g                               100,00        
  [root_rimage_0] vg_serveur iwi-aor---  50,00g                                             
  [root_rimage_1] vg_serveur iwi-aor---  50,00g                                             
  [root_rmeta_0]  vg_serveur ewi-aor---  32,00m                                             
  [root_rmeta_1]  vg_serveur ewi-aor---  32,00m                                             
  swap            vg_serveur rwi-aor---   9,69g                               100,00        
  [swap_rimage_0] vg_serveur iwi-aor---   9,69g                                             
  [swap_rimage_1] vg_serveur iwi-aor---   9,69g                                             
  [swap_rmeta_0]  vg_serveur ewi-aor---  32,00m                                             
  [swap_rmeta_1]  vg_serveur ewi-aor---  32,00m                                             

The filesystem /home gets automatically mounted afterwards:
[root@serveur ~]# df
Sys. de fichiers            blocks de 1K   Utilisé Disponible Uti% Monté sur
/dev/mapper/vg_serveur-root     51475068  35874936   12962308  74% /
devtmpfs                         4080264         0    4080264   0% /dev
tmpfs                            4088412         0    4088412   0% /dev/shm
tmpfs                            4088412       768    4087644   1% /run
tmpfs                            4088412         0    4088412   0% /sys/fs/cgroup
tmpfs                            4088412         0    4088412   0% /tmp
/dev/md0                          487640    142525     315420  32% /boot
/dev/mapper/vg_serveur-home    418329352 307748892   89307476  78% /home

and I can get my fedora boot succesfully when exiting from emergency shell.


By the way, there are still lots of errors when looking at LVM services status on my system is fully up&running:
[root@serveur ~]# systemctl list-unit-files |grep lvm
lvm2-lvmetad.service                        enabled 
lvm2-monitor.service                        enabled 
lvm2-pvscan@.service                        static  
lvm2-lvmetad.socket                         enabled 
[root@serveur ~]# systemctl -a | grep lvm2
lvm2-activation-early.service                                                                                  not-found inactive dead      lvm2-activation-early.service
lvm2-activation.service                                                                                        not-found inactive dead      lvm2-activation.service
lvm2-lvmetad.service                                                                                           loaded    active   running   LVM2 metadata daemon
lvm2-monitor.service                                                                                           loaded    active   exited    Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
lvm2-pvscan@8:18.service                                                                                       loaded    inactive dead      LVM2 PV scan on device 8:18
lvm2-pvscan@8:2.service                                                                                        loaded    inactive dead      LVM2 PV scan on device 8:2
system-lvm2\x2dpvscan.slice                                                                                    loaded    inactive dead      system-lvm2\x2dpvscan.slice
lvm2-lvmetad.socket                                                                                            loaded    active   running   LVM2 metadata daemon socket
Comment 88 Fabrice Allibe 2014-01-08 09:29:26 EST
Created attachment 847177 [details]
journalctl -b output
Comment 89 Peter Rajnoha 2014-01-08 10:10:50 EST
(In reply to Fabrice Allibe from comment #87)
> └─sdb2                         8:18   0   1,8T  0 part  
> 

> By the way, there are still lots of errors when looking at LVM services
> status on my system is fully up&running:
> [root@serveur ~]# systemctl list-unit-files |grep lvm
> lvm2-lvmetad.service                        enabled 
> lvm2-monitor.service                        enabled 
> lvm2-pvscan@.service                        static  
> lvm2-lvmetad.socket                         enabled 
> [root@serveur ~]# systemctl -a | grep lvm2
> lvm2-activation-early.service                                               
> not-found inactive dead      lvm2-activation-early.service
> lvm2-activation.service                                                     
> not-found inactive dead      lvm2-activation.service
> lvm2-lvmetad.service                                

It's OK that the lvm2-activation-early and lvm2-activation service is not-found/inactive/dead here as when they're supposed to be used (and generated by lvm2-activation-generator) only if global/use_lvmetad=0 is set in lvm.conf - iow, in case autoactivation is not used.
                        
> loaded    active   running   LVM2 metadata daemon
> lvm2-monitor.service                                                        
> loaded    active   exited    Monitoring of LVM2 mirrors, snapshots etc.
> using dmeventd or progress polling
> lvm2-pvscan@8:18.service                                                    
> loaded    inactive dead      LVM2 PV scan on device 8:18
> lvm2-pvscan@8:2.service                                                     
> loaded    inactive dead      LVM2 PV scan on device 8:2
> system-lvm2\x2dpvscan.slice                                                

Well, yes, this is the problem - these ones should be active!

Based on the journalctl log with "pvscan[714]: device-mapper: suspend ioctl on  failed: Argument invalide", I suspect, this might be the problem with concurent VG refresh that is part of the pvscan (the patch for this problem is here https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=d8085edf65006a50608edb821b3d30947abaa838 which is not yet included in F20).

I've included the patch in this test build for now: 
http://koji.fedoraproject.org/koji/taskinfo?taskID=6374345

Please, try it and see if it helps (I'll be adding that in next official build).
Comment 90 Fabrice Allibe 2014-01-08 15:56:58 EST
Hello Peter,

It does work indeed ! :)

Just installed the following updates and updated /boot/initramfs-3.12.6-300.fc20.x86_64.img with "dracut -f" before rebooting.
:
[root@serveur ~]# tail /var/log/yum.log
...
Jan 08 21:36:48 Updated: device-mapper-libs-1.02.82-5.1.fc20.x86_64
Jan 08 21:36:49 Updated: device-mapper-1.02.82-5.1.fc20.x86_64
Jan 08 21:36:49 Updated: device-mapper-event-libs-1.02.82-5.1.fc20.x86_64
Jan 08 21:36:50 Updated: device-mapper-event-1.02.82-5.1.fc20.x86_64
Jan 08 21:36:50 Updated: lvm2-libs-2.02.103-5.1.fc20.x86_64
Jan 08 21:36:52 Updated: lvm2-2.02.103-5.1.fc20.x86_64
Jan 08 21:36:53 Updated: lvm2-cluster-2.02.103-5.1.fc20.x86_64

However, I still notice strange pvscan errors in systemctl output. I let you check the new log !

Thank you very much
Kind regards
Fabrice
Comment 91 Fabrice Allibe 2014-01-08 16:00:57 EST
Created attachment 847342 [details]
journalctl -b output with lvm2-2.02.103-5.1 installed

Boot successfull but still some strange pvscan errors
Comment 92 Alasdair Kergon 2014-01-08 16:05:21 EST
janv. 08 21:45:01 serveur lvm[683]: Monitoring RAID device vg_serveur-root for events.
janv. 08 21:45:01 serveur lvm[683]: Monitoring RAID device vg_serveur-swap for events.
janv. 08 21:45:01 serveur lvm[594]: 10 logical volume(s) in volume group "vg_serveur" monitored
janv. 08 21:45:01 serveur systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
janv. 08 21:45:01 serveur systemd-fsck[693]: /dev/md0 : propre, 404/128016 fichiers, 166872/511988 blocs
janv. 08 21:45:01 serveur systemd[1]: Found device LVM PV waN90N-dF0E-HIah-gj1T-Ac15-j3KI-Oxy6o4 on /dev/sda2.
janv. 08 21:45:01 serveur systemd[1]: Found device LVM PV waN90N-dF0E-HIah-gj1T-Ac15-j3KI-Oxy6o4 on /dev/sda2.
janv. 08 21:45:01 serveur systemd[1]: Starting LVM2 PV scan on device 8:2...
janv. 08 21:45:01 serveur systemd[1]: Found device LVM PV dlJOjF-ljgq-2j8L-edDT-HZnd-Row2-dGyqYk on /dev/sdb2.
janv. 08 21:45:01 serveur systemd[1]: Found device LVM PV dlJOjF-ljgq-2j8L-edDT-HZnd-Row2-dGyqYk on /dev/sdb2.
janv. 08 21:45:01 serveur systemd[1]: Starting LVM2 PV scan on device 8:18...
janv. 08 21:45:01 serveur lvm[683]: No longer monitoring RAID device vg_serveur-root for events.
janv. 08 21:45:01 serveur systemd[1]: Started File System Check on /dev/disk/by-uuid/d1480caf-6128-4f6c-80d9-6af012625f72.
janv. 08 21:45:01 serveur systemd[1]: Mounting /boot...
janv. 08 21:45:01 serveur pvscan[713]: device-mapper: suspend ioctl on  failed: Argument invalide
janv. 08 21:45:01 serveur pvscan[713]: Unable to suspend vg_serveur-root (253:4)
janv. 08 21:45:01 serveur lvm[683]: Monitoring RAID device vg_serveur-root for events.
janv. 08 21:45:01 serveur lvm[683]: No longer monitoring RAID device vg_serveur-swap for events.
janv. 08 21:45:01 serveur pvscan[713]: vg_serveur-swap: event deregistration failed: 713:5 No such device
janv. 08 21:45:01 serveur pvscan[713]: vg_serveur/swap: raid1 segment monitoring function failed.
janv. 08 21:45:02 serveur pvscan[713]: device-mapper: suspend ioctl on  failed: Argument invalide
janv. 08 21:45:02 serveur pvscan[713]: Unable to suspend vg_serveur-swap (253:9)
janv. 08 21:45:02 serveur lvm[683]: Monitoring RAID device vg_serveur-swap for events.
janv. 08 21:45:02 serveur lvm[683]: No longer monitoring RAID device vg_serveur-root for events.
janv. 08 21:45:02 serveur lvm[683]: Monitoring RAID device vg_serveur-root for events.
janv. 08 21:45:02 serveur systemd[1]: Mounted /boot.
janv. 08 21:45:02 serveur kernel: EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
janv. 08 21:45:02 serveur lvm[683]: No longer monitoring RAID device vg_serveur-swap for events.
janv. 08 21:45:02 serveur kernel: md/raid1:mdX: active with 2 out of 2 mirrors
janv. 08 21:45:02 serveur lvm[683]: Monitoring RAID device vg_serveur-swap for events.
janv. 08 21:45:02 serveur lvm[683]: Monitoring RAID device vg_serveur-home for events.
janv. 08 21:45:02 serveur kernel: created bitmap (406 pages) for device mdX
janv. 08 21:45:02 serveur kernel: md/raid1:mdX: active with 2 out of 2 mirrors
janv. 08 21:45:02 serveur kernel: created bitmap (406 pages) for device mdX
janv. 08 21:45:02 serveur kernel: mdX: bitmap initialized from disk: read 26 pages, set 0 of 830336 bits
janv. 08 21:45:02 serveur pvscan[713]: device-mapper: resume ioctl on  failed: Argument invalide
janv. 08 21:45:02 serveur pvscan[713]: Unable to resume vg_serveur-home (253:14)
janv. 08 21:45:02 serveur pvscan[713]: 3 logical volume(s) in volume group "vg_serveur" now active
janv. 08 21:45:02 serveur pvscan[713]: vg_serveur: autoactivation failed.
janv. 08 21:45:02 serveur pvscan[712]: 3 logical volume(s) in volume group "vg_serveur" now active
janv. 08 21:45:02 serveur systemd[1]: Started LVM2 PV scan on device 8:18.
janv. 08 21:45:02 serveur systemd[1]: Found device /dev/mapper/vg_serveur-home.
janv. 08 21:45:02 serveur systemd[1]: Starting File System Check on /dev/mapper/vg_serveur-home...
janv. 08 21:45:02 serveur systemd[1]: Started LVM2 PV scan on device 8:2.
janv. 08 21:45:03 serveur systemd-fsck[839]: /dev/mapper/vg_serveur-home : propre, 887805/26574848 fichiers, 78638306/106283008 blocs
janv. 08 21:45:03 serveur systemd[1]: Started File System Check on /dev/mapper/vg_serveur-home.
janv. 08 21:45:03 serveur systemd[1]: Mounting /home...
janv. 08 21:45:03 serveur systemd[1]: Mounted /home.
janv. 08 21:45:03 serveur kernel: EXT4-fs (dm-14): mounted filesystem with ordered data mode. Opts: errors=remount-ro
Comment 93 Zbigniew Jędrzejewski-Szmek 2014-01-14 20:21:56 EST
Everything should work now.
Comment 94 Stuart D Gathman 2014-01-31 15:41:25 EST
Will this be backported to F19?  I am getting this on all 6 F19 installs - and no software raid either.  journalctl -b shows a timeout on pvscan, and I have to type "vgchange -ay vg_root" at the emergency shell.  It nearly always happens on the faster multi-cpu machines, less often on the slower single cpu machines.  

I was about to report a bug on systemd-204.18, but not if it is fixed now.

Note You need to log in before you can comment on or make changes to this bug.