Bug 1505075

Summary: blk-availability.service: defeats/ignores normal systemd dependencies
Product: [Community] LVM and device-mapper
Component: lvm2
Sub component: blkdeactivate
Version: 2.02.175
Status: ASSIGNED
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Type: Bug
Reporter: Alan Jenkins <alan.christopher.jenkins>
Assignee: Peter Rajnoha <prajnoha>
QA Contact: cluster-qe <cluster-qe>
CC: agk, bureau.si2b-socles, heinzm, jbrassow, msnitzer, prajnoha, zkabelac
Flags: rule-engine: lvm-technical-solution?, rule-engine: lvm-test-coverage?

Description Alan Jenkins 2017-10-21 20:29:50 UTC
> [Unit]
> Description=Availability of block devices
> After=lvm2-activation.service lvm2-lvmetad.service iscsi-shutdown.service iscsi.service iscsid.service fcoe.service
> DefaultDependencies=no
> Conflicts=shutdown.target

There's a complaint that blk-availability is not ordered Before=local-fs-pre.target (and remote-fs-pre.target).  Instead it unmounts filesystems itself (blkdeactivate -u), defeating the ordering of mount units in systemd.[1]

[1] https://lists.freedesktop.org/archives/systemd-devel/2016-November/037915.html

Without any ordering constraint against services that would still be using one of the filesystems, I'm really confused how blkdeactivate is expected to succeed at its task.  It's not passed the option to retry (which I don't think is implemented for mdadm anyway).

I can imagine that still leaves out some more exotic use case.  But as it stands, the unit is quite obscure, without even a "you are not expected to understand this" comment to mark that there is any reason for the obscurity.
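
For illustration, the ordering asked for above could be sketched as a drop-in like the one below. This is an untested sketch, and the drop-in path is hypothetical. It relies on systemd stopping units in the reverse of their start-up order, so a unit ordered Before=local-fs-pre.target should only be stopped after the local mount units. It does not change the blkdeactivate -u invocation itself.

```ini
# Hypothetical drop-in, e.g.
# /etc/systemd/system/blk-availability.service.d/ordering.conf
#
# Assumption: mount units are ordered After=local-fs-pre.target (or
# remote-fs-pre.target), so on shutdown they are stopped before this
# service's stop job runs.
[Unit]
Before=local-fs-pre.target remote-fs-pre.target
```

Since After= and Before= lists in drop-ins are added to those in the main unit file, the existing After= line would stay in effect.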

---

It's also a shame /boot is excluded by blkdeactivate.  Otherwise, it seems this service could be relied on to solve https://bugzilla.redhat.com/show_bug.cgi?id=996475#c28 (fakeraids which are not handled by the initramfs have to be re-synced every single boot).  AFAICS on Fedora, it would work fine for this service to de-activate the block device for /boot, if blk-availability.service was properly ordered Before=local-fs-pre.target.

blk_availability_init claims it's "responsible" for unmounting filesystems, so the exclusion of /boot there is also quite mysterious.  My understanding is that /boot was often replicated using software RAID, at least prior to UEFI.  There's an implicit carve-out in the Fedora docs for using MD-RAID on /boot, where it explicitly says that LVM should not be used for it.

Comment 1 Alan Jenkins 2017-10-21 20:31:51 UTC
/fix component.

Comment 2 Peter Rajnoha 2017-10-23 08:05:37 UTC
This is a best-effort service that tries to properly deactivate, primarily, device-mapper and MD-based devices.

Such a deactivation service was completely missing in non-systemd environments (for which this service was first designed a few years ago).

In systemd environments, there is a deactivation hook on shutdown provided by systemd directly, but this is really a very last chance to deactivate any remaining devices. This systemd deactivation hook iterates over the list of remaining devices, and does so several times (because it does not take into account that certain devices may already be part of a device stack and hence still in use). This may fail with more complex device stacks (and it does not include MD devices, just loop and device-mapper):

  https://github.com/systemd/systemd/blob/master/src/core/shutdown.c

So blkdeactivate is a best-effort service to do the deactivation for devices which may still be part of a stack (primarily device-mapper-based devices, including LVM devices, and then also MD devices).

As for the ordering of the blk-availability service, I admit it could be a bit better for systemd environments, so I'll check and see if we can improve that.

/boot is now unmounted too in the recent version of the blkdeactivate script, which is included in device-mapper version 1.02.144, released on 6th September 2017.

Comment 3 Alan Jenkins 2017-10-23 08:37:02 UTC
Thanks for the correction about /boot!

If the service's ordering _can't_ be worked out to fit systemd, it needs a massive disclaimer.

E.g. if /boot is unmounted prematurely, it races with dracut-shutdown.service.  In which case you don't just non-deterministically fail to unmount that filesystem.  You non-deterministically break the shutdown-to-initramfs mechanism, which is a hook systemd provides to shut down arbitrarily complex rootfs stacks.

Comment 4 bureau.si2b-socles 2020-02-19 15:39:27 UTC
Is there any change or news for this bug?

The bug has been ASSIGNED since 2017-10-23.

Can we have a status update?

Thanks