Bug 1725389 - local-fs target completes before local fs are mounted, resulting in failure of libvirtd & other services
Status: CLOSED EOL
Product: Fedora
Classification: Fedora
Component: systemd
Version: 29
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
Reported: 2019-06-30 11:06 UTC by BugMasta
Modified: 2019-11-27 22:58 UTC (History)

Last Closed: 2019-11-27 22:58:39 UTC
Type: Bug

Description BugMasta 2019-06-30 11:06:40 UTC
Description of problem:

local-fs target completes before local fs are mounted, resulting in failure of libvirtd & other services

Version-Release number of selected component (if applicable):

systemd-libs-239-12.git8bca462.fc29.x86_64

How reproducible:


Steps to Reproduce:
1. Add nofail to an entry in fstab.
2. Observe that the systemd local-fs target does not wait for your nofail fs to mount.


Actual results:

You will see journalctl log "systemd[1]: Reached target Local File Systems." before your nofail fs is mounted.


Expected results:

local-fs.target should wait for filesystems with nofail in fstab.
The option is called "nofail", not "nowait".
Sysv semantics were:
   nofail  :  do not report errors for this device if it does not exist.
If the device DOES exist, then we should wait for it to mount.
Sysv systems used to wait for nofail filesystems, and that is the only behaviour that makes sense.


Additional info:

Please also see bug 1725364, which has been inappropriately closed.

It was suggested that I should remove nofail from my fstab and then everything would be "good". That is false.

nofail has a specific purpose, and that is to designate a filesystem that the system should be able to boot without. That is all.
It does not mean we do not want to wait for that filesystem, when it is there.

Some people may find nofail useful in managing removable devices, i.e., if a removable device is not there, they want boot to continue, much like I do. That's fine. That *is* what nofail is for. But where did this idea come from that nofail means we should not wait for the device to mount, when it *is* there? Waiting hurts *no one*. Changing the behaviour so we no longer wait at all for filesystems with nofail is a huge change, and if it forces people to remove nofail from their fstab, then it is making systems much more fragile.

This is a critical issue, which affects the fundamental robustness of systems.

It is not a joke.

Comment 1 BugMasta 2019-06-30 11:18:11 UTC
Ah for Christs Sake:

x-systemd.before=, x-systemd.after=
Configures a Before= dependency or After= between the created mount unit and another systemd unit, such as a mount unit. The argument should be a unit name or an absolute path to a mount point. This option may be specified more than once. This option is particularly useful for mount point declarations with nofail option that are mounted asynchronously but need to be mounted before or after some unit start, for example, before local-fs.target unit. See Before= and After= in systemd.unit(5) for details.

So, i can use nofail and add option: 
  x-systemd.before=local-fs.target
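
To make the workaround concrete, here is a minimal Python sketch that scans fstab text and lists the nofail mount points that do not yet carry x-systemd.before=local-fs.target. The function names and the sample entries (/dev/sdb1, /data, /backup) are hypothetical and only for illustration; the second sample line shows what an entry with the workaround could look like.

```python
def fstab_options(line):
    """Return the mount options of one fstab line, or None for blanks/comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    if len(fields) < 4:
        return None  # not a complete fstab entry
    return fields[3].split(",")


def nofail_without_ordering(fstab_text, target="local-fs.target"):
    """List mount points that use nofail but lack x-systemd.before=<target>."""
    missing = []
    for line in fstab_text.splitlines():
        opts = fstab_options(line)
        if opts is None:
            continue
        if "nofail" in opts and f"x-systemd.before={target}" not in opts:
            missing.append(line.split()[1])  # second field is the mount point
    return missing


# Hypothetical sample entries, not taken from the reporter's system.
SAMPLE = """\
# device   mountpoint  type  options                                       dump pass
/dev/sdb1  /data       ext4  auto,nofail                                   0 2
/dev/sdc1  /backup     ext4  auto,nofail,x-systemd.before=local-fs.target  0 2
"""

print(nofail_without_ordering(SAMPLE))
```

Run against a real system by passing the contents of /etc/fstab instead of SAMPLE; any mount point it reports is one that local-fs.target will not wait for.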

Once again, systemd has found a way to do the simplest thing in the most arse-backwards way imaginable.

You can close this bug report, I give up.

Comment 2 BugMasta 2019-06-30 11:33:51 UTC
I will add though, that this x-systemd.before option had better play well with noauto...

i.e. i often have auto,nofail, and now I will add x-systemd.before=local-fs.target to lines in my fstab... and then, if i do remove that disk, i go in and change auto to noauto, so i don't get an error message when it fails to mount.

So if i have noauto with x-systemd.before=local-fs.target, it had better behave sensibly.

The noauto had better take precedence over x-systemd.before=local-fs.target

If x-systemd.before=local-fs.target causes local-fs.target to fail when the fs has noauto, i am going to be livid.

Comment 3 BugMasta 2019-06-30 12:34:42 UTC
From:
https://www.freedesktop.org/software/systemd/man/systemd.special.html

local-fs.target
systemd-fstab-generator(3) automatically adds dependencies of type Before= to all mount units that refer to local mount points for this target unit. In addition, it adds dependencies of type Wants= to this target unit for those mounts listed in /etc/fstab that have the auto mount option set.

But:
https://www.freedesktop.org/software/systemd/man/systemd.mount.html#nofail
has:

nofail
With nofail, this mount will be only wanted, not required, by local-fs.target or remote-fs.target. Moreover the mount unit is not ordered before these target units. This means that the boot will continue without waiting for the mount unit and regardless whether the mount point can be mounted successfully.
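
Read together, the two quoted passages can be summarized as a small model. The Python sketch below is only an illustration of the rules as quoted and as observed in this report, not the actual systemd-fstab-generator code; the function name and the returned dictionary keys are hypothetical. It assumes that a mount without nofail is required by local-fs.target (which is what the systemd.mount wording "only wanted, not required" implies), and that x-systemd.before=local-fs.target restores the ordering.

```python
def generator_dependencies(options):
    """Model which dependencies local-fs.target gets for one fstab entry.

    Illustrative assumption only: this mirrors the quoted man page text,
    it is not taken from the systemd source.
    """
    opts = set(options.split(","))
    nofail = "nofail" in opts
    noauto = "noauto" in opts
    explicit_before = "x-systemd.before=local-fs.target" in opts

    deps = {"pulled_in_by": None, "before_target": False}

    # noauto mounts are not pulled in by the target at all.
    if not noauto:
        # nofail downgrades the dependency from Requires= to Wants=.
        deps["pulled_in_by"] = "Wants" if nofail else "Requires"

    # nofail drops the implicit Before=local-fs.target ordering;
    # the explicit x-systemd.before= option adds it back.
    deps["before_target"] = (not nofail) or explicit_before
    return deps


for entry in ("auto", "auto,nofail",
              "auto,nofail,x-systemd.before=local-fs.target"):
    print(entry, "->", generator_dependencies(entry))
```

Under this model, "auto,nofail" is the only combination above where the target is reached without waiting for the mount, which matches the behaviour described in this report.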


So, there is some major stupidity going on here.

1) the page for systemd.special says it only adds "Wants=" dependencies for fs with auto set. That is wrong, as anyone who has watched a system drop to emergency shell due to not-so-important failed mount with "auto" can tell you.

2) the page for systemd.mount says "Moreover the mount unit is not ordered before these target units." i.e. that is why local-fs does not wait for mounts with nofail. But why? What idiot decided this?
The page for systemd.special says "systemd-fstab-generator(3) automatically adds dependencies of type Before= to all mount units that refer to local mount points for this target unit." So who decided that nofail would change this behaviour? And why? Why would anyone do that? Why remove the "Before=" dependency for nofail? Why? There is no rational reason to do so. It is not necessary. It is an additional exception to the rule, as stated in the systemd.special page. It violates the principle of least surprise. It is WRONG.

3) The workaround x-systemd.before=local-fs.target will cause local-fs.target to fail when the fs has noauto. I haven't tested this yet, so I can't be certain, but I bet my pessimistic 4$$ it will. This is what happens when you make a bad decision and then try to fix it with bandaids. They come unstuck, and there will be blood.

Comment 4 BugMasta 2019-06-30 12:49:28 UTC
ok x-systemd.before=local-fs.target does not currently cause local-fs.target to fail when the fs has noauto.

The logic behind all of this is so retarded though, that it wouldn't surprise me if that changes at some time in the future. Some bright spark will inadvertently break whatever bit of fragile code is currently allowing it to work, and suddenly everyone will get bitten.

Nevertheless, someone needs to fix the documentation on 

https://www.freedesktop.org/software/systemd/man/systemd.special.html

So it correctly reflects the fact that:
  systemd-fstab-generator(3) ... adds dependencies of type Requires= (not Wants=) to this target unit for those mounts listed in /etc/fstab that have the auto mount option set.

And, someone needs to undo the brain-damage that currently has nofail stopping addition of "Before=" dependencies. Bring the "Before=" dependency back.
It hurts no one, saves a lot of headache, and that is the logical and consistent way it should work.

Comment 5 BugMasta 2019-07-01 04:01:23 UTC
I was talking about this bug with my brother yesterday, and today he's told me he's just been hit by it...

He went to do a dnf-system-upgrade, and it failed because the upgrade files were on an fs mounted with auto, nofail.

So, on reboot, the upgrade started before the fs was mounted, and failed to find its files.


Because... nofail no longer adds the Before= dependency to local-fs.target.

Removing the before dependency is insane.

BRING IT BACK.

Comment 6 Ben Cotton 2019-10-31 19:05:02 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 29 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 7 Ben Cotton 2019-11-27 22:58:39 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

