Bug 1071591 - autofs.service doesn't always start right
Summary: autofs.service doesn't always start right
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: autofs
Version: 20
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Ian Kent
QA Contact: Fedora Extras Quality Assurance
Reported: 2014-03-02 02:47 UTC by Rodd Clarkson
Modified: 2015-06-30 01:27 UTC (History)

Fixed In Version: autofs-5.0.7-42.fc20
Last Closed: 2015-06-30 01:27:17 UTC
Type: Bug


Attachments
Patch - make service want network-online (1.28 KB, patch)
2015-01-21 06:43 UTC, Ian Kent

Description Rodd Clarkson 2014-03-02 02:47:33 UTC
Description of problem:

I've got 26 computers all running the same version of Fedora 20 and using LDAP for authentication.

Most of the time they start up fine, but on occasion autofs doesn't start properly, and the user can log in (from the command line*) but doesn't get a home folder.  If you restart autofs, then it works as expected.

* Logging in from GDM drops back to the login screen, presumably because it doesn't get a home folder to work with.



Version-Release number of selected component (if applicable):

autofs.x86_64                                   1:5.0.7-40.fc20  



How reproducible:

It happens about one in every 15-20 start ups.  This is just a guess, but if I reboot all the machines, roughly 2 won't log in properly, but will if you restart autofs.service


Steps to Reproduce:
1. Start the system.
2. Try to log in.
3. If it fails, restart autofs.service (using tty2 and root, or ssh, or whatever).
4. Log in.


Actual results:

From the terminal you get:

# su - RPoddness
Last login: Sun Mar  2 13:37:43 EST 2014 on pts/0
su: warning: cannot change directory to /hubhome/RPoddness: No such file or directory
-bash-4.2$ 


From GDM it looks like it's going to work and then drops back to the Username prompt.  No bad password or other error is reported.



Expected results:

Logging in just works.



Additional info:

When it fails, systemctl status autofs reports:

# systemctl status autofs -l
autofs.service - Automounts filesystems on demand
   Loaded: loaded (/usr/lib/systemd/system/autofs.service; enabled)
   Active: active (running) since Sun 2014-03-02 13:24:25 EST; 13min ago
  Process: 796 ExecStart=/usr/sbin/automount $OPTIONS --pid-file /run/autofs.pid (code=exited, status=0/SUCCESS)
 Main PID: 816 (automount)
   CGroup: /system.slice/autofs.service
           └─816 /usr/sbin/automount --pid-file /run/autofs.pid

Mar 02 13:24:25 fedora26.lab.mrss.com.au automount[816]: setautomntent: lookup(sss): setautomntent: No such file or directory
Mar 02 13:24:25 fedora26.lab.mrss.com.au systemd[1]: Started Automounts filesystems on demand.


If I restart autofs I get:

[root@fedora26 ~]# systemctl restart autofs -l
[root@fedora26 ~]# systemctl status autofs -l
autofs.service - Automounts filesystems on demand
   Loaded: loaded (/usr/lib/systemd/system/autofs.service; enabled)
   Active: active (running) since Sun 2014-03-02 13:46:00 EST; 6s ago
  Process: 1542 ExecStart=/usr/sbin/automount $OPTIONS --pid-file /run/autofs.pid (code=exited, status=0/SUCCESS)
 Main PID: 1544 (automount)
   CGroup: /system.slice/autofs.service
           └─1544 /usr/sbin/automount --pid-file /run/autofs.pid

Mar 02 13:46:00 fedora26.lab.mrss.com.au systemd[1]: Started Automounts filesystems on demand.
[root@fedora26 ~]# su - RPoddness
Last login: Sun Mar  2 13:43:44 EST 2014 on pts/1
[RPoddness@fedora26 ~]$

Comment 1 Ian Kent 2014-03-02 08:06:25 UTC
It sounds like what you're saying is that the autofs systemd
unit is not waiting for sssd to start up.

But the unit file has:
After=network.target ypbind.service sssd.service

so I don't know what else to say.
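
One way to double-check which units a unit file orders itself after is to scan its After= lines. The helper below is only a rough sketch: `unit_orders_after` is a made-up name, not a systemd or autofs tool, and it looks at a single file, so it ignores drop-ins and the merged configuration that systemd actually uses.

```shell
# Hypothetical helper: succeed if unit file $1 lists unit $2 on an After= line.
# It inspects one file only; drop-ins and other overrides are not considered.
unit_orders_after() {
    unit_file=$1
    target=$2
    # Print the value of every After= line, split it into one name per line,
    # then look for an exact, literal match of the requested unit name.
    sed -n 's/^After=//p' "$unit_file" | tr -s ' \t' '\n\n' | grep -Fxq -- "$target"
}
```

For the line quoted above, `unit_orders_after /usr/lib/systemd/system/autofs.service sssd.service` would succeed. On a running system, `systemctl show -p After autofs.service` reports the list systemd has actually loaded.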

Comment 2 Ian Collier 2014-08-06 11:35:10 UTC
It appears to me that Fedora 20 has changed the networking targets in such a
way that you need to wait for network-online.target (not just network.target)
if you are expecting the network to be reachable when you start your service.

I may be misunderstanding this, but our F20 installs are now unanimously failing
to start autofs properly, and adding network-online.target to the "After" line
in the unit file seems to fix it.  (It's odd though that I haven't noticed
the problem until now, and yet all of today's installs have this issue.)

Comment 3 Jason Tibbitts 2014-12-19 17:33:18 UTC
I've been having this problem for a while too.  It doesn't appear to be fixed in F21, either; only sssd caching the maps seems to be saving me most of the time.

You wouldn't have had the problem immediately after F20 release because systemd changed incompatibly.  That's an incredibly horrible thing to do, and I can't believe they did it, but they wouldn't revert it.  That's the way it's going to be moving forward.

Still, it's trivially fixed.  I can fix up the units in the package if Ian wants me to do so.  In the meantime, you just have to use a drop-in:

mkdir /etc/systemd/system/autofs.service.d
cat > /etc/systemd/system/autofs.service.d/autofs.conf <<END
[Unit]
Wants=network-online.target
After=network-online.target
END

autofs should start properly on the next reboot.  I do this in kickstart post.
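
For anyone scripting this (in kickstart post or elsewhere), the same drop-in can be written by a small function that is safe to run more than once. This is just a sketch: the function name and the optional root-directory argument are assumptions added for illustration, and the drop-in content is exactly the one shown above.

```shell
# Sketch: write the network-online drop-in for autofs.
# $1 is the systemd configuration root; it defaults to /etc/systemd/system.
# The function name and the optional argument are illustrative assumptions.
write_autofs_dropin() {
    root=${1:-/etc/systemd/system}
    dir=$root/autofs.service.d
    # -p makes this safe to re-run when the directory already exists.
    mkdir -p "$dir" || return 1
    # Quoted delimiter: the body is written literally, with no expansion.
    cat > "$dir/autofs.conf" <<'END'
[Unit]
Wants=network-online.target
After=network-online.target
END
}
```

Calling `write_autofs_dropin` with no argument as root writes to the real location. Running `systemctl daemon-reload` afterwards makes systemd re-read its unit files without a reboot, although the new ordering only matters the next time autofs is started during boot.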

Comment 4 Ian Kent 2014-12-21 03:35:16 UTC
(In reply to Jason Tibbitts from comment #3)
> I've been having this problem for a while too.  It doesn't appear to be
> fixed in F21, either; only sssd caching the maps seems to be saving me most
> of the time.
> 
> You wouldn't have had the problem immediately after F20 release because
> systemd changed incompatibly.  That's an incredibly horrible thing to do,
> and I can't believe they did it, but they wouldn't revert it.  That's the
> way it's going to be moving forward.
> 
> Still, it's trivially fixed.  I can fix up the units in the package if Ian
> wants me to do so.  In the meantime, you just have to use a drop-in:
> 
> mkdir /etc/systemd/system/autofs.service.d
> cat > /etc/systemd/system/autofs.service.d/autofs.conf <<END
> [Unit]
> Wants=network-online.target
> After=network-online.target
> END
> 
> autofs should start properly on the next reboot.  I do this in kickstart
> post.

I think it's worth doing that for now, it can be removed later.

In time sssd should be updated to respond with an appropriate
fail until the initial map read is completed and autofs updated
to wait on error returns at startup.

I now believe the reason this approach didn't work properly (the
last time it was tried) was because of sssd not returning what
autofs needed to make it wait.

In fact I need to get back to this some time soonish to work
out where we're at, sorry for the delays.

Ian

Comment 5 Jason Tibbitts 2014-12-22 16:50:16 UTC
Just to make sure, was that a "please go ahead and modify the unit file in the autofs package" or was that "everyone should just use a drop in for now"?  And if the former, just rawhide or f20 and f21 as well?  Do you want me to push an update?  I'm happy to do it and save you the time, but of course it's your package.

I agree that the best thing is to wait for sssd to say it's good; it may have cached data that allows startup without waiting for the network.  What would be really cool is if autofs could somehow just dynamically configure itself when the auto.master map changes.

Comment 6 Ian Kent 2015-01-02 00:54:59 UTC
(In reply to Jason Tibbitts from comment #5)
> Just to make sure, was that a "please go ahead and modify the unit file in
> the autofs package" or was that "everyone should just use a drop in for
> now"?  And if the former, just rawhide or f20 and f21 as well?  Do you want
> me to push an update?  I'm happy to do it and save you the time, but of
> course it's your package.

Basically, yes go ahead and modify the unit file, and yes f20 and f21.
I may end up doing it myself since I'm close to being back on deck now
but don't let that stop you if you have time.

I don't think it can hurt to have this present regardless of
what is done in the future.

> 
> I agree that the best thing is to wait for sssd to say it's good; it may
> have cached data that allows startup without waiting for the network.  What
> would be really cool is if autofs could somehow just dynamically configure
> itself when the auto.master map changes.

TBH I've resisted monitoring for events because it's quite hard
to do for the various sources since not all of them have the
ability to trigger an event, like auto.master coming from NIS.

But at startup automount should retry (with a delay) until it gets
a valid master map and that's about all we'll be able to do for
now, assuming sssd returns an error until it has what it thinks
is a valid master map (and assuming I'm actually correct about the
problem). That of course also assumes that if it gets a master map
it will also get dependent maps, which it might not, so that could
also be a problem in the long run.

Ian
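
The "retry (with a delay) until it gets a valid master map" idea can be pictured as a bounded retry loop. The sketch below is not how automount implements it or would implement it; the helper name, the attempt limit, and whatever command gets retried are all assumptions made for illustration.

```shell
# Hypothetical sketch of "retry with a delay until it works".
# Usage: retry_with_delay MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry_with_delay() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt will follow.
        if [ "$attempt" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

As noted above, a loop like this only helps if the lookup really does fail while the map source is not ready, rather than returning an empty result that looks like success.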

Comment 7 Fedora Update System 2015-01-21 05:52:21 UTC
autofs-5.0.7-42.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/autofs-5.0.7-42.fc20

Comment 8 Ian Kent 2015-01-21 06:43:47 UTC
Created attachment 982179 [details]
Patch - make service want network-online

Comment 9 Jason Tibbitts 2015-01-21 19:16:41 UTC
Well, crap; I totally forgot to take care of this for you.  Did you intend to do the same for F21?

Comment 10 Fedora Update System 2015-01-21 23:06:17 UTC
Package autofs-5.0.7-42.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing autofs-5.0.7-42.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-1018/autofs-5.0.7-42.fc20
then log in and leave karma (feedback).

Comment 11 Ian Kent 2015-01-22 01:52:50 UTC
(In reply to Jason Tibbitts from comment #9)
> Well, crap; I totally forgot to take care of this for you.  Did you intend
> to do the same for F21?

LOL.

Rawhide and F21 also done.
Will update them if needed as a result of testing here.

Comment 12 info@kobaltwit.be 2015-01-23 20:21:25 UTC
I came across this bug by browsing on bodhi and it looked like it could fix an issue I have on my laptop so I decided to install and test the version of autofs in updates-testing for F21.

Here is my specific situation:
- My laptop is configured such that when I log into my kde session it attempts to connect to my wifi network.
- KDE, however, first asks for a password to unlock kdewallet, which safely keeps the wifi passphrase.
- So until I have entered this master password and network manager can retrieve the wifi's passphrase I'm effectively not connected to any network.
- I don't have my home directories mounted via autofs so I can use this situation to test autofs behaviour relative to the network being on- or offline.

So to test I just log in, but don't enter the kwallet password yet. The result is that the network never gets connected.

With both the current autofs in F21 and the one in updates-testing I see that at this point autofs is already running even though there is no network connection yet.

I also ran
sudo systemctl status network-online.target
which gave me:
$ sudo systemctl status network-online.target
● network-online.target - Network is Online
   Loaded: loaded (/usr/lib/systemd/system/network-online.target; static)
   Active: active since vr 2015-01-23 20:53:50 CET; 1min 21s ago
     Docs: man:systemd.special(7)
           http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget

Which seems to indicate that the target network-online is reached before I actually have a network connection.

Perhaps this is rather a Networkmanager bug than a problem with autofs using the network-online target though. For that reason I don't know whether I should give negative karma on bodhi or not.

Comment 13 Jason Tibbitts 2015-02-17 18:41:50 UTC
The update certainly fixes my issue, though I've been fixing this with a systemd drop-in until the update is rolled out.  Not sure if you want to push it or not.

Regarding comment 12, I don't think autofs can do anything other than wait until networkmanager says the network is online.  There's no other metric in the system for determining "online-ness".  If NM is reporting that it's online when it isn't, that would certainly be a NM issue, but even then all it can really do is report that the interface says it's up.  It might even be more of a kernel issue, or of a difference in the definition of "online".
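
To illustrate how definitions of "online" can differ, one rough signal to compare against the state of network-online.target is whether the routing table has a default route. The helper below is a hypothetical sketch, not something autofs or NetworkManager uses; it just reads routing-table text such as the output of `ip route` on stdin.

```shell
# Hypothetical helper: read routing-table text on stdin (for example the
# output of `ip route`) and succeed if it contains a default route.
has_default_route() {
    grep -q '^default '
}
```

For example, `ip route | has_default_route && echo "default route present"` can be run next to `systemctl status network-online.target` to see whether the two agree. A default route is still no guarantee that a particular server is reachable.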

Comment 14 Fedora Update System 2015-02-19 18:02:09 UTC
autofs-5.0.7-42.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 15 Eric Work 2015-02-24 09:35:25 UTC
Because of this change my boot time has gone way up (userspace went from 3 sec to 11 sec; total was 21 sec before).  I pushed to have rpc-statd-notify.service changed to use network.target instead of network-online.target, since it forks and runs in the background, so that I could remove all dependencies on network-online.target.  Now autofs.service brings that dependency back.  It sounds like the problem is that sssd.service should be waiting for the network, not autofs.service.  I'd like to vote for the drop-in, because this kills boot times for anyone who uses autofs to postpone NFS mount time until needed and doesn't need nss services running.

Comment 16 Juha Tuomala 2015-02-24 10:25:53 UTC
autofs-5.0.7-42.fc20.x86_64 here.

I don't boot that often, and before this update I had to restart autofs to get the user's home mounted. Last time I did boot, the home wasn't there again and graphical login failed. But it did come up without a service restart after doing some ls ~tuju; ls ~tuju as root.

I'll keep my eyes open to see whether the issue is gone or not. Good work anyway, thanks for tackling this one. One step ahead to big sites.

Comment 17 Eric Work 2015-02-24 17:49:47 UTC
For a mount like /home that always needs to be mounted and requires waiting for the network to boot/login, wouldn't /etc/fstab be more appropriate?  I always thought autofs was for delayed mounting (as needed) or for uncommon mount requests.  I know it has some other features like LDAP/NSS integration and being able to expire off mounts, but if something is required for boot/login shouldn't it be in /etc/fstab?  For large deployments /etc/fstab can be managed with puppet or chef with similar benefits to using LDAP.  I know the whole reason I installed autofs was to avoid network access during boot so I can mount my NAS when I need it and don't mount it when I don't.  Do you guys have some other usage I didn't mention?

Comment 18 Ian Collier 2015-02-24 18:13:04 UTC
That works fine if your automount maps are stored in local files; however, the problem being described in this bug is that if your maps are on the network then autofs needs network access from the beginning, because it wants the maps when it starts up and it won't go back and load the maps if that failed because the system is not online yet.  Comment 6 contains a suggestion about changing that, but for now, autofs needs the network when it starts up.

Comment 19 Fedora End Of Life 2015-05-29 11:07:08 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 20 info@kobaltwit.be 2015-05-29 12:33:41 UTC
(In reply to Jason Tibbitts from comment #13)
> Regarding comment 12, I don't think autofs can do anything other than wait
> until networkmanager says the network is online.  There's no other metric in
> the system for determining "online-ness".  If NM is reporting that it's online
> when it isn't, that would certainly be a NM issue, but even then all it can
> really do is report that the interface says it's up.  It might even be more
> of a kernel issue, or of a difference in the definition of "online".

I don't know about these details or how this can properly be fixed unfortunately. My simple understanding of this would also be that NetworkManager shouldn't emit a network-online message if it's still waiting for a password. But as you say there may be deeper issues regarding this.

I just want to share my workaround for this in case others are in a similar situation: I have configured my wlan connection at home to be system wide, instead of user only. That way it gets initiated earlier and I'm not being asked for a password at login. As such the wifi is available at the time autofs is activated.

Comment 21 Eric Work 2015-06-17 21:14:17 UTC
For anyone else following this bug report in a situation similar to mine (don't need network at boot), I applied the following workaround in line with the drop-in approach.

sudo systemctl mask NetworkManager-wait-online.service

Since I'm running F22 on a laptop that's completely stand-alone I simply disabled waiting for the network on boot and saved almost 1 minute.  I now understand why autofs.service needs to depend on network-online.target by default for the reasons people mentioned, but for me the above workaround improved the boot time significantly without any side-effects.  Also in the event another service pops up that also requires network-online.target and it's not needed in my particular scenario this will keep the boot time quick.

Comment 22 Fedora End Of Life 2015-06-30 01:27:17 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

