Bug 1953390 - Autofs timeout when mounting dot files or directories
Summary: Autofs timeout when mounting dot files or directories
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: autofs
Version: 34
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Ian Kent
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-04-26 01:13 UTC by Alexi Dimitriadis
Modified: 2022-06-08 06:24 UTC (History)
4 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-06-08 06:24:05 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
LDAP Autofs Configuration (1.06 KB, text/plain)
2021-04-26 10:14 UTC, Alexi Dimitriadis
no flags Details
SSSD Autofs Configuration (378 bytes, text/plain)
2021-04-26 10:17 UTC, Alexi Dimitriadis
no flags Details
SSSD Autofs Debug Log (204.29 KB, text/plain)
2021-04-27 03:20 UTC, Alexi Dimitriadis
no flags Details
Autofs Debug Log (1.73 KB, text/plain)
2021-04-27 03:21 UTC, Alexi Dimitriadis
no flags Details

Description Alexi Dimitriadis 2021-04-26 01:13:45 UTC
Description of problem:
When attempting to mount non-existent dot files or directories, autofs hangs, failing only after retrying for 50 seconds.
We are using SSSD autofs (ldap_autofs_xxx configuration) to point autofs to the automap configuration location.
This causes any application (e.g. kate) that tries to read or write files on an NFS mountpoint managed by autofs to hang for the same duration.

Version-Release number of selected component (if applicable):
5.1.7-12.fc34 and 5.1.7-11.fc34

How reproducible:
Always.

Steps to Reproduce:
1. Install the applicable version-release of autofs
2. Configure SSSD autofs to point to your LDAP automap config
3. Run the command "stat /home/users/.test", where /home/users is an autofs mountpoint and .test is a non-existent dot file/directory
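The delay from step 3 can be measured with a small timing script (a sketch: it uses a local non-existent path so it is self-contained; substitute the autofs path /home/users/.test to reproduce the actual bug):

```python
import errno
import os
import time

def timed_stat(path):
    """Stat a path and return (elapsed seconds, errno or None)."""
    start = time.monotonic()
    try:
        os.stat(path)
        err = None
    except OSError as e:
        err = e.errno
    return time.monotonic() - start, err

elapsed, err = timed_stat("/tmp/.no-such-dotfile-bz1953390")
# On a local filesystem this fails immediately with ENOENT; on the
# affected autofs mountpoint it takes ~50 seconds before failing.
print(f"stat took {elapsed:.2f}s, errno={err}")
```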

Actual results:
After 50 seconds the command times out. Autofs debug log looks like this:
Apr 26 10:54:03 automount[12378]: handle_packet: type = 3
Apr 26 10:54:03 automount[12378]: handle_packet_missing_indirect: token 29, name .test, request pid 12399
Apr 26 10:54:03 automount[12378]: attempting to mount entry /home/users/.test
Apr 26 10:54:03 automount[12378]: lookup_mount: lookup(sss): looking up .test
Apr 26 10:54:03 automount[12378]: can't contact sssd to to lookup key value, retry for 50 seconds

Expected results:
Stat command fails immediately (see additional info)

Additional info:
This issue does not exist when using autofs-5.1.6-11.fc33. The stat fails immediately with the following autofs debug log:
Apr 26 10:50:09 automount[11921]: handle_packet: type = 3
Apr 26 10:50:09 automount[11921]: handle_packet_missing_indirect: token 19, name .test, request pid 11958
Apr 26 10:50:09 automount[11921]: attempting to mount entry /home/users/.test
Apr 26 10:50:09 automount[11921]: lookup_mount: lookup(sss): looking up .test
Apr 26 10:50:09 automount[11921]: dev_ioctl_send_fail: token = 19
Apr 26 10:50:09 automount[11921]: failed to mount /home/users/.test

Comment 1 Ian Kent 2021-04-26 09:41:55 UTC
(In reply to Alexi Dimitriadis from comment #0)
> Description of problem:
> When attempting to mount non-existent dot files or directories, autofs hangs
> indefinitely and fails after retrying for 50 seconds.
> We are using SSSD autofs (ldap_autofs_xxx configuration) to point autofs to
> the automap configuration location.
> This results in any application (e.g. kate) trying to read/write files
> to/from an NFS mountpoint from autofs hanging for the same duration.
> 
> Version-Release number of selected component (if applicable):
> 5.1.7-12.fc34 and 5.1.7-11.fc34
> 
> How reproducible:
> Always.
> 
> Steps to Reproduce:
> 1. Install the applicable version-release of autofs
> 2. Configure SSSD autofs to point to your LDAP automap config
> 3. Run the command "stat /home/users/.test", where /home/users is an autofs
> mountpoint and .test is a non-existent dot file/directory

Is this only for wildcard map entries?

> 
> Actual results:
> After 50 seconds the command times out. Autofs debug log looks like this:
> Apr 26 10:54:03 automount[12378]: handle_packet: type = 3
> Apr 26 10:54:03 automount[12378]: handle_packet_missing_indirect: token 29,
> name .test, request pid 12399
> Apr 26 10:54:03 automount[12378]: attempting to mount entry /home/users/.test
> Apr 26 10:54:03 automount[12378]: lookup_mount: lookup(sss): looking up .test
> Apr 26 10:54:03 automount[12378]: can't contact sssd to to lookup key value,
> retry for 50 seconds

The revision numbers have been messed up somehow.

The changes to sss connection handling were included in upstream 5.1.7, so if
the sss connection handling is broken it should have been seen in any 5.1.7
revision.

Even though the sss handling wasn't changed, there may be something else
happening, but I don't see it yet.

Are you sure the sss ldap server is up and available and sss on the local
machine is up and running?

I tried lots of scenarios and got the correct error returns from sss.
What version of sssd is being used?

> 
> Expected results:
> Stat command fails immediately (see additional info)
> 
> Additional info:
> This issue does not exist when using autofs-5.1.6-11.fc33. The stat fails
> immediately with the following autofs debug log:
> Apr 26 10:50:09 automount[11921]: handle_packet: type = 3
> Apr 26 10:50:09 automount[11921]: handle_packet_missing_indirect: token 19,
> name .test, request pid 11958
> Apr 26 10:50:09 automount[11921]: attempting to mount entry /home/users/.test
> Apr 26 10:50:09 automount[11921]: lookup_mount: lookup(sss): looking up .test
> Apr 26 10:50:09 automount[11921]: dev_ioctl_send_fail: token = 19
> Apr 26 10:50:09 automount[11921]: failed to mount /home/users/.test

Well, yes, as long as sss can perform the lookup or you're using an older revision
of sssd, but that code didn't change between those revisions and yet that's where
it's seeing the problem.

It appears you have an updated sssd and you're getting an EHOSTDOWN from the
autofs sss library, which means wait for the host to become available, rather
than an ECONNREFUSED (sssd not running), which returns an error straight away.
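The two errno values lead to different behaviour; an illustrative sketch (not the actual autofs code) of the distinction:

```python
import errno

def lookup_error_action(err):
    """Illustrative mapping of sss lookup errno values to autofs behaviour."""
    if err == errno.EHOSTDOWN:
        # backend host may come back; autofs waits and retries
        return "wait and retry"
    if err == errno.ECONNREFUSED:
        # sssd itself is not running; no point waiting
        return "fail immediately"
    return "report error"

print(lookup_error_action(errno.EHOSTDOWN))    # wait and retry
print(lookup_error_action(errno.ECONNREFUSED)) # fail immediately
```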

Ian

Comment 2 Alexi Dimitriadis 2021-04-26 10:13:39 UTC
> Is this only for wildcard map entries.
I believe so. My materialized mountpoint configuration looks like this:
# automount -m | grep users
auto.homeusers on /home/users type autofs (rw,relatime,fd=19,pgrp=41429,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=392856)
home-users.nfs.lan.noggin.com.au:/home/users/adimitriadis on /home/users/adimitriadis type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=w.x.y.z.,local_lock=none,addr=a.b.c.d)

A stat of /home/users/.non-existent-file fails after 50 seconds (ENOENT).
A stat of /home/users/adimitriadis/.non-existent-file fails immediately (ENOENT).
I have also attached the relevant LDAP object configuration consumed by sssd.
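Judging from the mount output above, the wildcard entry in auto.homeusers presumably looks something like this (a reconstruction, not taken from the attachment; the option list is guessed from the nfs4 mount options shown):

```
* -fstype=nfs4,rw,soft home-users.nfs.lan.noggin.com.au:/home/users/&
```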

> The revision numbers have been messed up somehow.
>
> The changes to sss connection handling were included in upstream 5.1.7 so if
> the sss coneection handling is broken it should have been seen in any 5.1.7
> revision.
Please ignore autofs-5.1.7-11. There was a Fedora 34 update released over the past weekend that upgraded my autofs to the currently installed version, 5.1.7-12.fc34.
I assumed the previous version was 5.1.7-11, but this was incorrect. I've checked again and the correct version was actually 5.1.7-10.
So to be clear, both 5.1.7-10 and 5.1.7-12 have the issue I am reporting.

> Even though the sss handling wasn't changed there may be something else
> happening but I don't see it yet.

> Are you sure the sss ldap server is up and available and sss on the local
> machine is up and running?
Yes, I am confident that sss is running and available to the local machine, specifically because downgrading autofs to autofs-5.1.6-11.fc33 changes the behaviour (causes the stat command to fail immediately).
The downgrade was done without modifying or restarting the sss daemon (sssd service) in any way.
Because of this I raised the bug as an autofs one rather than sssd.

> I tried lots of scenarios and got the correct error returns from sss.
> What version of sssd is being used?
sssd-2.4.2-33.fc34

Comment 3 Alexi Dimitriadis 2021-04-26 10:14:03 UTC
Created attachment 1775456 [details]
LDAP Autofs Configuration

Comment 4 Alexi Dimitriadis 2021-04-26 10:17:02 UTC
Created attachment 1775457 [details]
SSSD Autofs Configuration

Comment 5 Ian Kent 2021-04-27 03:00:25 UTC
(In reply to Alexi Dimitriadis from comment #2)
> > Is this only for wildcard map entries.
> I believe so. My materialized mountpoint configuration looks like this:
> # automount -m | grep users
> auto.homeusers on /home/users type autofs
> (rw,relatime,fd=19,pgrp=41429,timeout=300,minproto=5,maxproto=5,indirect,
> pipe_ino=392856)
> home-users.nfs.lan.noggin.com.au:/home/users/adimitriadis on
> /home/users/adimitriadis type nfs4
> (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,
> timeo=600,retrans=2,sec=sys,clientaddr=w.x.y.z.,local_lock=none,addr=a.b.c.d)

Can't tell from this if auto.homeusers contains a wildcard map entry.
Never mind we'll push on anyway.

> 
> A stat of /home/users/.non-existent-file fails after 50 seconds (ENOENT).
> A stat of /home/users/adimitriadis/.non-existent-file fails immediately
> (ENOENT).
> I have also attached the relevant LDAP object configuration consumed by LDAP.

Probably not needed.

> 
> > The revision numbers have been messed up somehow.
> >
> > The changes to sss connection handling were included in upstream 5.1.7 so if
> > the sss coneection handling is broken it should have been seen in any 5.1.7
> > revision.
> Please ignore autofs-5.1.7-11. There was an fedora34 update released over
> the past weekend that upgraded my autofs to the current installed version of
> 5.1.7-12.fc34.
> I assumed the previous version was 5.1.7-11, but this was incorrect. I've
> checked again and the correct version was actually 5.1.7-10.
> So to be clear, both 5.1.7-10 and 5.1.7-12 have the issue I am reporting.

Ok, that makes sense.
There are significant changes to the autofs <-> sss communication in these
releases, provided an updated sssd is also in use.

snip ...

> 
> > Are you sure the sss ldap server is up and available and sss on the local
> > machine is up and running?
> Yes, I am confident that sss is running and available to the local machine.
> Specifically because downgrading the autofs version to autofs-5.1.6-11.fc33
> changes the behaviour (causes the stat command to fail immediately).

It doesn't necessarily follow that the downgrade proves everything is up and
running; there are changes in those newer autofs releases that handle additional
returns from newer sss autofs library releases.

It looks like autofs is getting an EHOSTDOWN return from the sss autofs library
which is meant to cause autofs to wait a while for the host to come up.

Is the LDAP server configured in sss responding to queries?

> The downgrade was done without modifying or restarting the sss daemon (sssd
> service) in any way.
> Because of this I raised the bug as an autofs one rather than sssd.

Yes, that's fine, there does seem to be something wrong with the error
return handling.

But be aware that autofs should now wait for a while if sss says the
host is down. It doesn't wait quite as long as sss itself for the host
to come up but it is meant to catch most cases. But I suspect there is
something wrong with the autofs error handling rather than the sss
autofs library.

What would be helpful is a debug log of the whole interaction not just
the snippets above.

I'll need to try and replicate the problem so I can dig deeper into
what's happening.

Those spurious key lookups are frequently a problem. AFAIK they come
from GUI utilities looking up paths they probably shouldn't, but autofs
should see them as non-existent keys and return a failure immediately,
as you expect.

Ian

Comment 6 Ian Kent 2021-04-27 03:11:14 UTC
(In reply to Ian Kent from comment #5)
> 
> I'll need to try and replicate the problem so I can dig deeper into
> what's happening.
> 

In the meantime you should be able to reduce the wait to 5 seconds
by setting "sss_master_map_wait = 1" in /etc/autofs.conf.
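A sketch of that workaround in /etc/autofs.conf (the section name follows autofs.conf(5); judging by the logs in this report, the wait is roughly 5 seconds per unit, so the default of 10 gives the 50-second retry):

```ini
[ autofs ]
# reduce the sss "host down" wait from the default of 10 (~50s) to 1 (~5s)
sss_master_map_wait = 1
```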

Comment 7 Alexi Dimitriadis 2021-04-27 03:20:55 UTC
Created attachment 1775773 [details]
SSSD Autofs Debug Log

Comment 8 Alexi Dimitriadis 2021-04-27 03:21:14 UTC
Created attachment 1775774 [details]
Autofs Debug Log

Comment 9 Alexi Dimitriadis 2021-04-27 03:24:16 UTC
> What would be helpful is a debug log of the whole interaction not just
> the snippets above.
I have attached debug logs for both autofs and sssd-autofs. These logs are for the entire duration of the 'stat' command.

> In the mean time you should be able to reduce the wait to 5 seconds
> by setting "sss_master_map_wait = 1" in /etc/autofs.conf.
Confirmed that causes the stat command to fail after 5 seconds.

Comment 10 Ian Kent 2021-04-27 05:05:16 UTC
I'll have a look at those logs but I'm having trouble duplicating an actual
problem.

-- Journal begins at Tue 2021-04-27 12:01:08 AWST, ends at Tue 2021-04-27 20:01:07 AWST. --

# ldap server is up and running, startup work ok.
Apr 27 12:28:15 fedora automount[4091]: Starting automounter version 5.1.7-12.fc34, master map auto.master
Apr 27 12:28:15 fedora automount[4091]: using kernel protocol version 5.05
Apr 27 12:28:15 fedora automount[4091]: lookup_nss_read_master: reading master files auto.master
Apr 27 12:28:15 fedora automount[4091]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:28:15 fedora automount[4091]: lookup_read_master: lookup(file): read entry +auto.master
Apr 27 12:28:15 fedora automount[4091]: lookup_nss_read_master: reading master files auto.master
Apr 27 12:28:15 fedora automount[4091]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:28:15 fedora automount[4091]: lookup_nss_read_master: reading master sss auto.master
Apr 27 12:28:15 fedora automount[4091]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:28:16 fedora automount[4091]: master_do_mount: mounting /-
Apr 27 12:28:16 fedora automount[4091]: automount_path_to_fifo: fifo name /run/autofs.fifo--
Apr 27 12:28:16 fedora automount[4091]: lookup_nss_read_map: reading map files auto.direct
Apr 27 12:28:16 fedora automount[4091]: lookup_nss_read_map: reading map sss auto.direct
Apr 27 12:28:16 fedora automount[4091]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:28:16 fedora automount[4091]: mounted direct on /nfs/gandalf/bin with timeout 300, freq 75 seconds
Apr 27 12:28:16 fedora automount[4091]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/bin
Apr 27 12:28:16 fedora automount[4091]: mounted direct on /nfs/gandalf/etc with timeout 300, freq 75 seconds
Apr 27 12:28:16 fedora automount[4091]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/etc
Apr 27 12:28:16 fedora automount[4091]: mounted direct on /nfs/gandalf/man with timeout 300, freq 75 seconds
Apr 27 12:28:16 fedora automount[4091]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/man
Apr 27 12:28:16 fedora automount[4091]: mounted direct on /nfs/gandalf/test with timeout 300, freq 75 seconds
Apr 27 12:28:16 fedora automount[4091]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/test
Apr 27 12:28:16 fedora automount[4091]: mounted direct on /nfs/gandalf/sbin with timeout 300, freq 75 seconds
Apr 27 12:28:16 fedora automount[4091]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/sbin
Apr 27 12:28:16 fedora automount[4091]: st_ready: st_ready(): state = 0 path /-
Apr 27 12:28:16 fedora automount[4091]: master_do_mount: mounting /test
Apr 27 12:28:16 fedora automount[4091]: automount_path_to_fifo: fifo name /run/autofs.fifo-test
Apr 27 12:28:16 fedora automount[4091]: lookup_nss_read_map: reading map files auto.indirect
Apr 27 12:28:16 fedora automount[4091]: lookup_nss_read_map: reading map sss auto.indirect
Apr 27 12:28:16 fedora automount[4091]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:28:16 fedora automount[4091]: lookup_read_map: map read not needed, so not done
Apr 27 12:28:16 fedora automount[4091]: mounted indirect on /test with timeout 300, freq 75 seconds
Apr 27 12:28:16 fedora automount[4091]: st_ready: st_ready(): state = 0 path /test
Apr 27 12:28:16 fedora systemd[1]: Started Automounts filesystems on demand.

# and /test/.bogus lookup fails as expected.
Apr 27 12:29:47 fedora automount[4091]: handle_packet: type = 3
Apr 27 12:29:47 fedora automount[4091]: handle_packet_missing_indirect: token 1, name .bogus, request pid 4181
Apr 27 12:29:47 fedora automount[4091]: attempting to mount entry /test/.bogus
Apr 27 12:29:47 fedora automount[4091]: lookup_mount: lookup(sss): looking up .bogus
Apr 27 12:29:49 fedora automount[4091]: key ".bogus" not found in map source(s).
Apr 27 12:29:49 fedora automount[4091]: dev_ioctl_send_fail: token = 1
Apr 27 12:29:49 fedora automount[4091]: failed to mount /test/.bogus
Apr 27 12:29:49 fedora automount[4091]: handle_packet: type = 3
Apr 27 12:29:49 fedora automount[4091]: handle_packet_missing_indirect: token 2, name .bogus, request pid 4181
Apr 27 12:29:49 fedora automount[4091]: dev_ioctl_send_fail: token = 2
Apr 27 12:30:10 fedora sssd_be[3915]: Backend is online

# and /test/.test lookup fails as expected.
Apr 27 12:30:42 fedora automount[4091]: handle_packet: type = 3
Apr 27 12:30:42 fedora automount[4091]: handle_packet_missing_indirect: token 3, name .test, request pid 4199
Apr 27 12:30:42 fedora automount[4091]: attempting to mount entry /test/.test
Apr 27 12:30:42 fedora automount[4091]: lookup_mount: lookup(sss): looking up .test
Apr 27 12:30:43 fedora automount[4091]: key ".test" not found in map source(s).
Apr 27 12:30:43 fedora automount[4091]: dev_ioctl_send_fail: token = 3
Apr 27 12:30:43 fedora automount[4091]: failed to mount /test/.test
Apr 27 12:30:43 fedora automount[4091]: handle_packet: type = 3
Apr 27 12:30:43 fedora automount[4091]: handle_packet_missing_indirect: token 4, name .test, request pid 4199
Apr 27 12:30:43 fedora automount[4091]: dev_ioctl_send_fail: token = 4

# ldap server shut down, EHOSTDOWN is seen and mount attempt waits, then fails, sss_master_map_wait = 10 (default value).
Apr 27 12:32:16 fedora automount[4091]: handle_packet: type = 3
Apr 27 12:32:16 fedora automount[4091]: handle_packet_missing_indirect: token 5, name .test, request pid 4230
Apr 27 12:32:16 fedora automount[4091]: attempting to mount entry /test/.test
Apr 27 12:32:16 fedora automount[4091]: lookup_mount: lookup(sss): looking up .test
Apr 27 12:32:16 fedora automount[4091]: can't contact sssd to to lookup key value, retry for 50 seconds
Apr 27 12:33:06 fedora automount[4091]: getautomntbyname: lookup(sss): getautomntbyname: Host is down
Apr 27 12:33:06 fedora automount[4091]: lookup(sss): lookup for .test failed: connection failed
Apr 27 12:33:06 fedora automount[4091]: key ".test" not found in map source(s).
Apr 27 12:33:06 fedora automount[4091]: dev_ioctl_send_fail: token = 5
Apr 27 12:33:06 fedora automount[4091]: failed to mount /test/.test
Apr 27 12:33:06 fedora automount[4091]: handle_packet: type = 3
Apr 27 12:33:06 fedora automount[4091]: handle_packet_missing_indirect: token 6, name .test, request pid 4230
Apr 27 12:33:06 fedora automount[4091]: dev_ioctl_send_fail: token = 6
Apr 27 12:33:45 fedora sssd_be[3915]: Backend is offline

# shutdown autofs
Apr 27 12:33:56 fedora automount[4091]: do_notify_state: signal 15
Apr 27 12:33:56 fedora automount[4091]: master_notify_state_change: sig 15 switching /- from 1 to 5
Apr 27 12:33:56 fedora systemd[1]: Stopping Automounts filesystems on demand...
Apr 27 12:33:56 fedora automount[4091]: master_notify_state_change: sig 15 switching /test from 1 to 5
Apr 27 12:33:56 fedora automount[4091]: st_prepare_shutdown: state 1 path /test
Apr 27 12:33:56 fedora automount[4091]: expire_proc: exp_proc = 139641171613248 path /test
Apr 27 12:33:56 fedora automount[4091]: st_prepare_shutdown: state 1 path /-
Apr 27 12:33:56 fedora automount[4091]: expire_proc: exp_proc = 139641152853568 path /-
Apr 27 12:33:56 fedora automount[4091]: expire_cleanup: got thid 139641171613248 path /test stat 0
Apr 27 12:33:56 fedora automount[4091]: expire_cleanup: sigchld: exp 139641171613248 finished, switching from 5 to 7
Apr 27 12:33:56 fedora automount[4091]: st_shutdown: state 5 path /test
Apr 27 12:33:56 fedora automount[4091]: expire_cleanup: got thid 139641152853568 path /- stat 0
Apr 27 12:33:56 fedora automount[4091]: expire_cleanup: sigchld: exp 139641152853568 finished, switching from 5 to 7
Apr 27 12:33:56 fedora automount[4091]: st_shutdown: state 5 path /-
Apr 27 12:33:56 fedora automount[4091]: umount_multi: path /test incl 0
Apr 27 12:33:56 fedora automount[4091]: umounted indirect mount /test
Apr 27 12:33:56 fedora automount[4091]: automount_path_to_fifo: fifo name /run/autofs.fifo-test
Apr 27 12:33:56 fedora automount[4091]: shut down path /test
Apr 27 12:33:56 fedora automount[4091]: umount_multi: path /nfs/gandalf/bin incl 0
Apr 27 12:33:56 fedora automount[4091]: umounted direct mount /nfs/gandalf/bin
Apr 27 12:33:56 fedora automount[4091]: umount_multi: path /nfs/gandalf/etc incl 0
Apr 27 12:33:56 fedora automount[4091]: umounted direct mount /nfs/gandalf/etc
Apr 27 12:33:56 fedora automount[4091]: umount_multi: path /nfs/gandalf/man incl 0
Apr 27 12:33:56 fedora automount[4091]: umounted direct mount /nfs/gandalf/man
Apr 27 12:33:56 fedora automount[4091]: umount_multi: path /nfs/gandalf/test incl 0
Apr 27 12:33:56 fedora automount[4091]: umounted direct mount /nfs/gandalf/test
Apr 27 12:33:56 fedora automount[4091]: umount_multi: path /nfs/gandalf/sbin incl 0
Apr 27 12:33:56 fedora automount[4091]: umounted direct mount /nfs/gandalf/sbin
Apr 27 12:33:56 fedora automount[4091]: automount_path_to_fifo: fifo name /run/autofs.fifo--
Apr 27 12:33:56 fedora automount[4091]: shut down path /-
Apr 27 12:33:56 fedora automount[4091]: autofs stopped
Apr 27 12:33:56 fedora systemd[1]: autofs.service: Deactivated successfully.
Apr 27 12:33:56 fedora systemd[1]: Stopped Automounts filesystems on demand.
Apr 27 12:33:56 fedora systemd[1]: Starting Automounts filesystems on demand...

# start autofs with backend ldap server down
Apr 27 12:33:56 fedora automount[4289]: Starting automounter version 5.1.7-12.fc34, master map auto.master
Apr 27 12:33:56 fedora automount[4289]: using kernel protocol version 5.05
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_master: reading master files auto.master
Apr 27 12:33:56 fedora automount[4289]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:33:56 fedora automount[4289]: lookup_read_master: lookup(file): read entry +auto.master
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_master: reading master files auto.master
Apr 27 12:33:56 fedora automount[4289]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_master: reading master sss auto.master
Apr 27 12:33:56 fedora automount[4289]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:33:56 fedora automount[4289]: master_do_mount: mounting /-
Apr 27 12:33:56 fedora automount[4289]: automount_path_to_fifo: fifo name /run/autofs.fifo--
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_map: reading map files auto.direct
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_map: reading map sss auto.direct
Apr 27 12:33:56 fedora automount[4289]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:33:56 fedora automount[4289]: mounted direct on /nfs/gandalf/bin with timeout 300, freq 75 seconds
Apr 27 12:33:56 fedora automount[4289]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/bin
Apr 27 12:33:56 fedora automount[4289]: mounted direct on /nfs/gandalf/etc with timeout 300, freq 75 seconds
Apr 27 12:33:56 fedora automount[4289]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/etc
Apr 27 12:33:56 fedora automount[4289]: mounted direct on /nfs/gandalf/man with timeout 300, freq 75 seconds
Apr 27 12:33:56 fedora automount[4289]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/man
Apr 27 12:33:56 fedora automount[4289]: mounted direct on /nfs/gandalf/test with timeout 300, freq 75 seconds
Apr 27 12:33:56 fedora automount[4289]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/test
Apr 27 12:33:56 fedora automount[4289]: mounted direct on /nfs/gandalf/sbin with timeout 300, freq 75 seconds
Apr 27 12:33:56 fedora automount[4289]: do_mount_autofs_direct: mounted trigger /nfs/gandalf/sbin
Apr 27 12:33:56 fedora automount[4289]: st_ready: st_ready(): state = 0 path /-
Apr 27 12:33:56 fedora automount[4289]: master_do_mount: mounting /test
Apr 27 12:33:56 fedora automount[4289]: automount_path_to_fifo: fifo name /run/autofs.fifo-test
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_map: reading map files auto.indirect
Apr 27 12:33:56 fedora automount[4289]: lookup_nss_read_map: reading map sss auto.indirect
Apr 27 12:33:56 fedora automount[4289]: do_init: parse(sun): init gathered global options: (null)
Apr 27 12:33:56 fedora automount[4289]: lookup_read_map: map read not needed, so not done
Apr 27 12:33:56 fedora automount[4289]: mounted indirect on /test with timeout 300, freq 75 seconds
Apr 27 12:33:56 fedora automount[4289]: st_ready: st_ready(): state = 0 path /test
Apr 27 12:33:56 fedora systemd[1]: Started Automounts filesystems on demand.

# autofs started ok, from sssd cached maps since server is still down.

# but lookup of /test/.test waits because backend host is down, as expected, sss_master_map_wait = 1.
Apr 27 12:34:12 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:34:12 fedora automount[4289]: handle_packet_missing_indirect: token 7, name .test, request pid 4305
Apr 27 12:34:12 fedora automount[4289]: attempting to mount entry /test/.test
Apr 27 12:34:12 fedora automount[4289]: lookup_mount: lookup(sss): looking up .test
Apr 27 12:34:12 fedora automount[4289]: can't contact sssd to to lookup key value, retry for 5 seconds
Apr 27 12:34:17 fedora automount[4289]: getautomntbyname: lookup(sss): getautomntbyname: Host is down
Apr 27 12:34:17 fedora automount[4289]: lookup(sss): lookup for .test failed: connection failed
Apr 27 12:34:17 fedora automount[4289]: key ".test" not found in map source(s).
Apr 27 12:34:17 fedora automount[4289]: dev_ioctl_send_fail: token = 7
Apr 27 12:34:17 fedora automount[4289]: failed to mount /test/.test
Apr 27 12:34:17 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:34:17 fedora automount[4289]: handle_packet_missing_indirect: token 8, name .test, request pid 4305
Apr 27 12:34:17 fedora automount[4289]: dev_ioctl_send_fail: token = 8

# ldap server started again, but not seen by sssd yet.
Apr 27 12:40:48 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:40:48 fedora automount[4289]: handle_packet_missing_indirect: token 9, name .test, request pid 4449
Apr 27 12:40:48 fedora automount[4289]: attempting to mount entry /test/.test
Apr 27 12:40:48 fedora automount[4289]: lookup_mount: lookup(sss): looking up .test
Apr 27 12:40:48 fedora automount[4289]: can't contact sssd to to lookup key value, retry for 5 seconds
Apr 27 12:40:53 fedora automount[4289]: getautomntbyname: lookup(sss): getautomntbyname: Host is down
Apr 27 12:40:53 fedora automount[4289]: lookup(sss): lookup for .test failed: connection failed
Apr 27 12:40:53 fedora automount[4289]: key ".test" not found in map source(s).
Apr 27 12:40:53 fedora automount[4289]: dev_ioctl_send_fail: token = 9
Apr 27 12:40:53 fedora automount[4289]: failed to mount /test/.test

# ldap server started again but the negative-cached .test still returns fail.
Apr 27 12:40:53 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:40:53 fedora automount[4289]: handle_packet_missing_indirect: token 10, name .test, request pid 4449
Apr 27 12:40:53 fedora automount[4289]: dev_ioctl_send_fail: token = 10
Apr 27 12:41:10 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:41:10 fedora automount[4289]: handle_packet_missing_indirect: token 11, name .test, request pid 4457
Apr 27 12:41:10 fedora automount[4289]: dev_ioctl_send_fail: token = 11
Apr 27 12:41:10 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:41:10 fedora automount[4289]: handle_packet_missing_indirect: token 12, name .test, request pid 4457
Apr 27 12:41:10 fedora automount[4289]: dev_ioctl_send_fail: token = 12
Apr 27 12:42:00 fedora automount[4289]: handle_packet: type = 3

# try another key, works as expected.
Apr 27 12:42:00 fedora automount[4289]: handle_packet_missing_indirect: token 13, name .bogus, request pid 4472
Apr 27 12:42:00 fedora automount[4289]: attempting to mount entry /test/.bogus
Apr 27 12:42:00 fedora automount[4289]: lookup_mount: lookup(sss): looking up .bogus
Apr 27 12:42:02 fedora automount[4289]: key ".bogus" not found in map source(s).
Apr 27 12:42:02 fedora automount[4289]: dev_ioctl_send_fail: token = 13
Apr 27 12:42:02 fedora automount[4289]: failed to mount /test/.bogus
Apr 27 12:42:02 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:42:02 fedora automount[4289]: handle_packet_missing_indirect: token 14, name .bogus, request pid 4472
Apr 27 12:42:02 fedora automount[4289]: dev_ioctl_send_fail: token = 14
Apr 27 12:42:19 fedora automount[4289]: handle_packet: type = 3

# ldap server is seen as back up, works as expected.
Apr 27 12:42:19 fedora automount[4289]: handle_packet_missing_indirect: token 15, name .test, request pid 4479
Apr 27 12:42:19 fedora automount[4289]: attempting to mount entry /test/.test
Apr 27 12:42:19 fedora automount[4289]: lookup_mount: lookup(sss): looking up .test
Apr 27 12:42:20 fedora automount[4289]: key ".test" not found in map source(s).
Apr 27 12:42:20 fedora automount[4289]: dev_ioctl_send_fail: token = 15
Apr 27 12:42:20 fedora automount[4289]: failed to mount /test/.test
Apr 27 12:42:20 fedora automount[4289]: handle_packet: type = 3
Apr 27 12:42:20 fedora automount[4289]: handle_packet_missing_indirect: token 16, name .test, request pid 4479
Apr 27 12:42:20 fedora automount[4289]: dev_ioctl_send_fail: token = 16

The problem here is that in order to know the key is invalid autofs must contact
the sssd backend ldap server.

Comment 11 Ian Kent 2021-04-27 06:52:47 UTC
(In reply to Alexi Dimitriadis from comment #3)
> Created attachment 1775456 [details]
> LDAP Autofs Configuration

This looks a bit odd.

You have most of what's needed but I believe you need a container
entry for the auto.homeusers map ...

There's a container entry for auto.master:

dn: CN=auto.master,OU=automount,DC=DC=aa,DC=bb,DC=cc
objectClass: top
objectClass: nisMap
cn: auto.master
distinguishedName: CN=auto.master,OU=automount,DC=DC=aa,DC=bb,DC=cc
instanceType: 4
name: auto.master
nisMapName: auto.master

But there's no container entry for the map auto.homeusers.

Probably should be something like:

dn: CN=auto.homeusers,OU=automount,DC=DC=aa,DC=bb,DC=cc
objectClass: top
objectClass: nisMap
cn: auto.homeusers
distinguishedName: CN=auto.homeusers,OU=automount,DC=DC=aa,DC=bb,DC=cc
name: auto.homeusers
nisMapName: auto.homeusers

I'm not sure if this will actually make a difference for sssd.

Comment 12 Ian Kent 2021-04-27 06:58:55 UTC
(In reply to Alexi Dimitriadis from comment #4)
> Created attachment 1775457 [details]
> SSSD Autofs Configuration

I can't really comment on this as my sssd configuration is a little
different and I'm not very familiar with sssd configurations.

All I can say is I used RHEL documentation to create it:

[sssd]
domains = LDAP
services = autofs

[autofs]

[domain/LDAP]
id_provider = ldap
ldap_uri = ldap://centos7.themaw.net
autofs_provider=ldap
ldap_autofs_search_base=cn=automount,dc=themaw,dc=net
ldap_autofs_map_object_class=automountMap
ldap_autofs_entry_object_class=automount
ldap_autofs_map_name=automountMapName
ldap_autofs_entry_key=automountKey
ldap_autofs_entry_value=automountInformation

The different attribute names shouldn't make a difference; all that
should be needed is that they are used consistently in the autofs LDAP
entries.
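
For illustration, a map container plus one entry consistent with the schema
settings above might look like the following (the map name, key, host, and
path are placeholders, not taken from the reporter's directory):

dn: automountMapName=auto.home,cn=automount,dc=themaw,dc=net
objectClass: automountMap
automountMapName: auto.home

dn: automountKey=someuser,automountMapName=auto.home,cn=automount,dc=themaw,dc=net
objectClass: automount
automountKey: someuser
automountInformation: host.example.com:/export/home/someuser

Note the container entry for the map itself, matching the point in comment 11
that each map needs its own container object.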

Other than that I'm not sure.

Comment 13 Ian Kent 2021-04-27 07:05:32 UTC
(In reply to Alexi Dimitriadis from comment #8)
> Created attachment 1775774 [details]
> Autofs Debug Log

Pretty much useless; it's a log fragment, not a full log with an example of
a successful lookup and mount as well as the failure.

Comment 14 Ian Kent 2021-04-27 07:12:00 UTC
And ...

> 
> dn: CN=auto.master,OU=automount,DC=DC=aa,DC=bb,DC=cc

what is this?

Shouldn't that be DC=aa,DC=bb,DC=cc?

And nowhere in that sssd-autofs-debug-1.txt is there a search in the
above domain.

Are you sure your sssd configuration is actually correct?

Comment 15 Glenn Morris 2021-06-10 17:42:56 UTC
I'm seeing what seems to be the same issue after updating from CentOS 8.3 to 8.4.
This took autofs from 5.1.4-43 to 5.1.4-48.
The relevant change is presumably one from https://bugzilla.redhat.com/show_bug.cgi?id=1892184

Comment 16 Glenn Morris 2021-06-10 19:27:26 UTC
With autofs debugging, I have something like this for a good lookup:

10:59:35 attempting to mount entry /home/someuser
10:59:35 lookup_mount: lookup(sss): looking up someuser
10:59:35 lookup_mount: lookup(sss): someuser -> some.host.org:/path/to/home/&

Whereas for a non-existent entry:

10:58:00 attempting to mount entry /home/BOGUS
10:58:00 lookup_mount: lookup(sss): looking up BOGUS
10:58:00 can't contact sssd to to lookup key value, retry for 5 seconds
10:58:05 getautomntbyname: lookup(sss): getautomntbyname: Host is down
10:58:05 lookup(sss): lookup for BOGUS failed: connection failed
10:58:05 key "BOGUS" not found in map source(s)

With sssd debugging, I see (abbreviated) for a good lookup:

Looking up [auto.home:someuser] in data provider
Looking up [auto.home:someuser] in cache
Returning updated object [auto.home:someuser]
Found 1 entries in domain default
endautomntent called


And for a non-existent lookup:

Obtaining autofs entry auto.home:BOGUS
New request 'Get autofs entry'
Looking up [auto.home:BOGUS] in data provider
Looking up [auto.home:BOGUS] in cache
[...]
Object [auto.home:BOGUS] was not found in cache
Finished: Error 1432158212: SSSD is offline

SSSD is NOT offline. It seems like "no entry in map" is being confused with "SSSD is offline"?

In https://bugzilla.redhat.com/show_bug.cgi?id=1648521#c33 I see comments:
"These patches will help autofs to distinguish between "sssd is offline" and "sssd did not found any autofs maps". Note that it also requires update in autofs which will land separately." referring to https://github.com/SSSD/sssd/pull/5343
I see now that this pull is mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1892184#c40

Comment 17 Glenn Morris 2021-06-10 21:11:40 UTC
I found a solution that seems to work for my use case.
I have two domains in sssd, for two classes of user accounts.
One authenticates with krb5, one with ldap, but apart from that they are identical.
In particular both domains use the same ldap directory for automount information.
I had duplicated the autofs sections for the two domains in sssd.conf.
Setting autofs_provider = none (rather than ldap) for one of the sssd domains has removed the slowdown for me when looking up bogus entries.
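
For reference, the change amounts to something like this in sssd.conf (the
domain names are placeholders for the two domains described above, and the
omitted lines are each domain's existing settings):

[domain/KRB5_DOMAIN]
...
autofs_provider = none

[domain/LDAP_DOMAIN]
...
autofs_provider = ldap

That way only one domain answers autofs lookups against the shared directory,
so a miss in the other domain no longer triggers the offline/retry path.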

Sorry for the noise!

Comment 18 Ben Cotton 2022-05-12 16:52:22 UTC
This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 34 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 19 Ben Cotton 2022-06-08 06:24:05 UTC
Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07.

Fedora Linux 34 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release.

Thank you for reporting this bug and we are sorry it could not be fixed.