Bug 803813 - automount keeps attempting to bind to ldap
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: autofs
Version: 16
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Assigned To: Ian Kent
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-03-15 13:07 EDT by Orion Poplawski
Modified: 2015-07-09 06:30 EDT
CC: 3 users

Doc Type: Bug Fix
Last Closed: 2013-02-13 09:50:21 EST
Type: Bug


Attachments:
debug log (20.41 KB, application/x-gzip), 2012-04-03 13:13 EDT, Orion Poplawski
sssd logs (1.99 KB, application/x-gzip), 2012-05-02 12:41 EDT, Orion Poplawski
sssd_default.log (85.88 KB, text/plain), 2012-05-02 13:11 EDT, Orion Poplawski
autofs sss debug (206.44 KB, text/plain), 2012-06-05 23:56 EDT, Orion Poplawski

Description Orion Poplawski 2012-03-15 13:07:01 EDT
Description of problem:

I'm seeing that on our laptops, when off the network, automount keeps attempting to contact our LDAP server:

Mar 15 07:07:01 paiute automount[2269]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: (default), error Can't contact LDAP server
Mar 15 07:07:09 paiute automount[2269]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: (default), error Can't contact LDAP server
Mar 15 07:07:09 paiute automount[2269]: key "local" not found in map source(s).
Mar 15 07:09:02 paiute automount[2269]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: (default), error Can't contact LDAP server
Mar 15 07:09:10 paiute automount[2269]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: (default), error Can't contact LDAP server
Mar 15 07:09:10 paiute automount[2269]: key "local" not found in map source(s).

This appears to be new behavior, and it makes path traversal slow (/nfs/local/bin is in $PATH and is automounted when on our network).

Version-Release number of selected component (if applicable):
autofs-5.0.6-5.fc16.x86_64
Comment 1 Ian Kent 2012-03-15 22:24:02 EDT
New behaviour from when?

Did you recently update autofs?
Any idea from what revision?
Have you recently updated the kernel?
Any idea from what revision?

Or is it that you've only just noticed it due to differing
usage patterns?

At first glance there don't appear to be any recent changes
that would cause additional lookups. The only recent kernel
changes that may have found their way into the Fedora kernel
should make mount requests less likely, so that probably isn't
it.

Could you provide a debug log of a session that has the problem?

Ian
Comment 2 Orion Poplawski 2012-03-16 11:05:42 EDT
It appears to be happening on two of my more recently installed (Feb 27/28) F16 laptops.  I install with updates enabled, so probably something changed.  One difference I've found is in nsswitch.conf:

< bad, > good
< hosts:      files mdns4_minimal [NOTFOUND=return] dns
---
> hosts:      files dns

So I'm going to remove the mdns4_minimal entry and see if that makes a difference.  I'll also turn on debugging and post that.
Comment 3 Orion Poplawski 2012-04-03 13:13:16 EDT
Created attachment 574911 [details]
debug log

Here is the debug log from today.  We have /nfs/local in the path and /nfs is automounted.  If you are on a network, but not ours, the automounter continually tries to contact the LDAP server.  Access to the LDAP server is blocked by our firewall and the packets are simply dropped, which might be playing a role here too.

The 1920 pid around 10am is when I was testing.  Quite frequently, when something tried to access /nfs/local there was a significant delay, which is really unworkable.


I also noticed that restarting autofs took a very long time.  It appears that it would time out trying to connect to the LDAP server for each map.  It seems to me that automount should mark an LDAP server as unavailable; if network conditions change, a reload or restart could then be done.  This was at Apr  3 10:21:11.
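A reload here would just mean sending automount a HUP so it re-reads its maps, e.g.:

# kill -HUP $(pidof automount)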
Comment 4 Orion Poplawski 2012-04-03 13:28:07 EDT
Maybe it looks it up again after "expiring":

Apr  3 10:14:04 paiute automount[1920]: handle_packet: type = 3
Apr  3 10:14:04 paiute automount[1920]: handle_packet_missing_indirect: token 35, name local, request pid 2341
Apr  3 10:14:04 paiute automount[1920]: attempting to mount entry /nfs/local
Apr  3 10:14:04 paiute automount[1920]: lookup_name_file_source_instance: file map not found
Apr  3 10:14:04 paiute automount[1920]: lookup_mount: lookup(ldap): looking up local
Apr  3 10:14:04 paiute automount[1920]: do_bind: lookup(ldap): auth_required: 1, sasl_mech (null)
Apr  3 10:14:05 paiute automount[1920]: st_expire: state 1 path /nfs
Apr  3 10:14:05 paiute automount[1920]: expire_proc: exp_proc = 139984337409792 path /nfs
Apr  3 10:14:05 paiute automount[1920]: expire_cleanup: got thid 139984337409792 path /nfs stat 0
Apr  3 10:14:05 paiute automount[1920]: expire_cleanup: sigchld: exp 139984337409792 finished, switching from 2 to 1
Apr  3 10:14:05 paiute automount[1920]: st_ready: st_ready(): state = 2 path /nfs
Apr  3 10:14:18 paiute automount[1920]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: (default), error Can't contact LDAP server
Apr  3 10:14:18 paiute automount[1920]: do_bind: lookup(ldap): ldap simple bind returned -1
Apr  3 10:14:18 paiute automount[1920]: do_bind: lookup(ldap): auth_required: 1, sasl_mech (null)
Apr  3 10:14:26 paiute automount[1920]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: (default), error Can't contact LDAP server
Apr  3 10:14:26 paiute automount[1920]: do_bind: lookup(ldap): ldap simple bind returned -1
Apr  3 10:14:26 paiute automount[1920]: lookup(ldap): couldn't connect to server default
Apr  3 10:14:26 paiute automount[1920]: lookup(ldap): lookup for local failed: connection failed
Apr  3 10:14:26 paiute automount[1920]: key "local" not found in map source(s).
Apr  3 10:14:26 paiute automount[1920]: dev_ioctl_send_fail: token = 35
Comment 5 Orion Poplawski 2012-04-09 11:51:31 EDT
Any progress here?  Anything else I can do to help debug?  This is becoming a big problem for at least one of our users and I may need to implement a workaround if this is going to take a long time to address.  Thanks!
Comment 6 Ian Kent 2012-04-09 21:52:04 EDT
(In reply to comment #5)
> Any progress here?  Anything else I can do to help debug?  This is becoming a
> big problem for at least one of our users and I may need to implement a
> workaround if this is going to take a long time to address.  Thanks!

Not really.

Just so I understand, you do have LDAP as a map source in
nsswitch.conf, right? And it appears to you that something has
changed because the perceived behaviour is different?

The log looks like it should to me and that part of the code
hasn't changed for a while.

You could try setting LDAP_TIMEOUT to a value other than the
LDAP library default. The LDAP_NETWORK_TIMEOUT should already
be 8 seconds but you might like to lower that as well.

Let me know if that makes any difference.
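On Fedora both are set in /etc/sysconfig/autofs; a minimal sketch (the
values here are illustrative, not recommendations):

# /etc/sysconfig/autofs
LDAP_TIMEOUT=4            # timeout for synchronous LDAP calls (library default if unset)
LDAP_NETWORK_TIMEOUT=3    # network connect timeout, normally 8 seconds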
Comment 7 Ian Kent 2012-04-09 22:22:36 EDT
(In reply to comment #6)
> (In reply to comment #5)
> > Any progress here?  Anything else I can do to help debug?  This is becoming a
> > big problem for at least one of our users and I may need to implement a
> > workaround if this is going to take a long time to address.  Thanks!
> 
> Not really.

But ...

> 
> The log looks like it should to me and that part of the code
> hasn't changed for a while.

There is an unconditional retry of a failed connection which
is only actually needed if using SASL. I can at least get rid
of that extra bind attempt for connections not using SASL.

> 
> You could try setting LDAP_TIMEOUT to a value other than the
> LDAP library default. The LDAP_NETWORK_TIMEOUT should already
> be 8 seconds but you might like to lower that as well.
> 

Which, together with this, might help.

Ian
Comment 8 Orion Poplawski 2012-04-10 11:22:18 EDT
(In reply to comment #6)
> Just so I understand, you do have LDAP as a map source in
> nsswitch.conf, right? And it appears to you that something has
> changed because the perceived behaviour is different?

Yes.  We now (say with F14 -> F16) see longish delays when programs walk $PATH because of the lookup of /nfs/local.  Unfortunately I have yet to fully determine what it takes to reproduce it.

> The log looks like it should to me and that part of the code
> hasn't changed for a while.

I'm a little surprised that it appears to try to bind to ldap again after startup, apparently after expire processing.  Is that by design?  I guess I can see arguments both ways, but at least in this case we're never going to be able to contact the server.

> You could try setting LDAP_TIMEOUT to a value other than the
> LDAP library default. The LDAP_NETWORK_TIMEOUT should already
> be 8 seconds but you might like to lower that as well.
> 
> Let me know if that makes any difference.

Will do.
Comment 9 Ian Kent 2012-04-11 04:46:53 EDT
(In reply to comment #8)
> 
> > The log looks like it should to me and that part of the code
> > hasn't changed for a while.
> 
> I'm a little surprised that it appears to try to bind to ldap again after
> startup, apparently after expire processing.  Is that by design?  I guess I can
> see arguments both ways, but at least in this case we're never going to be able
> to contact the server.

As I say there is a retry in there that can, and should, go.

The idea behind binding to the LDAP server for each operation
is that, if there are a large number of clients, maintaining
a constant connection from every client would create a large
number of socket connections to the server and that would
unnecessarily load it.
Comment 10 Orion Poplawski 2012-04-11 09:45:11 EDT
(In reply to comment #9)
> The idea behind binding to the LDAP server for each operation
> is that, if there are a large number of clients, maintaining
> a constant connection from every client would create a large
> number of socket connections to the server and that would
> unnecessarily load it.

That makes sense.  What I'm suggesting is to flag a server as down when you can't connect to it, to prevent the subsequent connection attempts that lead to the long startup times noted at the end of comment #3.

I've also discovered what has changed in our configuration that is triggering this - I have permanent entries in auto.master now:

# cat auto.master
/home   auto.home       fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,actimeo=1,noatime
/data   auto.data4      fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
/nfs    auto.nfs        fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
+auto.master

This is somewhat temporary while we transition to kerberized NFS, but I expect that to take a while.  Before, we had just the +auto.master entry, so there were no maps when off the network.
Comment 11 Ian Kent 2012-04-11 21:16:41 EDT
> (In reply to comment #9)
> > The idea behind binding to the LDAP server for each operation
> > is that, if there are a large number of clients, maintaining
> > a constant connection from every client would create a large
> > number of socket connections to the server and that would
> > unnecessarily load it.
>
> That makes sense.  What I'm suggesting is to flag a server as down when you
> can't connect to it, to prevent the subsequent connection attempts that lead
> to the long startup times noted at the end of comment #3.

Yeah, that's not as simple as it sounds.

The lookup modules are "isolated".

Reading the master map opens the relevant lookup module(s),
reads the master map, and then closes them, since that's done
only once at startup and again when a HUP signal is received.
Then each map opens its own lookup module for map reading and
lookups, since each map can come from any source.

So, while I do need to eliminate the additional bind attempt
that's made (which would halve the time), knowing that a given
server is not contactable needs a separate shared implementation,
something like an LDAP connection pooling system. The main
difficulty is knowing when to try again. The point is, that
isn't something that can be done quickly.

But the recently added sssd support should do what you need.
I've found a couple of problems with it recently whose fixes
I need to apply to F16 and push out, and I'm seeing another
odd problem between autofs and sssd for one of the sssd entry
points that I need to sort out. The point is, we have already
spent the time on it, and we need people to use it to identify
and iron out any problems.

What schema do your LDAP maps use?
Comment 12 Ian Kent 2012-04-11 21:26:28 EDT
(In reply to comment #11)
> 
> But the recently added sssd support should do what you need.
> I've found a couple of problems with it recently whose fixes
> I need to apply to F16 and push out, and I'm seeing another
> odd problem between autofs and sssd for one of the sssd entry
> points that I need to sort out. The point is, we have already
> spent the time on it, and we need people to use it to identify
> and iron out any problems.

Oh dear, my mistake: this is only in F17. However, the sssd
bits are included in F16, and backporting the autofs bits
should be fairly straightforward.

Are you in a position to give this a go?
Comment 13 Orion Poplawski 2012-04-12 10:59:59 EDT
Sure.  Always happy to help test things out.  I use the "Other common LDAP naming":

MAP_OBJECT_CLASS="automountMap"
ENTRY_OBJECT_CLASS="automount"
MAP_ATTRIBUTE="ou"
ENTRY_ATTRIBUTE="cn"
VALUE_ATTRIBUTE="automountInformation"
Comment 14 Ian Kent 2012-04-12 20:16:17 EDT
(In reply to comment #13)
> Sure.  Always happy to help test things out.  I use the "Other common LDAP
> naming":
> 
> MAP_OBJECT_CLASS="automountMap"
> ENTRY_OBJECT_CLASS="automount"
> MAP_ATTRIBUTE="ou"
> ENTRY_ATTRIBUTE="cn"
> VALUE_ATTRIBUTE="automountInformation"

You can set schema attributes in the sss configuration so that
should be OK.
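
The corresponding sssd.conf options would be along these lines
(a sketch assuming the ldap_autofs_* option names from
sssd-ldap(5); the values shown just restate your schema above):

ldap_autofs_map_object_class = automountMap
ldap_autofs_map_name = ou
ldap_autofs_entry_object_class = automount
ldap_autofs_entry_key = cn
ldap_autofs_entry_value = automountInformation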

I have found a problem with the interface during testing.
It was present then went away and has come back so I'll
need to resolve that as part of the back port.

I'll report back later.
Comment 15 Jakub Hrozek 2012-04-13 09:54:53 EDT
(In reply to comment #14)
> (In reply to comment #13)
> > Sure.  Always happy to help test things out.  I use the "Other common LDAP
> > naming":
> > 
> > MAP_OBJECT_CLASS="automountMap"
> > ENTRY_OBJECT_CLASS="automount"
> > MAP_ATTRIBUTE="ou"
> > ENTRY_ATTRIBUTE="cn"
> > VALUE_ATTRIBUTE="automountInformation"
> 
> You can set schema attributes in the sss configuration so that
> should be OK.
> 

What LDAP schema do you use for user and group lookups? The attribute mapping as above is the default for the RFC2307 schema. The RFC2307bis schema uses slightly different mapping, but it's trivial to override the attribute names.
Comment 16 Orion Poplawski 2012-04-13 10:47:23 EDT
(In reply to comment #15)
> What LDAP schema do you use for user and group lookups? The attribute mapping
> as above is the default for the RFC2307 schema. The RFC2307bis schema uses
> slightly different mapping, but it's trivial to override the attribute names.

I don't know, I've never had to change any configuration from the default and have been using ldap and sssd/ldap for a while now.  Perhaps this is enough to tell:

dn: uid=orion,ou=People,dc=nwra,dc=com
objectClass: posixAccount
uidNumber: 1744
gidNumber: 1001

dn: cn=cora,ou=Groups,dc=nwra,dc=com
objectClass: posixGroup
gidNumber: 1001
Comment 17 Jakub Hrozek 2012-04-13 15:58:51 EDT
(In reply to comment #16)
> (In reply to comment #15)
> > What LDAP schema do you use for user and group lookups? The attribute mapping
> > as above is the default for the RFC2307 schema. The RFC2307bis schema uses
> > slightly different mapping, but it's trivial to override the attribute names.
> 
> I don't know, I've never had to change any configuration from the default and
> have been using ldap and sssd/ldap for a while now.  Perhaps this is enough to
> tell:
> 
> dn: uid=orion,ou=People,dc=nwra,dc=com
> objectClass: posixAccount
> uidNumber: 1744
> gidNumber: 1001
> 
> dn: cn=cora,ou=Groups,dc=nwra,dc=com
> objectClass: posixGroup
> gidNumber: 1001

The SSSD default is RFC2307, so unless your /etc/sssd/sssd.conf file contains the "ldap_schema" option set to "rfc2307bis", you are using RFC2307, which means the SSSD defaults for that schema should just work.

FTR, the difference between the two schemas is mostly in how the group members are handled. When using the rfc2307 schema, group members are listed by user name in the memberUid attribute. When using the rfc2307bis schema, group members are listed by DN and stored in the member (or sometimes uniqueMember) attribute.
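
For illustration, the same group under each schema might look like this (hypothetical entries modelled on the ones above):

# rfc2307
dn: cn=cora,ou=Groups,dc=nwra,dc=com
objectClass: posixGroup
gidNumber: 1001
memberUid: orion

# rfc2307bis
dn: cn=cora,ou=Groups,dc=nwra,dc=com
objectClass: posixGroup
objectClass: groupOfNames
gidNumber: 1001
member: uid=orion,ou=People,dc=nwra,dc=com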
Comment 18 Jakub Hrozek 2012-05-01 12:04:29 EDT
Hi Orion, have you had a chance to test out the sssd/autofs integration? I wrote a user-facing document that might be helpful to set things up:
http://jhrozek.livejournal.com/2500.html
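
In short, the setup amounts to enabling the autofs responder in sssd and pointing nsswitch at sss (a minimal sketch; see the post for details):

# /etc/sssd/sssd.conf
[sssd]
services = nss, pam, autofs

# /etc/nsswitch.conf
automount: files sss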
Comment 19 Orion Poplawski 2012-05-01 12:56:51 EDT
Looks like you need 5.0.6-16.fc17 for it to load libsss_autofs.so properly.  But it's not working for me:

# cat /etc/auto.master
/home   auto.home       fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,actimeo=1,noatime
/data   auto.data4      fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
/nfs    auto.nfs        fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
+auto.master

May  1 10:51:01 vmf17 sssd[autofs]: Starting up
May  1 10:51:03 vmf17 automount[557]: Starting automounter version 5.0.6-16.fc17, master map auto.master
May  1 10:51:03 vmf17 automount[557]: using kernel protocol version 5.02
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_master: reading master files auto.master
May  1 10:51:03 vmf17 automount[557]: parse_init: parse(sun): init gathered global options: (null)
May  1 10:51:03 vmf17 automount[557]: spawn_mount: mtab link detected, passing -n to mount
May  1 10:51:03 vmf17 automount[557]: spawn_umount: mtab link detected, passing -n to mount
May  1 10:51:03 vmf17 automount[557]: lookup_read_master: lookup(file): read entry /home
May  1 10:51:03 vmf17 automount[557]: lookup_read_master: lookup(file): read entry /data
May  1 10:51:03 vmf17 automount[557]: lookup_read_master: lookup(file): read entry /nfs
May  1 10:51:03 vmf17 automount[557]: lookup_read_master: lookup(file): read entry +auto.master
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_master: reading master files auto.master
May  1 10:51:03 vmf17 automount[557]: parse_init: parse(sun): init gathered global options: (null)
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_master: reading master sss auto.master
May  1 10:51:03 vmf17 automount[557]: parse_init: parse(sun): init gathered global options: (null)
May  1 10:51:03 vmf17 automount[557]: lookup_read_master: lookup(sss): getautomntent_r: No such file or directory
May  1 10:51:03 vmf17 automount[557]: lookup(file): failed to read included master map auto.master
May  1 10:51:03 vmf17 automount[557]: master_do_mount: mounting /home
May  1 10:51:03 vmf17 automount[557]: automount_path_to_fifo: fifo name /run/autofs.fifo-home
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_map: reading map files auto.home
May  1 10:51:03 vmf17 automount[557]: file map /etc/auto.home not found
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_map: reading map sss auto.home
May  1 10:51:03 vmf17 automount[557]: parse_init: parse(sun): init gathered global options: fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,actimeo=1,noatime
May  1 10:51:03 vmf17 automount[557]: spawn_mount: mtab link detected, passing -n to mount
May  1 10:51:03 vmf17 automount[557]: spawn_umount: mtab link detected, passing -n to mount
May  1 10:51:03 vmf17 automount[557]: lookup_read_map: lookup(sss): getautomntent_r: No such file or directory
May  1 10:51:03 vmf17 automount[557]: mounted indirect on /home with timeout 300, freq 75 seconds
May  1 10:51:03 vmf17 automount[557]: st_ready: st_ready(): state = 0 path /home
May  1 10:51:03 vmf17 automount[557]: ghosting enabled
May  1 10:51:03 vmf17 automount[557]: master_do_mount: mounting /data
May  1 10:51:03 vmf17 automount[557]: automount_path_to_fifo: fifo name /run/autofs.fifo-data
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_map: reading map files auto.data4
May  1 10:51:03 vmf17 automount[557]: file map /etc/auto.data4 not found
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_map: reading map sss auto.data4
May  1 10:51:03 vmf17 automount[557]: parse_init: parse(sun): init gathered global options: fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
May  1 10:51:03 vmf17 automount[557]: lookup_read_map: lookup(sss): getautomntent_r: No such file or directory
May  1 10:51:03 vmf17 automount[557]: mounted indirect on /data with timeout 300, freq 75 seconds
May  1 10:51:03 vmf17 automount[557]: st_ready: st_ready(): state = 0 path /data
May  1 10:51:03 vmf17 automount[557]: ghosting enabled
May  1 10:51:03 vmf17 automount[557]: master_do_mount: mounting /nfs
May  1 10:51:03 vmf17 automount[557]: automount_path_to_fifo: fifo name /run/autofs.fifo-nfs
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_map: reading map files auto.nfs
May  1 10:51:03 vmf17 automount[557]: file map /etc/auto.nfs not found
May  1 10:51:03 vmf17 automount[557]: lookup_nss_read_map: reading map sss auto.nfs
May  1 10:51:03 vmf17 automount[557]: parse_init: parse(sun): init gathered global options: fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
May  1 10:51:03 vmf17 automount[557]: lookup_read_map: lookup(sss): getautomntent_r: No such file or directory
May  1 10:51:03 vmf17 automount[557]: mounted indirect on /nfs with timeout 300, freq 75 seconds
May  1 10:51:03 vmf17 automount[557]: st_ready: st_ready(): state = 0 path /nfs
May  1 10:51:03 vmf17 automount[557]: ghosting enabled
May  1 10:51:15 vmf17 puppet-agent[653]: (/File[/etc/sysconfig/autofs]/content) content changed '{md5}0fbfcc9f1492db5105a52692301ffdcc' to '{md5}9c6d51d8a8ec59f9620005f914687cdd'
May  1 10:51:30 vmf17 automount[557]: handle_packet: type = 3
May  1 10:51:30 vmf17 automount[557]: handle_packet_missing_indirect: token 1, name orion, request pid 1118
May  1 10:51:30 vmf17 automount[557]: attempting to mount entry /home/orion
May  1 10:51:30 vmf17 automount[557]: lookup_name_file_source_instance: file map not found
May  1 10:51:30 vmf17 automount[557]: lookup_mount: lookup(sss): looking up orion
May  1 10:51:30 vmf17 automount[557]: key "orion" not found in map source(s).
May  1 10:51:30 vmf17 automount[557]: dev_ioctl_send_fail: token = 1
May  1 10:51:30 vmf17 automount[557]: handle_packet: type = 3
May  1 10:51:30 vmf17 automount[557]: handle_packet_missing_indirect: token 2, name orion, request pid 1118
May  1 10:51:30 vmf17 automount[557]: failed to mount /home/orion
May  1 10:51:30 vmf17 automount[557]: attempting to mount entry /home/orion
May  1 10:51:30 vmf17 automount[557]: lookup_name_file_source_instance: file map not found
May  1 10:51:30 vmf17 automount[557]: lookup_mount: lookup(sss): looking up orion
May  1 10:51:30 vmf17 automount[557]: dev_ioctl_send_fail: token = 2
May  1 10:51:30 vmf17 automount[557]: failed to mount /home/orion


Also, I'm worried that sss caching may not help, because it will lead to clients trying to access NFS shares while off the network.  We'll see, I guess.
Comment 20 Jakub Hrozek 2012-05-02 04:49:48 EDT
(In reply to comment #19)
> May  1 10:51:30 vmf17 automount[557]: lookup_mount: lookup(sss): looking up
> orion
> May  1 10:51:30 vmf17 automount[557]: key "orion" not found in map source(s).

This seems to be the culprit. Does your LDAP map use the "&" wildcard, or is there an entry for each user?
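
For reference, a wildcard map has a single catch-all entry whose key is "*" and whose value substitutes the matched key with "&"; in flat-map form an illustrative entry would be:

*    earth:/export/home/&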
Comment 21 Orion Poplawski 2012-05-02 12:08:24 EDT
(In reply to comment #20)
> (In reply to comment #19)
> > May  1 10:51:30 vmf17 automount[557]: lookup_mount: lookup(sss): looking up
> > orion
> > May  1 10:51:30 vmf17 automount[557]: key "orion" not found in map source(s).
> 
> This seems to be the culprit. Does your LDAP map use the "&" wildcard, or is
> there an entry for each user?

There is an entry for each user.  It doesn't seem to find any of the entries as /home is empty (and I'm using ghosting).  

dn: cn=orion,ou=auto.home,dc=cora,dc=nwra,dc=com
objectClass: automount
objectClass: top
cn: orion
automountInformation: earth:/export/home/orion
Comment 22 Jakub Hrozek 2012-05-02 12:18:49 EDT
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #19)
> > > May  1 10:51:30 vmf17 automount[557]: lookup_mount: lookup(sss): looking up
> > > orion
> > > May  1 10:51:30 vmf17 automount[557]: key "orion" not found in map source(s).
> > 
> > This seems to be the culprit. Does your LDAP map use the "&" wildcard, or is
> > there an entry for each user?
> 
> There is an entry for each user.  It doesn't seem to find any of the entries as
> /home is empty (and I'm using ghosting).  
> 
> dn: cn=orion,ou=auto.home,dc=cora,dc=nwra,dc=com
> objectClass: automount
> objectClass: top
> cn: orion
> automountInformation: earth:/export/home/orion

Can you attach your sssd.conf and the SSSD log files? I'd like to check whether the SSSD is actually downloading the automounter entries.
Comment 23 Orion Poplawski 2012-05-02 12:41:13 EDT
Created attachment 581663 [details]
sssd logs

# cat /etc/sssd/sssd.conf 
[domain/default]

ldap_id_use_start_tls = True
cache_credentials = True
ldap_search_base = dc=nwra,dc=com
krb5_realm = CORA.NWRA.COM
krb5_server = kerberos.cora.nwra.com
id_provider = ldap
auth_provider = krb5
chpass_provider = krb5
ldap_uri = ldap://ldap.cora.nwra.com/
krb5_kpasswd = kerberos.cora.nwra.com
ldap_tls_cacertdir = /etc/openldap/cacerts
[sssd]
services = nss, pam, autofs
config_file_version = 2
ldap_autofs_search_base="dc=cora,dc=nwra,dc=com"

domains = default
debug_level = 0xFF0

[nss]

[pam]

[sudo]

[autofs]
debug_level = 0xFF0

[ssh]
Comment 24 Jakub Hrozek 2012-05-02 13:03:10 EDT
The sssd_domain.log was empty (sorry, I forgot to explicitly ask for the domain logs), so would you mind attaching those as well?

These lines from the autofs logs tell me that the back end did not find any entries for the map auto.home:
----------
(Wed May  2 10:33:56 2012) [sssd[autofs]] [lookup_automntmap_step] (0x0400): Requesting info for [auto.home@default]
(Wed May  2 10:33:56 2012) [sssd[autofs]] [sysdb_autofs_entries_by_map] (0x0400): Getting entries for map auto.home
(Wed May  2 10:33:56 2012) [sssd[autofs]] [sysdb_autofs_entries_by_map] (0x0400): No entries for the map
(Wed May  2 10:33:56 2012) [sssd[autofs]] [lookup_automntmap_step] (0x0400): setautomntent done for map auto.home
----------

The ldap_autofs_search_base option belongs in the [domain/default] section, not [sssd].  The domain inherited the value of ldap_search_base instead, however, which should still work; it just searches a bigger subtree.
Comment 25 Orion Poplawski 2012-05-02 13:11:24 EDT
Created attachment 581671 [details]
sssd_default.log

Here you go
Comment 26 Jakub Hrozek 2012-05-08 06:56:59 EDT
(In reply to comment #25)
> Created attachment 581671 [details]
> sssd_default.log
> 
> Here you go

Sorry Orion, this fell off my radar. It seems that SSSD cannot parse the provided autofs search base; the "" quotes appear to be confusing the configuration parser. Can you try again without the quotes?
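
That is, with the option moved into the domain section (per comment 24) and unquoted:

[domain/default]
ldap_autofs_search_base = dc=cora,dc=nwra,dc=com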
Comment 27 Orion Poplawski 2012-06-04 12:55:44 EDT
Looks good without the quotes.  Now to see if this helps my original problem at all or not.
Comment 28 Orion Poplawski 2012-06-05 23:56:51 EDT
Created attachment 589703 [details]
autofs sss debug

Here is the autofs debug log using sss.  This is somewhat better, as it doesn't hang trying to access the LDAP server, though I still see some LDAP messages:

Jun  5 08:39:45 titmouse automount[13451]: lookup_nss_read_map: reading map ldap ldap://ldap.cora.nwra.com/ou=auto.data4,dc=cora,dc=nwra,dc=com
Jun  5 08:39:45 titmouse automount[13451]: parse_server_string: lookup(ldap): Attempting to parse LDAP information from string "ldap://ldap.cora.nwra.com/ou=auto.data4,dc=cora,dc=nwra,dc=com".
Jun  5 08:39:45 titmouse automount[13451]: parse_server_string: lookup(ldap): server "ldap://ldap.cora.nwra.com/", base dn "ou=auto.data4,dc=cora,dc=nwra,dc=com"
Jun  5 08:39:45 titmouse automount[13451]: parse_ldap_config: lookup(ldap): ldap authentication configured with the following options:
Jun  5 08:39:45 titmouse automount[13451]: parse_ldap_config: lookup(ldap): use_tls: 0, tls_required: 0, auth_required: 1, sasl_mech: (null)
Jun  5 08:39:45 titmouse automount[13451]: parse_ldap_config: lookup(ldap): user: (null), secret: unspecified, client principal: (null) credential cache: (null)
Jun  5 08:39:45 titmouse automount[13451]: parse_init: parse(sun): init gathered global options: fstype=nfs4,sec=krb5,intr,rsize=32768,wsize=32768,noatime
Jun  5 08:39:46 titmouse automount[13451]: do_bind: lookup(ldap): auth_required: 1, sasl_mech (null)
Jun  5 08:39:46 titmouse automount[13451]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: , error Can't contact LDAP server
Jun  5 08:39:46 titmouse automount[13451]: do_bind: lookup(ldap): ldap simple bind returned -1
Jun  5 08:39:46 titmouse automount[13451]: do_bind: lookup(ldap): auth_required: 1, sasl_mech (null)
Jun  5 08:39:46 titmouse automount[13451]: bind_ldap_simple: lookup(ldap): Unable to bind to the LDAP server: , error Can't contact LDAP server
Jun  5 08:39:46 titmouse automount[13451]: do_bind: lookup(ldap): ldap simple bind returned -1
Jun  5 08:39:46 titmouse automount[13451]: lookup(ldap): couldn't connect to server ldap://ldap.cora.nwra.com/


The problem now is that it tries to mount NFS entries that it can't reach when off the internal network.  At least it tries only once and the entry is then marked invalid for a while, but I don't think it's a workable solution for me.  I think I still just need to disable autofs when off our internal network.
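
One way to automate that would be a NetworkManager dispatcher script that stops autofs when the internal LDAP server is unreachable (a hypothetical sketch; the script path and the ping test are illustrative):

#!/bin/sh
# /etc/NetworkManager/dispatcher.d/50-autofs
# NetworkManager calls dispatcher scripts with the interface as $1
# and the action as $2.
ACTION="$2"
case "$ACTION" in
    up|dhcp4-change)
        # Only run autofs while the internal LDAP server is reachable.
        if ping -c 1 -W 2 ldap.cora.nwra.com >/dev/null 2>&1; then
            systemctl start autofs.service
        else
            systemctl stop autofs.service
        fi
        ;;
    down)
        systemctl stop autofs.service
        ;;
esac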
Comment 29 Fedora End Of Life 2013-01-16 08:54:39 EST
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 30 Fedora End Of Life 2013-02-13 09:50:24 EST
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
