678168 – Problems reloading/restarting with ldap.

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	autofs
Sub Component:
Version:	15
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Ian Kent
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2011-02-16 23:35 UTC by Orion Poplawski
Modified:	2012-08-07 19:51 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2012-08-07 19:51:45 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
automount debug info (30.71 KB, text/plain) 2011-02-16 23:35 UTC, Orion Poplawski

Description Orion Poplawski 2011-02-16 23:35:40 UTC

Created attachment 479235 [details]
automount debug info

Description of problem:

A couple of issues.  automount is started before networking is up, so it doesn't get the additional auto.master entries from ldap (there is already a bug for this).  Then:

- systemctl reload autofs.service seems to do nothing at all
- kill -HUP pid gets it to connect to the ldap server, but it doesn't appear to activate any of the new maps (auto.home for /home, auto.data for /data).
- At this point I try to restart autofs, but that hangs (or perhaps just takes a long time).  Log shows automount[927]: do_notify_state: signal 15

Version-Release number of selected component (if applicable):
autofs-5.0.5-35.fc15.x86_64

Comment 1 Ian Kent 2011-02-17 01:16:48 UTC

Interesting, I'll have a look as soon as I can install f15, that
will be a couple of days unfortunately.

Comment 2 Ian Kent 2011-02-19 04:03:38 UTC

(In reply to comment #0)
> Created attachment 479235 [details]
> automount debug info
> 
> Description of problem:
> 
> Couple issues.  automount is started before networking is up so it doesn't get
> the additional auto.master entries from ldap (already a bug for this).  Then:

I can't seem to find a suitable bug for the startup problem, so
I think we'll work on that here as well.

I did add a patch for it but then reverted it.
After reading this bug I did some more work on it, but I'm not
going to add it or do anything else to the package until the
2.6.38-rc problems are resolved. Until that happens there
is no way to know if this is a user space problem or a kernel
problem.

At this point I believe the kernel problems are due to the
new rcu-walk patches and not the vfs-automount patches that
went into 2.6.38. In fact it's a bit of a nightmare trying to
work out what the problems are. I've been working on this
since 2.6.38-rc1.

Ian

Comment 3 Orion Poplawski 2011-02-21 21:58:00 UTC

The startup issue is the same as bug 448510, I think.

Comment 4 Ian Kent 2011-02-22 02:44:49 UTC

(In reply to comment #3)
> Startup issue is same as bug 448510 I think.

I'll have another look at the change I've done for autofs
to re-try reading the master map at startup and make a test
build if I still think it's OK.

Could the systemctl problem be systemd and not autofs?

Comment 5 Orion Poplawski 2011-03-15 17:52:50 UTC

Playing with it some more, it seems that systemctl reload and kill -HUP do the same thing, but that doesn't appear to be sufficient to get autofs to start the other LDAP-defined maps.  This is what automount does in response to reload:

Mar 15 11:49:46 vmrawhide automount[847]: re-reading master map auto.master
Mar 15 11:49:46 vmrawhide automount[847]: lookup_nss_read_master: reading master files auto.master
Mar 15 11:49:46 vmrawhide automount[847]: parse_init: parse(sun): init gathered global options: (null)
Mar 15 11:49:46 vmrawhide automount[847]: lookup_read_master: lookup(file): read entry /misc
Mar 15 11:49:46 vmrawhide automount[847]: lookup_read_master: lookup(file): read entry /net
Mar 15 11:49:46 vmrawhide automount[847]: lookup_read_master: lookup(file): read entry +auto.master
Mar 15 11:49:46 vmrawhide automount[847]: lookup_nss_read_master: reading master files auto.master
Mar 15 11:49:46 vmrawhide automount[847]: parse_init: parse(sun): init gathered global options: (null)
Mar 15 11:49:46 vmrawhide automount[847]: lookup_nss_read_master: reading master ldap auto.master
Mar 15 11:49:46 vmrawhide automount[847]: parse_server_string: lookup(ldap): Attempting to parse LDAP information from string "auto.master".
Mar 15 11:49:46 vmrawhide automount[847]: parse_server_string: lookup(ldap): mapname auto.master
Mar 15 11:49:46 vmrawhide automount[847]: parse_ldap_config: lookup(ldap): ldap authentication configured with the following options:
Mar 15 11:49:46 vmrawhide automount[847]: parse_ldap_config: lookup(ldap): use_tls: 0, tls_required: 0, auth_required: 1, sasl_mech: (null)
Mar 15 11:49:46 vmrawhide automount[847]: parse_ldap_config: lookup(ldap): user: (null), secret: unspecified, client principal: (null) credential cache: (null)
Mar 15 11:49:46 vmrawhide automount[847]: parse_init: parse(sun): init gathered global options: (null)
Mar 15 11:49:46 vmrawhide automount[847]: do_bind: lookup(ldap): auth_required: 1, sasl_mech (null)
Mar 15 11:49:46 vmrawhide automount[847]: do_bind: lookup(ldap): ldap simple bind returned 0
Mar 15 11:49:46 vmrawhide automount[847]: get_query_dn: lookup(ldap): query succeeded, no matches for (&(objectclass=nisMap)(nisMapName=auto.master))
Mar 15 11:49:46 vmrawhide automount[847]: get_query_dn: lookup(ldap): found query dn ou=auto.master,dc=cora,dc=nwra,dc=com
Mar 15 11:49:46 vmrawhide automount[847]: lookup_read_master: lookup(ldap): searching for "(objectclass=automount)" under "ou=auto.master,dc=cora,dc=nwra,dc=com"
Mar 15 11:49:46 vmrawhide automount[847]: lookup_read_master: lookup(ldap): examining entries

stuck there.

Comment 6 Ian Kent 2011-03-16 03:49:53 UTC

(In reply to comment #5)
> Playing with it some more it seems that systemctl reload and kill -HUP do the
> same thing, but that doesn't appear sufficient to get autofs to start the other
> LDAP defined maps.  This is what automount does in response to reload:

snip ...

> 
> stuck there.

Can you get a gdb back trace of automount please?

I need a "thr a a bt" so I can see what all the threads are
doing. Also, we need the corresponding autofs-debuginfo package
installed so the back trace includes line numbers, otherwise
I can't tell where each thread is at.
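
For reference, "thr a a bt" is gdb shorthand for "thread apply all bt". The following is a minimal sketch of collecting such a trace non-interactively, assuming gdb and the matching autofs-debuginfo package are installed. The helper names gdb_backtrace_command and collect_backtrace are hypothetical and are not part of autofs or gdb.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps a back trace of every thread.

    -p attaches to the running process, -batch makes gdb exit once the
    commands have run, and -ex runs "thread apply all bt" (the long form
    of "thr a a bt").
    """
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def collect_backtrace(pid):
    """Run gdb against pid and return whatever it printed on stdout.

    Attaching to a daemon such as automount usually requires root.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Example use (hypothetical): print(collect_backtrace(pid_of_automount))
```

Running gdb in batch mode detaches as soon as the trace has been printed, so the daemon is only paused briefly.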

btw, as for the interface not being up when the NetworkManager
init script completes, there's a setting to prevent that. In
/etc/sysconfig/network try adding NETWORKWAIT=yes and, if that
still isn't enough, add NETWORKDELAY=<seconds> to delay a
further <seconds> after NetworkManager thinks the network
is up.
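
As an illustration of the setting described above, here is a small sketch that adds or updates KEY=value lines in sysconfig-style text. The function name set_sysconfig_option, the starting file contents, and the 10-second delay are all assumptions made for the example; only NETWORKWAIT and NETWORKDELAY come from this comment.

```python
def set_sysconfig_option(text, key, value):
    """Return text with KEY=value set, replacing an existing KEY line if any."""
    new_line = f"{key}={value}"
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Only an active assignment of this key is replaced; commented-out
        # lines (starting with "#") do not match.
        if line.strip().startswith(key + "="):
            lines[i] = new_line
            break
    else:
        # The key was not present, so append it at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical starting contents of /etc/sysconfig/network.
original = "NETWORKING=yes\nHOSTNAME=vmrawhide\n"
updated = set_sysconfig_option(original, "NETWORKWAIT", "yes")
# The 10-second delay is an example value only; pick what the site needs.
updated = set_sysconfig_option(updated, "NETWORKDELAY", "10")
print(updated, end="")
```

The sketch only transforms text in memory; writing the result back to /etc/sysconfig/network is left to the administrator.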

Comment 7 Orion Poplawski 2011-03-16 15:18:30 UTC

Thread 5 (Thread 0x7fe1ffeb6700 (LWP 834)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
#1  0x00007fe1ffedae1c in alarm_handler (arg=<optimized out>) at alarm.c:206
#2  0x00007fe1ffa83d0b in start_thread (arg=0x7fe1ffeb6700) at pthread_create.c:301
#3  0x00007fe1fe9889ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 4 (Thread 0x7fe1ffea5700 (LWP 835)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007fe1ffed1eab in st_queue_handler (arg=<optimized out>) at state.c:1074
#2  0x00007fe1ffa83d0b in start_thread (arg=0x7fe1ffea5700) at pthread_create.c:301
#3  0x00007fe1fe9889ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7fe1fe485700 (LWP 857)):
#0  0x00007fe1fe980123 in __poll (fds=<optimized out>, nfds=<optimized out>, 
    timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007fe1ffec5f10 in get_pkt (pkt=0x7fe1fe484c20, ap=0x7fe200ca76c0) at automount.c:882
#2  handle_packet (ap=0x7fe200ca76c0) at automount.c:1019
#3  0x00007fe1ffec721a in handle_mounts (arg=<optimized out>) at automount.c:1551
#4  0x00007fe1ffa83d0b in start_thread (arg=0x7fe1fe485700) at pthread_create.c:301
#5  0x00007fe1fe9889ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7fe1fd195700 (LWP 860)):
#0  0x00007fe1fe980123 in __poll (fds=<optimized out>, nfds=<optimized out>, 
    timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007fe1ffec5f10 in get_pkt (pkt=0x7fe1fd194c20, ap=0x7fe200cada10) at automount.c:882
#2  handle_packet (ap=0x7fe200cada10) at automount.c:1019
#3  0x00007fe1ffec721a in handle_mounts (arg=<optimized out>) at automount.c:1551
#4  0x00007fe1ffa83d0b in start_thread (arg=0x7fe1fd195700) at pthread_create.c:301
#5  0x00007fe1fe9889ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fe1ffe89720 (LWP 833)):
#0  do_sigwait (set=<optimized out>, sig=0x7fff85af61e8)
    at ../sysdeps/unix/sysv/linux/sigwait.c:65
#1  0x00007fe1ffa8b2a9 in __sigwait (set=<optimized out>, sig=<optimized out>)
    at ../sysdeps/unix/sysv/linux/sigwait.c:100
#2  0x00007fe1ffec4719 in statemachine (arg=<optimized out>) at automount.c:1327
#3  main (argc=1300288646, argv=<optimized out>) at automount.c:2166

Let me know if you want any more details

Comment 8 Ian Kent 2011-03-16 16:44:45 UTC

(In reply to comment #7)

snip ...

> 
> Let me know if you want any more details

This doesn't look like a hung automount process.
In fact it looks normal.

Comment 9 Orion Poplawski 2011-03-16 17:01:46 UTC

Yeah, I guess it isn't hung at this point.  It didn't start the LDAP-defined maps though.

Attaching to the process during shutdown (which takes a while) shows:

#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007fe1ffedaec1 in alarm_handler (arg=<optimized out>) at alarm.c:186
#2  0x00007fe1ffa83d0b in start_thread (arg=0x7fe1ffeb6700) at pthread_create.c:301
#3  0x00007fe1fe9889ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7fe1ffea5700 (LWP 835)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007fe1ffed1eab in st_queue_handler (arg=<optimized out>) at state.c:1074
#2  0x00007fe1ffa83d0b in start_thread (arg=0x7fe1ffea5700) at pthread_create.c:301
#3  0x00007fe1fe9889ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fe1ffe89720 (LWP 833)):
#0  do_sigwait (set=<optimized out>, sig=0x7fff85af61e8)
    at ../sysdeps/unix/sysv/linux/sigwait.c:65
#1  0x00007fe1ffa8b2a9 in __sigwait (set=<optimized out>, sig=<optimized out>)
    at ../sysdeps/unix/sysv/linux/sigwait.c:100
#2  0x00007fe1ffec4719 in statemachine (arg=<optimized out>) at automount.c:1327
#3  main (argc=1300288646, argv=<optimized out>) at automount.c:2166

(gdb) c
Continuing.
[New Thread 0x7fe1fc474700 (LWP 1377)]
[Thread 0x7fe1fc474700 (LWP 1377) exited]
[New Thread 0x7fe1fc474700 (LWP 1383)]
[Thread 0x7fe1fc474700 (LWP 1383) exited]
[New Thread 0x7fe1fc474700 (LWP 1389)]
....

logs show:

Mar 16 10:58:17 vmrawhide automount[833]: shut down path /net
Mar 16 10:58:17 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:20 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:23 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:26 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:29 vmrawhide automount[833]: do_notify_state: signal 15
.....
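
To make the retry pattern in the log excerpt above easier to see, the following sketch measures the time between consecutive "do_notify_state: signal 15" entries. The function name signal_gaps is hypothetical, and the year is an assumption taken from the comment date, since syslog timestamps do not carry one.

```python
from datetime import datetime

# Log lines copied from the excerpt above.
LOG = """\
Mar 16 10:58:17 vmrawhide automount[833]: shut down path /net
Mar 16 10:58:17 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:20 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:23 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:26 vmrawhide automount[833]: do_notify_state: signal 15
Mar 16 10:58:29 vmrawhide automount[833]: do_notify_state: signal 15
"""


def signal_gaps(log_text, year=2011, marker="do_notify_state: signal 15"):
    """Return the gaps, in whole seconds, between consecutive marker lines."""
    stamps = []
    for line in log_text.splitlines():
        if marker not in line:
            continue
        # A classic syslog timestamp is the first 15 characters of the
        # line, e.g. "Mar 16 10:58:17". The year is assumed, not logged.
        stamp = line[:15]
        stamps.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(signal_gaps(LOG))
```

For the lines shown, the gaps are all 3 seconds, which suggests the signal is being re-sent on a fixed interval while shutdown is pending rather than the daemon being completely stuck.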

Comment 10 Fedora End Of Life 2012-08-07 19:51:48 UTC

This message is a notice that Fedora 15 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 15. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. At this time, all open bugs with a Fedora 'version'
of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advance warning of this
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we were unable to fix it before Fedora 15 reached end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to click on
"Clone This Bug" (top right of this page) and open it against that
version of Fedora.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
