Created attachment 320726 [details]
Debug log of shutdown period

Description of problem:

I did a "service NetworkManager stop", which ran my dispatch script to do "service autofs stop".

Debug log before the crash:

Oct 17 15:32:36 cynosure automount[19170]: shut down path /data4
Oct 17 15:32:36 cynosure automount[19170]: st_prepare_shutdown: state 1 path /nfs
Oct 17 15:32:36 cynosure automount[19170]: expire_proc: exp_proc = 3082124176 path /nfs
Oct 17 15:32:36 cynosure automount[19170]: expire_cleanup: got thid 3082124176 path /nfs stat 0
Oct 17 15:32:36 cynosure automount[19170]: expire_cleanup: sigchld: exp 3082124176 finished, switching from 5 to 7
Oct 17 15:32:36 cynosure automount[19170]: umount_multi: path /nfs incl 0
Oct 17 15:32:36 cynosure automount[19170]: rm_unwanted_fn: removing directory /nfs/web
Oct 17 15:32:36 cynosure automount[19170]: rm_unwanted_fn: removing directory /nfs/local
Oct 17 15:32:36 cynosure automount[19170]: rm_unwanted_fn: removing directory /nfs/intel
Oct 17 15:32:36 cynosure automount[19170]: umounted indirect mount /nfs
Oct 17 15:32:36 cynosure automount[19170]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-nfs
Oct 17 15:32:36 cynosure automount[19170]: shut down path /nfs
Oct 17 15:32:36 cynosure kernel: automount[19184]: segfault at 49a2d0 ip 0049a2d0 sp b7a563cc error 4

gdb:

Program terminated with signal 11, Segmentation fault.
[New process 19184]
[New process 30690]
[New process 19170]
[New process 19172]
[New process 19171]
#0 0x0049a2d0 in ?? ()
(gdb) thr a a bt

Thread 5 (process 19171):
#0 0x0012e416 in __kernel_vsyscall ()
#1 0x00138ba5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread-2.8.so
#2 0xb7fdfd6b in alarm_handler (arg=0x0) at alarm.c:184
#3 0x0013532f in start_thread (arg=<value optimized out>) at pthread_create.c:297
#4 0x0023920e in clone () from /lib/libc-2.8.so

Thread 4 (process 19172):
#0 0x0012e416 in __kernel_vsyscall ()
#1 0x00138ed2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread-2.8.so
#2 0xb7fd9ae3 in st_queue_handler (arg=0x0) at state.c:965
#3 0x0013532f in start_thread (arg=<value optimized out>) at pthread_create.c:297
#4 0x0023920e in clone () from /lib/libc-2.8.so

Thread 3 (process 19170):
#0 0x0012e416 in __kernel_vsyscall ()
#1 0x0013ce40 in __sigwait (set=<value optimized out>, sig=<value optimized out>) at ../sysdeps/unix/sysv/linux/sigwait.c:63
#2 0xb7fcad64 in main (argc=0, argv=0xbfbf6118) at automount.c:1366

Thread 2 (process 30690):
#0 0x0012e416 in __kernel_vsyscall ()
#1 0x0013b95c in __lll_unlock_wake () from /lib/libpthread-2.8.so
#2 0x00137f7b in _L_unlock_90 () from /lib/libpthread-2.8.so
#3 0x00137bbc in __pthread_mutex_unlock_usercnt (mutex=<value optimized out>, decr=<value optimized out>) at pthread_mutex_unlock.c:64
#4 0xb7fd9d2b in expire_cleanup (arg=0xb7b5734c) at state.c:206
#5 0xb7fce1e2 in expire_proc_indirect (arg=0xb81026b0) at /usr/include/pthread.h:583
#6 0x0013532f in start_thread (arg=<value optimized out>) at pthread_create.c:297
#7 0x0023920e in clone () from /lib/libc-2.8.so

Thread 1 (process 19184):
#0 0x0049a2d0 in ?? ()
#1 0x001353da in start_thread (arg=<value optimized out>) at pthread_create.c:154
#2 0x0023920e in clone () from /lib/libc-2.8.so

Version-Release number of selected component (if applicable):

autofs-5.0.3-17.1.i386
autofs-5.0.3-17.1.i386, with or without the patches posted in 465494?
With the patches.
(In reply to comment #2)
> With the patches.

Before we go further with this I need to complete the patches for 465494. I've completed the ldap uris locking patch but I need to think a little more about the library unload patch.
Created attachment 320786 [details]
Patch to fix incorrect pthreads condition handling for expire requests

But in the meantime I'd be interested in knowing if this patch makes any difference.
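For readers unfamiliar with the class of bug the attachment title refers to, here is a minimal sketch of conventional pthreads condition handling. It is not the attached patch and does not reproduce autofs code: `struct expire_req`, `expire_worker`, `expire_wait` and `run_expire_once` are hypothetical names made up for illustration. The sketch assumes the usual rule that the predicate is changed and tested only while holding the mutex, and that the waiter re-checks it in a loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an expire request; NOT the autofs structure. */
struct expire_req {
    pthread_mutex_t lock;
    pthread_cond_t  done_cond;
    bool            done;    /* predicate, only touched with lock held */
    int             status;  /* result handed from worker to waiter */
};

/* Worker thread: publish the result under the lock, then signal. */
static void *expire_worker(void *arg)
{
    struct expire_req *req = arg;

    assert(req != NULL);
    pthread_mutex_lock(&req->lock);
    req->status = 0;
    req->done = true;
    pthread_cond_signal(&req->done_cond);
    pthread_mutex_unlock(&req->lock);
    return NULL;
}

/* Waiter: loop on the predicate so spurious wakeups, or a signal sent
 * before the wait started, cannot cause a missed or early return. */
static int expire_wait(struct expire_req *req)
{
    int status;

    pthread_mutex_lock(&req->lock);
    while (!req->done)
        pthread_cond_wait(&req->done_cond, &req->lock);
    status = req->status;
    pthread_mutex_unlock(&req->lock);
    return status;
}

/* Start one worker, wait for its result, and tear everything down. */
static int run_expire_once(void)
{
    struct expire_req req = { .done = false, .status = -1 };
    pthread_t thid;
    int status;

    pthread_mutex_init(&req.lock, NULL);
    pthread_cond_init(&req.done_cond, NULL);

    if (pthread_create(&thid, NULL, expire_worker, &req) != 0)
        return -1;

    status = expire_wait(&req);

    /* Join before destroying: the worker may still be unlocking. */
    pthread_join(thid, NULL);
    pthread_cond_destroy(&req.done_cond);
    pthread_mutex_destroy(&req.lock);
    return status;
}
```

In this sketch `run_expire_once()` returns the status the worker stored. Joining the worker before destroying the mutex matters because the worker may still be inside `pthread_mutex_unlock()` when the waiter wakes up. Build with `cc -pthread`.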
Okay, I've installed the build with the updated patch and we'll see how that goes. So far so good.
I've worked on the patches from bug 465494 and I'm satisfied with what I now have, but one of the patches relies on changes that have been committed to Rawhide and not to F-9 yet. Unfortunately, F-9 has fallen quite a bit behind.

Anyway, I've added the Rawhide updates and the patches that I hope fix both the issue here and the one from 465494, built autofs-5.0.3-27 and requested it be pushed to testing. You could also get the packages from Koji at https://koji.fedoraproject.org/koji/buildinfo?buildID=67199.

Could you give this a try, please? I'll update F-8 with the changes if all goes well.

Ian
I've installed autofs-5.0.3-27 on my F-9 box and so far so good. Haven't seen a crash yet.
Still no crashes.
There was an out-of-date patch included in autofs-5.0.3-27. I've corrected this and requested that build autofs-5.0.3-29 be pushed to testing. Please update autofs and check that the issue here is still resolved.

The build can also be found at https://koji.fedoraproject.org/koji/buildinfo?buildID=69155

Ian
I'm seeing similar with autofs-5.0.3-30.x86_64 in current rawhide. Is this the same issue?

Nov 18 10:08:44 test kernel: automount[2193]: segfault at 1401980 ip 0000000001401980 sp00007f940cb0b118 error 14 in libcrypto.so.0.9.8g[17fb000+140000]
Nov 18 10:11:17 test kernel: automount[2424]: segfault at 1401980 ip 0000000001401980 sp00007f80ccfc3118 error 14 in libresolv-2.9.so[1761000+14000]
Nov 18 10:15:03 test kernel: automount[2221]: segfault at 1401980 ip 0000000001401980 sp00007f1a99f5e118 error 14 in libcrypto.so.0.9.8g[19fe000+140000]
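As an aid to reading the kernel segfault lines in this report: the `error` field is the x86 page-fault error code, printed in hexadecimal. The sketch below decodes its low five bits. The names `struct pf_error`, `decode_pf_error` and `print_pf_error` are hypothetical helpers written for this illustration, not part of any kernel or autofs interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Low bits of the x86 page-fault error code. */
struct pf_error {
    bool protection;   /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;        /* bit 1: 1 = write access, 0 = read access */
    bool user;         /* bit 2: 1 = fault happened in user mode */
    bool reserved;     /* bit 3: 1 = reserved bit set in a paging entry */
    bool instr_fetch;  /* bit 4: 1 = fault on an instruction fetch */
};

/* Decode the hexadecimal string that follows "error" in the log line. */
static struct pf_error decode_pf_error(const char *hex)
{
    unsigned long code;
    struct pf_error e;

    assert(hex != NULL);
    code = strtoul(hex, NULL, 16);

    e.protection  = (code & 0x01) != 0;
    e.write       = (code & 0x02) != 0;
    e.user        = (code & 0x04) != 0;
    e.reserved    = (code & 0x08) != 0;
    e.instr_fetch = (code & 0x10) != 0;
    return e;
}

/* Print a one-line summary of a decoded error code. */
static void print_pf_error(const char *hex)
{
    struct pf_error e = decode_pf_error(hex);

    printf("error %s: %s-mode %s, %s%s\n", hex,
           e.user ? "user" : "kernel",
           e.instr_fetch ? "instruction fetch" : (e.write ? "write" : "read"),
           e.protection ? "protection violation" : "page not present",
           e.reserved ? ", reserved bit set" : "");
}
```

Read this way, `error 4` in the original i386 report is a user-mode read of a page that is not present, and `error 14` in the x86_64 reports is a user-mode instruction fetch from a page that is not present. In both sets of log lines the faulting address equals `ip`, so the process jumped to an unmapped address. Whether the two share a root cause cannot be determined from the error code alone.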
(In reply to comment #10)
> I'm seeing similar with autofs-5.0.3-30.x86_64 in current rawhide. Is this the same issue?
>
> Nov 18 10:08:44 test kernel: automount[2193]: segfault at 1401980 ip 0000000001401980 sp00007f940cb0b118 error 14 in libcrypto.so.0.9.8g[17fb000+140000]
> Nov 18 10:11:17 test kernel: automount[2424]: segfault at 1401980 ip 0000000001401980 sp00007f80ccfc3118 error 14 in libresolv-2.9.so[1761000+14000]
> Nov 18 10:15:03 test kernel: automount[2221]: segfault at 1401980 ip 0000000001401980 sp00007f1a99f5e118 error 14 in libcrypto.so.0.9.8g[19fe000+140000]

You should be using 5.0.4-2 in Rawhide. But I haven't added that latest libxml2 patch yet, pending feedback from yourself. If you're happy with the results of the libxml2 patch, I can update autofs and point you at a build.

Ian
We're still building F-10 in rawhide, so I haven't seen 5.0.4-2. So far have been happy with the patched versions on 8/9. No crashes. Looks like 5.0.3-34 is the one to test for F-10?
Yes, but that also doesn't have the libxml2 fix yet. I should be able to go through and check (and update where needed) everything a bit later today. I'll let you know the revisions.

Ian
I've updated F-9 and F-10 and requested they be pushed to stable, including the F-8 update. The builds can be found at:

F-10 https://koji.fedoraproject.org/koji/buildinfo?buildID=70188
F-9 https://koji.fedoraproject.org/koji/buildinfo?buildID=70190

Hopefully these will be OK.

Ian
F-10 is looking good.
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.