467521 – segfault on autofs shutdown.

Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	autofs
Version:	9
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Ian Kent
QA Contact:	Fedora Extras Quality Assurance

Reported:	2008-10-17 21:47 UTC by Orion Poplawski
Modified:	2009-07-14 16:59 UTC
CC List:	1 user
Doc Type:	Bug Fix
Last Closed:	2009-07-14 16:59:28 UTC
Type:	---

Attachments
Debug log of shutdown period (17.21 KB, text/plain) 2008-10-17 21:47 UTC, Orion Poplawski
Patch to fix incorrect pthreads condition handling for expire requests (4.57 KB, patch) 2008-10-19 04:04 UTC, Ian Kent

Description Orion Poplawski 2008-10-17 21:47:13 UTC

Created attachment 320726
Debug log of shutdown period

Description of problem:

I did a "service NetworkManager stop", which ran my dispatch script to do "service autofs stop".

Debug log before the crash:

Oct 17 15:32:36 cynosure automount[19170]: shut down path /data4
Oct 17 15:32:36 cynosure automount[19170]: st_prepare_shutdown: state 1 path /nfs
Oct 17 15:32:36 cynosure automount[19170]: expire_proc: exp_proc = 3082124176 path /nfs
Oct 17 15:32:36 cynosure automount[19170]: expire_cleanup: got thid 3082124176 path /nfs stat 0
Oct 17 15:32:36 cynosure automount[19170]: expire_cleanup: sigchld: exp 3082124176 finished, switching from 5 to 7
Oct 17 15:32:36 cynosure automount[19170]: umount_multi: path /nfs incl 0
Oct 17 15:32:36 cynosure automount[19170]: rm_unwanted_fn: removing directory /nfs/web
Oct 17 15:32:36 cynosure automount[19170]: rm_unwanted_fn: removing directory /nfs/local
Oct 17 15:32:36 cynosure automount[19170]: rm_unwanted_fn: removing directory /nfs/intel
Oct 17 15:32:36 cynosure automount[19170]: umounted indirect mount /nfs
Oct 17 15:32:36 cynosure automount[19170]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-nfs
Oct 17 15:32:36 cynosure automount[19170]: shut down path /nfs
Oct 17 15:32:36 cynosure kernel: automount[19184]: segfault at 49a2d0 ip 0049a2d0 sp b7a563cc error 4

gdb:
Program terminated with signal 11, Segmentation fault.
[New process 19184]
[New process 30690]
[New process 19170]
[New process 19172]
[New process 19171]
#0  0x0049a2d0 in ?? ()
(gdb) thr a a bt

Thread 5 (process 19171):
#0  0x0012e416 in __kernel_vsyscall ()
#1  0x00138ba5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread-2.8.so
#2  0xb7fdfd6b in alarm_handler (arg=0x0) at alarm.c:184
#3  0x0013532f in start_thread (arg=<value optimized out>) at pthread_create.c:297
#4  0x0023920e in clone () from /lib/libc-2.8.so

Thread 4 (process 19172):
#0  0x0012e416 in __kernel_vsyscall ()
#1  0x00138ed2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread-2.8.so
#2  0xb7fd9ae3 in st_queue_handler (arg=0x0) at state.c:965
#3  0x0013532f in start_thread (arg=<value optimized out>) at pthread_create.c:297
#4  0x0023920e in clone () from /lib/libc-2.8.so

Thread 3 (process 19170):
#0  0x0012e416 in __kernel_vsyscall ()
#1  0x0013ce40 in __sigwait (set=<value optimized out>, sig=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/sigwait.c:63
#2  0xb7fcad64 in main (argc=0, argv=0xbfbf6118) at automount.c:1366

Thread 2 (process 30690):
#0  0x0012e416 in __kernel_vsyscall ()
#1  0x0013b95c in __lll_unlock_wake () from /lib/libpthread-2.8.so
#2  0x00137f7b in _L_unlock_90 () from /lib/libpthread-2.8.so
#3  0x00137bbc in __pthread_mutex_unlock_usercnt (mutex=<value optimized out>,
    decr=<value optimized out>) at pthread_mutex_unlock.c:64
#4  0xb7fd9d2b in expire_cleanup (arg=0xb7b5734c) at state.c:206
#5  0xb7fce1e2 in expire_proc_indirect (arg=0xb81026b0) at /usr/include/pthread.h:583
#6  0x0013532f in start_thread (arg=<value optimized out>) at pthread_create.c:297
#7  0x0023920e in clone () from /lib/libc-2.8.so

Thread 1 (process 19184):
#0  0x0049a2d0 in ?? ()
#1  0x001353da in start_thread (arg=<value optimized out>) at pthread_create.c:154
#2  0x0023920e in clone () from /lib/libc-2.8.so


Version-Release number of selected component (if applicable):
autofs-5.0.3-17.1.i386

Comment 1 Ian Kent 2008-10-18 06:20:14 UTC

Is that autofs-5.0.3-17.1.i386 with or without the patches posted in bug 465494?

Comment 2 Orion Poplawski 2008-10-18 14:08:48 UTC

With the patches.

Comment 3 Ian Kent 2008-10-19 03:39:59 UTC

(In reply to comment #2)
> With the patches.

Before we go further with this I need to complete the patches
for 465494. I've completed the ldap uris locking patch but I
need to think a little more about the library unload patch.

Comment 4 Ian Kent 2008-10-19 04:04:09 UTC

Created attachment 320786
Patch to fix incorrect pthreads condition handling for expire requests

But in the meantime I'd be interested in knowing if this
patch makes any difference.

Comment 5 Orion Poplawski 2008-10-21 17:13:10 UTC

Okay, I've installed autofs with the updated patch and we'll see how that goes.  So far so good.

Comment 6 Ian Kent 2008-10-23 08:38:25 UTC

I've worked on the patches from bug 465494 and I'm satisfied with
what I now have, but one of them relies on changes that have been
committed to Rawhide and not yet to F-9. Unfortunately, F-9 has
fallen quite a bit behind. So I've added the Rawhide updates, along
with the patches that I hope fix both the issue here and the one
from bug 465494, built autofs-5.0.3-27, and requested that it be
pushed to testing. You can also get the packages from Koji at
https://koji.fedoraproject.org/koji/buildinfo?buildID=67199.

Could you give this a try, please?
I'll update F-8 with the changes if all goes well.

Ian

Comment 7 Orion Poplawski 2008-10-28 21:22:02 UTC

I've installed autofs-5.0.3-27 on my F-9 box and so far so good.   Haven't seen a crash yet.

Comment 8 Orion Poplawski 2008-10-30 17:34:59 UTC

Still no crashes.

Comment 9 Ian Kent 2008-11-11 06:28:41 UTC

There was an out-of-date patch included in autofs-5.0.3-27.
I've corrected this and requested that build autofs-5.0.3-29 be
pushed to testing. Please update autofs and check that the issue
here is still resolved.

The build can also be found at
https://koji.fedoraproject.org/koji/buildinfo?buildID=69155

Ian

Comment 10 Orion Poplawski 2008-11-18 17:17:10 UTC

I'm seeing similar with autofs-5.0.3-30.x86_64 in current rawhide.  Is this the same issue?

Nov 18 10:08:44 test kernel: automount[2193]: segfault at 1401980 ip 0000000001401980 sp00007f940cb0b118 error 14 in libcrypto.so.0.9.8g[17fb000+140000]
Nov 18 10:11:17 test kernel: automount[2424]: segfault at 1401980 ip 0000000001401980 sp00007f80ccfc3118 error 14 in libresolv-2.9.so[1761000+14000]
Nov 18 10:15:03 test kernel: automount[2221]: segfault at 1401980 ip 0000000001401980 sp00007f1a99f5e118 error 14 in libcrypto.so.0.9.8g[19fe000+140000]
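
For reading kernel lines like the three above, here is a minimal Python sketch (not part of the original report) that pulls out the fields and decodes the error value. It assumes the "segfault at ... ip ... sp ... error ..." layout seen in this bug, that the error value is printed in hexadecimal, and that it carries the standard x86 page-fault error-code bits. The function name decode_segfault and the returned keys are made up for illustration.

```python
import re

# x86 page-fault error-code bits: (meaning if set, meaning if clear).
PF_BITS = {
    0x01: ("protection violation", "page not present"),
    0x02: ("write access", "non-write access"),
    0x04: ("user mode", "kernel mode"),
    0x08: ("reserved bit set in page table", None),
    0x10: ("instruction fetch", None),
}

# "sp" is followed by optional whitespace because the x86_64 lines in this
# report print it with no space before the value.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp\s*(?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return the parsed fields of a kernel segfault line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumed hexadecimal
    flags = []
    for bit, (if_set, if_clear) in PF_BITS.items():
        text = if_set if err & bit else if_clear
        if text:
            flags.append(text)
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": addr,
        "ip": ip,
        "error": err,
        "flags": flags,
        # True when the faulting address is the instruction pointer itself.
        "fault_at_ip": addr == ip,
    }


if __name__ == "__main__":
    sample = (
        "Nov 18 10:08:44 test kernel: automount[2193]: segfault at 1401980 "
        "ip 0000000001401980 sp00007f940cb0b118 error 14 "
        "in libcrypto.so.0.9.8g[17fb000+140000]"
    )
    print(decode_segfault(sample))
```

A fault address equal to ip suggests the thread tried to execute from an address that was not mapped. That is only a hint about where to look and does not by itself identify the cause.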

Comment 11 Ian Kent 2008-11-19 03:37:08 UTC

(In reply to comment #10)
> I'm seeing similar with autofs-5.0.3-30.x86_64 in current rawhide.  Is this the
> same issue?
> 
> Nov 18 10:08:44 test kernel: automount[2193]: segfault at 1401980 ip
> 0000000001401980 sp00007f940cb0b118 error 14 in
> libcrypto.so.0.9.8g[17fb000+140000]
> Nov 18 10:11:17 test kernel: automount[2424]: segfault at 1401980 ip
> 0000000001401980 sp00007f80ccfc3118 error 14 in libresolv-2.9.so[1761000+14000]
> Nov 18 10:15:03 test kernel: automount[2221]: segfault at 1401980 ip
> 0000000001401980 sp00007f1a99f5e118 error 14 in
> libcrypto.so.0.9.8g[19fe000+140000]

You should be using 5.0.4-2 in Rawhide.
But I haven't added that latest libxml2 patch yet, pending
feedback from yourself.

If you're happy with the results of the libxml2 patch, I can update
autofs and point you at a build.

Ian

Comment 12 Orion Poplawski 2008-11-19 05:00:07 UTC

We're still building F-10 in rawhide, so I haven't seen 5.0.4-2.  So far I have been happy with the patched versions on F-8/F-9.  No crashes.  Looks like 5.0.3-34 is the one to test for F-10?

Comment 13 Ian Kent 2008-11-19 05:36:42 UTC

Yes, but that also doesn't have the libxml2 fix yet.
I should be able to go through and check (and update where needed) 
everything a bit later today. I'll let you know the revisions.

Ian

Comment 14 Ian Kent 2008-11-19 07:37:36 UTC

I've updated F-9 and F-10 and requested that they be pushed to
stable, along with the F-8 update.

The builds can be found at:
F-10 https://koji.fedoraproject.org/koji/buildinfo?buildID=70188
F-9 https://koji.fedoraproject.org/koji/buildinfo?buildID=70190

Hopefully these will be OK.
Ian

Comment 15 Orion Poplawski 2008-11-20 16:31:24 UTC

F-10 is looking good.

Comment 16 Bug Zapper 2009-06-10 03:00:24 UTC

This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 17 Bug Zapper 2009-07-14 16:59:28 UTC

Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.