Bug 667770
| Summary: | rpc.gssd locks up and hangs nfs mount when idle for long time (ticket expires?) | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Orion Poplawski <orion> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED CANTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 14 | CC: | bcodding, ender, jlayton, matt, steved, warren |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-03-13 19:47:13 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Orion Poplawski
2011-01-06 18:13:52 UTC
Backtrace of the hung process:
#0 0x00ae8416 in __kernel_vsyscall ()
#1 0x004be5d1 in __lll_lock_wait_private () from /lib/libc.so.6
#2 0x0044985c in _L_lock_12621 () from /lib/libc.so.6
#3 0x00447797 in malloc () from /lib/libc.so.6
#4 0x0043a398 in open_memstream () from /lib/libc.so.6
#5 0x004a9ae5 in __vsyslog_chk () from /lib/libc.so.6
#6 0x0017d15f in vsyslog (kind=512,
fmt=0x1806cc "dir_notify_handler: sig %d si %p data %p\n", args=0xbfc03938 "%")
at /usr/include/bits/syslog.h:48
#7 xlog_backend (kind=512, fmt=0x1806cc "dir_notify_handler: sig %d si %p data %p\n",
args=0xbfc03938 "%") at xlog.c:150
#8 0x001777d4 in printerr (priority=2,
format=0x1806cc "dir_notify_handler: sig %d si %p data %p\n") at err_util.c:64
#9 0x00177c9e in dir_notify_handler (sig=37, si=0xbfc0396c, data=0xbfc039ec)
at gssd_main_loop.c:66
#10 <signal handler called>
#11 0x00444984 in _int_malloc () from /lib/libc.so.6
#12 0x004477a0 in malloc () from /lib/libc.so.6
#13 0x0046db77 in __alloc_dir () from /lib/libc.so.6
#14 0x0046dc5a in opendir () from /lib/libc.so.6
#15 0x0046e7ef in scandir64@@GLIBC_2.2 () from /lib/libc.so.6
#16 0x00179285 in process_pipedir () at gssd_proc.c:565
#17 update_client_list () at gssd_proc.c:594
#18 0x00177f40 in gssd_run () at gssd_main_loop.c:216
#19 0x00177bf9 in main (argc=2, argv=0xbfc04134) at gssd.c:187
It looks like malloc is being called from a signal handler that interrupted another malloc call, which is verboten: malloc is not async-signal-safe, so the handler blocks forever on the allocator lock that the interrupted call already holds. Not sure what the best way around this is, but it looks like dir_notify_handler cannot call printerr. I suppose this only occurs when -vv or greater is given.

What's even more hilarious is that when nfs hangs, your entire gnome session freezes.

I thought the solution was to drop the printerr call: http://article.gmane.org/gmane.linux.nfs/45443

Why was this closed as CANTFIX? We just ran into this one in RHEL6.
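For anyone hitting the same hang, here is a minimal sketch of the usual way to avoid it. This is not the actual nfs-utils patch; the names `demo_dir_notify_handler`, `install_dir_notify_handler`, and `consume_dir_changed` are made up for illustration. The handler only sets a `volatile sig_atomic_t` flag, and any logging happens later in normal (non-signal) context, where calling into malloc is safe.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Written by the handler, read by the main loop. sig_atomic_t is the
 * only object type the C standard lets a handler store to safely. */
static volatile sig_atomic_t dir_changed = 0;

/* Hypothetical replacement handler: no printerr(), no syslog(), no
 * malloc(). It records the event and returns immediately. */
static void demo_dir_notify_handler(int sig, siginfo_t *si, void *data)
{
    (void)sig;
    (void)si;
    (void)data;
    dir_changed = 1;
}

/* Install the handler for the given signal number.
 * Returns 0 on success, -1 on failure (as sigaction does). */
int install_dir_notify_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = demo_dir_notify_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from the main loop. Returns 1 if a notification arrived since
 * the last call, 0 otherwise. Logging is done here, outside the
 * handler. A signal landing between the check and the clear is merged
 * into the same rescan, which is harmless for this purpose. */
int consume_dir_changed(void)
{
    if (!dir_changed)
        return 0;
    dir_changed = 0;
    fprintf(stderr, "directory change notification received\n");
    return 1;
}
```

The main loop would call `consume_dir_changed()` on each iteration and rescan the pipe directory when it returns 1. The debug message loses the `si` and `data` pointer values the original `printerr` call printed, which seems like a fair trade for not deadlocking.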