Bug 521961 - /usr/bin/pulseaudio hangs in loop, 100% user mode
Summary: /usr/bin/pulseaudio hangs in loop, 100% user mode
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: pulseaudio
Version: rawhide
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Lennart Poettering
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-09-08 21:12 UTC by Valdis Kletnieks
Modified: 2009-09-21 12:54 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-21 12:54:52 UTC


Attachments (Terms of Use)
"thread apply all bt full" output of hung pulseaudio (14.54 KB, text/plain)
2009-09-14 18:12 UTC, Valdis Kletnieks
no flags Details

Description Valdis Kletnieks 2009-09-08 21:12:29 UTC
Description of problem:
Pulseaudio hangs in an infinite loop, 100% user mode.  GDB says it's here:

(gdb) where
#0  0x00007f4671fdb5d3 in pa_mainloop_iterate (m=0x11a5920, 
    block=<value optimized out>, retval=0x7fffe9810bbc) at pulse/mainloop.c:953
#1  0x00007f4671fdb680 in pa_mainloop_run (m=0x11a5920, retval=0x7fffe9810bbc)
    at pulse/mainloop.c:968
#2  0x0000000000409aa2 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at daemon/main.c:924
(gdb) info locals
r = <value optimized out>
__func__ = "pa_mainloop_iterate"
__PRETTY_FUNCTION__ = "pa_mainloop_iterate"
(gdb) up 
#1  0x00007f4671fdb680 in pa_mainloop_run (m=0x11a5920, retval=0x7fffe9810bbc)
    at pulse/mainloop.c:968
968         while ((r = pa_mainloop_iterate(m, 1, retval)) >= 0);
(gdb) info locals
r = <value optimized out>
(gdb) up
#2  0x0000000000409aa2 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at daemon/main.c:924
924         if (pa_mainloop_run(mainloop, &retval) < 0)
(gdb) info locals
c = <value optimized out>
buf = <value optimized out>
conf = <value optimized out>
mainloop = <value optimized out>
s = <value optimized out>
r = <value optimized out>
retval = 0
d = 3
valid_pid_file = true
ltdl_init = <value optimized out>
passed_fd = <value optimized out>
e = <value optimized out>
daemon_pipe = {-1, -1}
autospawn_fd = -1
autospawn_locked = false
dbus = <value optimized out>
__func__ = "main"
__PRETTY_FUNCTION__ = "main"
(gdb) quit

Version-Release number of selected component (if applicable):
pulseaudio-0.9.16-13.test7.fc12.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Lennart Poettering 2009-09-09 20:58:58 UTC
Hmpf. Could you make that a full bt over all threads? ('thread apply all bt full')

Also, what did you do to trigger this?

Can you also please provide the output of 'pulseaudio -vvvv' when this happens?

Comment 2 Valdis Kletnieks 2009-09-09 21:34:05 UTC
I've only had it bite a few times, and don't have a reliable reproducer. Will get that info next time I hit it.

Comment 3 Valdis Kletnieks 2009-09-14 18:12:26 UTC
Created attachment 360982 [details]
"thread apply all bt full" output of hung pulseaudio

For what it's worth, it appears based on the PID that pulseaudio had *already* been restarted at least once during this login session (for comparison, the window manager PID is 2619, so this pulseaudio started some 24,000 processes later. Checking the logs:

Sep 14 09:54:50 turing-police pulseaudio[13261]: pid.c: Stale PID file, overwriting.
Sep 14 15:44:34 turing-police rtkit-daemon[2336]: Sucessfully made thread 2334 of process 2334 ((unreachable)/usr/bin/pulseaudio) owned by 'valdis' high priority at nice level -11.
Sep 14 16:08:39 turing-police rtkit-daemon[2336]: Sucessfully made thread 26811 of process 26811 ((unreachable)/usr/bin/pulseaudio) owned by 'valdis' high priority at nice level -11.
Sep 14 12:08:39 turing-police pulseaudio[26811]: pid.c: Stale PID file, overwriting.

The first one is from when I logged in - no indication what the restart at 12:08 was. It wedged and I did the gdb at 12:55.

Comment 4 Michal Schmidt 2009-09-16 20:34:18 UTC
Today while I was away from my laptop for a minute pulseaudio got into a similar state, eating 100% CPU time making no syscalls.

With gdb I found that it looped in pa_mainloop_iterate(). In pa_mainloop_poll() m->n_enabled_defer_events always equaled 2, so no poll call was ever made. dispatch_defer() never dispatched anything, because the head of the list (e = m->defer_events) had e->enabled==false and e->next was NULL.

This is with pulseaudio-0.9.17-1.fc12.x86_64.
I don't know how to reproduce this. The only thing that could have produced any sound in that time was the Jabber client "jabbim" which spawns aplay when a message is received.

Comment 5 Michal Schmidt 2009-09-16 20:37:41 UTC
I forgot to add that m->defer_events->callback points to defer_callback (pulsecore/pstream.c:216)

Comment 6 Michal Schmidt 2009-09-21 10:12:20 UTC
Infinite loop in userspace was one of the symptoms I was able to reproduce using the testcase with many concurrent aplay processes in bug 523855, which is now fixed in 0.9.18. So perhaps we should close this bug unless it can be reproduced.

Comment 7 Valdis Kletnieks 2009-09-21 12:50:28 UTC
It's only been a day since I installed 0.9.18, but I've not seen a recurrence.  Let's go ahead and close this one as "fixed in 0.9.18", and I'll re-open it if I see it happen again...


Note You need to log in before you can comment on or make changes to this bug.