Bug 437830 - evolution hangs doing: "formatting message"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: evolution
Version: 8
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Matthew Barnes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-03-17 16:28 UTC by James Antill
Modified: 2008-08-02 23:40 UTC

Fixed In Version: evolution-0:2.12.3-5.fc8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-07-08 16:39:55 UTC
Type: ---
Embargoed:


Description James Antill 2008-03-17 16:28:33 UTC
Description of problem:
 I'm running evolution-0:2.12.3-3.fc8.x86_64 and quite often now it seems to
hang with "formatting message" in the status bar. I've only had this problem
for about a week or so. From the log files:

/var/log/yum.log-20080129:Jan 25 11:21:42 Updated: evolution-2.12.3-1.fc8.x86_64
/var/log/yum.log-20080129:Jan 28 19:32:37 Updated: beagle-evolution-0.2.18-4.fc8.x86_64
/var/log/yum.log-20080312:Mar 07 21:36:54 Updated: evolution-2.12.3-3.fc8.x86_64

...I've also started running beagle "recently", as that doesn't eat all my CPU
now ... so maybe it's related to that?

 I had thought it was related to having evolution-zimbra installed, so I
uninstalled that over the weekend ... but it still does it.

 This morning I tried doing "taskset -p 1 <evolution-tids>" (as I have two
cores) and so far that seems to have helped, so maybe just a normal deadlock
somewhere.
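
For reference, the same pinning can be scripted. The following is a minimal sketch, assuming Linux with /proc mounted and Python 3; `pin_all_threads` is a hypothetical helper name, not part of Evolution or taskset. It walks the thread list of a process and restricts each thread to the given CPU set.

```python
import os


def pin_all_threads(pid, cpus):
    """Restrict every current thread of `pid` to the CPU set `cpus`.

    Returns the list of thread IDs that were pinned.
    """
    pinned = []
    # On Linux, each thread of a process appears under /proc/<pid>/task.
    for name in os.listdir(f"/proc/{pid}/task"):
        tid = int(name)
        try:
            # Despite the name of its first parameter, the underlying
            # system call acts on a single thread ID.
            os.sched_setaffinity(tid, cpus)
        except ProcessLookupError:
            continue  # the thread exited between listing and pinning
        pinned.append(tid)
    return pinned
```

Calling `pin_all_threads(pid, {0})` corresponds to running `taskset -p 1` on every thread ID, since mask 1 selects CPU 0. Threads started afterwards inherit the affinity of the thread that creates them.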

Comment 1 Milan Crha 2008-03-17 16:44:31 UTC
Does it hang on some particular message, say one with a special attachment like
v-calendar, or an HTML mail with images?

When it hangs, can you attach gdb to the running Evolution process and
paste here the output of the "thread apply all bt" command? It will show us
where it gets stuck. Thanks in advance.
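
The backtraces can also be collected non-interactively. This is a minimal sketch, assuming a gdb build that supports the `-p`, `-batch` and `-ex` options and a user allowed to attach to the process; both function names are hypothetical.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps all thread backtraces of `pid`."""
    # -batch makes gdb exit after running the -ex commands and turns
    # pagination off, so the output is not interrupted by pager prompts.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def collect_backtraces(pid):
    """Run gdb against `pid` and return its standard output as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid), capture_output=True, text=True
    )
    return result.stdout
```

The returned text can be pasted into a comment or saved as an attachment.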

Comment 2 James Antill 2008-03-17 17:06:40 UTC
 It's happened on a number of different messages, and doing "evolution
--force-shutdown" then running it again, and selecting the same message has
worked every time. I have HTML email rendering turned off, so it's not that.

 As I said, it hasn't happened since I did the taskset thing ... so hopefully
it's not going to happen again, until I reboot/re-taskset, which I'm not dying
to do right now. I'll try and get you that data when I am running it on multiple
CPUs again, assuming it hangs.


Comment 3 Milan Crha 2008-03-17 17:24:50 UTC
OK, thanks for the info. I forgot to ask: can you install the debuginfo
packages for gtkhtml, evolution and evolution-data-server (and maybe
evolution-exchange if you use it), so the traces will have symbols?

Comment 4 James Antill 2008-03-17 17:40:32 UTC
Yeh, I guessed that'd help ... so I already did a "debuginfo-install evolution"
so I should be covered.


Comment 5 James Antill 2008-03-22 11:07:56 UTC
(gdb) info threads
  5 Thread 1094719824 (LWP 29521)  0x0000003b0eacbd66 in __poll (fds=0x864bc0, 
    nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
  4 Thread 1147169104 (LWP 29543)  0x0000003b0eacbd66 in __poll (fds=0xecea00, 
    nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
  3 Thread 1147435344 (LWP 29544)  0x0000003b0eacbd66 in __poll (fds=0xed3930, 
    nfds=7, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
  2 Thread 1126189392 (LWP 11246)  0x0000003b0f60a8f9 in
pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 46912496455616 (LWP 29516)  0x0000003b0eacbd66 in __poll (
    fds=0xd14250, nfds=5, timeout=124) at ../sysdeps/unix/sysv/linux/poll.c:87

 I'm guessing this is the problem:

(gdb) thread 2
[Switching to thread 2 (Thread 1126189392 (LWP 11246))]#0  0x0000003b0f60a8f9 in
pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
Current language:  auto; currently asm
(gdb) bt
#0  0x0000003b0f60a8f9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x000000301fa1c8df in ?? () from /usr/lib64/libebook-1.2.so.9
#2  0x000000301fa1cadb in e_book_get_contacts ()
   from /usr/lib64/libebook-1.2.so.9
#3  0x00002aaab0dbcc51 in em_utils_contact_photo (cia=<value optimized out>, 
    local=<value optimized out>) at em-utils.c:2092
#4  0x00002aaab0daa994 in efh_format_message (emf=0x82d150, stream=0xe8dc70, 
    part=0x2aaabc7d2d98, info=<value optimized out>) at em-format-html.c:1930
#5  0x00002aaab0da9640 in efh_format_exec (m=0xef0470) at em-format-html.c:1254
#6  0x00002aaab0dcac7a in mail_msg_proxy (msg=0xef0470) at mail-mt.c:500
#7  0x00000030e3e52669 in g_thread_pool_thread_proxy (
    data=<value optimized out>) at gthreadpool.c:265
#8  0x00000030e3e50b24 in g_thread_create_proxy (data=0xecb370)
    at gthread.c:635
#9  0x0000003b0f606407 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003b0ead4b0d in clone () from /lib64/libc.so.6





But here's the full list anyway:























(gdb) thread apply all bt

Thread 5 (Thread 1094719824 (LWP 29521)):
#0  0x0000003b0eacbd66 in __poll (fds=0x864bc0, nfds=2, timeout=-1)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000030e3e3209e in g_main_context_iterate (context=0x85fe80, block=1, 
    dispatch=1, self=<value optimized out>) at gmain.c:2996
#2  0x00000030e3e3255a in IA__g_main_loop_run (loop=0x85b990) at gmain.c:2898
#3  0x00000030eaa068c3 in libnm_glib_dbus_worker (user_data=0x854750)
    at libnm_glib.c:427
#4  0x00000030e3e50b24 in g_thread_create_proxy (data=0x85bae0)
    at gthread.c:635
#5  0x0000003b0f606407 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003b0ead4b0d in clone () from /lib64/libc.so.6
Current language:  auto; currently c

Thread 4 (Thread 1147169104 (LWP 29543)):
#0  0x0000003b0eacbd66 in __poll (fds=0xecea00, nfds=1, timeout=-1)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000030e3e3209e in g_main_context_iterate (context=0xe9bca0, block=1, 
    dispatch=1, self=<value optimized out>) at gmain.c:2996
#2  0x00000030e3e3255a in IA__g_main_loop_run (loop=0xecdac0) at gmain.c:2898
#3  0x000000301fa181fd in ?? () from /usr/lib64/libebook-1.2.so.9
#4  0x00000030e3e50b24 in g_thread_create_proxy (data=0xe9bd80)
    at gthread.c:635
#5  0x0000003b0f606407 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003b0ead4b0d in clone () from /lib64/libc.so.6

Thread 3 (Thread 1147435344 (LWP 29544)):
#0  0x0000003b0eacbd66 in __poll (fds=0xed3930, nfds=7, timeout=-1)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000030e3e3209e in g_main_context_iterate (context=0xd153d0, block=1, 
    dispatch=1, self=<value optimized out>) at gmain.c:2996
#2  0x00000030e3e3255a in IA__g_main_loop_run (loop=0xd1ba60) at gmain.c:2898
#3  0x00000030e8a463b0 in link_io_thread_fn (data=<value optimized out>)
    at linc.c:396
#4  0x00000030e3e50b24 in g_thread_create_proxy (data=0x8bfb80)
    at gthread.c:635
#5  0x0000003b0f606407 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003b0ead4b0d in clone () from /lib64/libc.so.6

Thread 2 (Thread 1126189392 (LWP 11246)):
#0  0x0000003b0f60a8f9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x000000301fa1c8df in ?? () from /usr/lib64/libebook-1.2.so.9
#2  0x000000301fa1cadb in e_book_get_contacts ()
   from /usr/lib64/libebook-1.2.so.9
#3  0x00002aaab0dbcc51 in em_utils_contact_photo (cia=<value optimized out>, 
    local=<value optimized out>) at em-utils.c:2092
#4  0x00002aaab0daa994 in efh_format_message (emf=0x82d150, stream=0xe8dc70, 
    part=0x2aaabc7d2d98, info=<value optimized out>) at em-format-html.c:1930
#5  0x00002aaab0da9640 in efh_format_exec (m=0xef0470) at em-format-html.c:1254
#6  0x00002aaab0dcac7a in mail_msg_proxy (msg=0xef0470) at mail-mt.c:500
#7  0x00000030e3e52669 in g_thread_pool_thread_proxy (
    data=<value optimized out>) at gthreadpool.c:265
#8  0x00000030e3e50b24 in g_thread_create_proxy (data=0xecb370)
    at gthread.c:635
#9  0x0000003b0f606407 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003b0ead4b0d in clone () from /lib64/libc.so.6
Current language:  auto; currently asm

Thread 1 (Thread 46912496455616 (LWP 29516)):
#0  0x0000003b0eacbd66 in __poll (fds=0xd14250, nfds=5, timeout=124)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000030e3e3209e in g_main_context_iterate (context=0x65f520, block=1, 
    dispatch=1, self=<value optimized out>) at gmain.c:2996
#2  0x00000030e3e3255a in IA__g_main_loop_run (loop=0x6a0e00) at gmain.c:2898
#3  0x00000030ebe2ce16 in bonobo_main () at bonobo-main.c:311
#4  0x0000000000415cfb in main (argc=<value optimized out>, 
    argv=0x7fffeaaf4338) at main.c:602
#5  0x0000003b0ea1e074 in __libc_start_main (main=0x4159b0 <main>, argc=1, 
    ubp_av=0x7fffeaaf4338, init=<value optimized out>, 
    fini=<value optimized out>, rtld_fini=<value optimized out>, 
    stack_end=0x7fffeaaf4328) at libc-start.c:220
#6  0x0000000000409dd9 in _start ()
Current language:  auto; currently c
#0  0x0000003b0f60a8f9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
Current language:  auto; currently asm


Comment 6 Matthew Barnes 2008-03-22 13:57:53 UTC
I think you're right, looks like it's stuck waiting for a response from
evolution-data-server.  Can you try to get a backtrace of the
evolution-data-server process when the hang occurs?  Hopefully that will reveal
the /real/ cause of the hang.
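
The stuck thread is parked in pthread_cond_wait, a wait with no deadline, so it never returns if the reply never arrives. As a general illustration of a bounded wait (this is not libebook code; `ReplyBox` is a hypothetical class built on Python's `threading.Condition`), a caller can give up after a timeout instead:

```python
import threading


class ReplyBox:
    """Holds one reply; the waiter gives up after a deadline."""

    def __init__(self):
        self._cond = threading.Condition()
        self._reply = None
        self._ready = False

    def put(self, reply):
        """Store the reply and wake any waiting thread."""
        with self._cond:
            self._reply = reply
            self._ready = True
            self._cond.notify_all()

    def get(self, timeout):
        """Return the reply, or raise TimeoutError after `timeout` seconds."""
        with self._cond:
            # wait_for re-checks the predicate after every wakeup and
            # returns False if the timeout expires first.
            if not self._cond.wait_for(lambda: self._ready, timeout=timeout):
                raise TimeoutError("no reply within %.2f s" % timeout)
            return self._reply
```

With a deadline, a missing reply surfaces as an error that the caller can report, rather than as a thread that waits forever.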

Comment 7 James Antill 2008-03-24 14:31:45 UTC
(gdb) bt
#0  0x0000003b0eacbd66 in __poll (fds=0x8a5ea4, nfds=1, timeout=10)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00002aaab0c40e84 in ldap_result ()
   from /usr/lib64/evolution-data-server-1.2/extensions/libebookbackendldap.so
#2  0x00002aaab0c3d6d5 in ?? ()
   from /usr/lib64/evolution-data-server-1.2/extensions/libebookbackendldap.so
#3  0x00000030e3e2f68b in g_timeout_dispatch (source=0x6b96f0, callback=0x1, 
    user_data=0xa) at gmain.c:3488
#4  0x00000030e3e2ef53 in IA__g_main_context_dispatch (context=0x61b120)
    at gmain.c:2061
#5  0x00000030e3e3224d in g_main_context_iterate (context=0x61b120, block=1, 
    dispatch=1, self=<value optimized out>) at gmain.c:2694
#6  0x00000030e3e3255a in IA__g_main_loop_run (loop=0x63efa0) at gmain.c:2898
#7  0x00000030ebe2ce16 in bonobo_main () at bonobo-main.c:311
#8  0x0000000000403e8e in ?? ()


Comment 8 James Antill 2008-03-24 14:34:03 UTC
 I've just turned off ldap lookups, by default, so this might go away for me now.


Comment 9 Matthew Barnes 2008-03-24 14:38:24 UTC
Thanks for the backtrace, but it's missing debugging symbols and doesn't show
all active threads.  Can I ask you to install evolution-data-server-debuginfo
and, if you happen to see this again, run a "thread apply all bt" command from GDB?

Comment 10 James Antill 2008-03-24 15:47:21 UTC
 Sure, for some reason I didn't think that was threaded ... oh well.

 Also it's worth noting that the ldap is over a vpn, and the vpn had gone down
when the above happened.

 As another point/bug, I killed the evolution-data-server with kill -9 and
evolution itself still didn't recover.


Comment 11 Matthew Barnes 2008-03-24 20:32:14 UTC
(In reply to comment #10)
>  Sure, for some reason I didn't think that was threaded ... oh well.

I wish it wasn't.


>  Also it's worth noting that the ldap is over a vpn, and the vpn had gone down
> when the above happened.
> 
>  As another point/bug, I killed the evolution-data-server with kill -9 and
> evolution itself still didn't recover.

On the E-D-S side, it could have just been waiting for a socket to time out. 
Adding custom timeouts to connect() calls is, unfortunately, a bit tricky.  I've
seen these hangs myself on those /rare/ *ahem* occasions when our VPN drops out.
 Most of the time Evolution will eventually recover.
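
The usual technique for a connect() timeout is a non-blocking connect followed by a bounded wait for writability. Below is a minimal sketch in Python, assuming IPv4 TCP; `connect_with_timeout` is a hypothetical helper, not an E-D-S or OpenLDAP function.

```python
import errno
import os
import select
import socket


def connect_with_timeout(host, port, timeout):
    """Return a connected TCP socket, or raise OSError / TimeoutError.

    Pass an IP address: name resolution is not covered by the timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)
        err = sock.connect_ex((host, port))
        if err not in (0, errno.EINPROGRESS):
            raise OSError(err, os.strerror(err))
        if err == errno.EINPROGRESS:
            # The socket becomes writable once the attempt finishes,
            # whether it succeeded or failed.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("connect timed out")
            # SO_ERROR reports the outcome of the finished attempt.
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err != 0:
                raise OSError(err, os.strerror(err))
        sock.setblocking(True)
        return sock
    except BaseException:
        sock.close()
        raise
```

The tricky parts are the ones visible here: the result has to be read back from SO_ERROR, and the socket has to be switched back to blocking mode if the rest of the code expects that.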

"kill -9" may have bypassed whatever mechanism Evolution uses to detect when
E-D-S dies.  In any other app this would just be a matter of listening for a
SIGCHLD signal, but Evolution and E-D-S talk over Bonobo, and who knows what
"kill -9" does to a CORBA server.
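
For a direct child process, "kill -9" does not hide the death from the parent: the kernel still reports it through the wait status that accompanies SIGCHLD. The sketch below shows this, assuming a POSIX system and Python 3; `kill_and_reap` is a hypothetical name. It only covers the parent/child case, and this report does not establish that Evolution is the parent of E-D-S.

```python
import signal
import subprocess
import sys


def kill_and_reap(timeout=5.0):
    """Start a sleeping child, SIGKILL it, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Same effect as running "kill -9 <pid>" from a shell.
    child.send_signal(signal.SIGKILL)
    # wait() reaps the child; a negative return code -N means the
    # child was terminated by signal N.
    return child.wait(timeout=timeout)
```

A process that is not the parent gets no such notification and has to notice the broken connection itself.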

Comment 12 Matthew Barnes 2008-07-08 15:17:04 UTC
Any updates on this James?  Still seeing the hangs?

Comment 13 James Antill 2008-07-08 16:39:55 UTC
 I think it's fixed; at least, it hasn't hung for ages ... but then I had
to have it rebuild all of ~/evolution, so that might have helped.


