Bug 498871

Summary: Evolution hangs when left alone for a while
Product: [Fedora] Fedora
Reporter: Hans Kristian <hk>
Component: evolution
Assignee: Matthew Barnes <mbarnes>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 11
CC: beland, bjrosen, eagleton, enpontus, gilboad, hk, mbarnes, mcrha, sbergman27, tmraz
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-02-23 08:22:06 UTC
Attachments:
Description [Flags]
strace of the evolution process after pressing send/receive button after it stopped working. [none]
gdb output of evolution process [none]
gdb output on evolution-data-server [none]
gdm information from evolution process [none]
gdb information from gnome-keyring-daemon process [none]
Output when I dynamically attach strace to evolution [none]
Output when I dynamically attach strace to gnome-keyring-daemon [none]

Description Hans Kristian 2009-05-04 07:54:45 UTC
Created attachment 342280 [details]
strace of the evolution process after pressing send/receive button after it stopped working.

Description of problem:
Every morning when I come into the office, evolution has stopped working. Often almost no new mail has been downloaded from the servers since I left work the day before (somewhat unusual, since the multiple lists I am on generate a lot of mail). If I press the "Send/Receive" button, it pops up the regular window with the send/receive progress bars. However, it never actually makes any progress.

Pressing the [X] button to close evolution or the send/receive window does nothing; the process has to be killed with kill -9.

The bottom status bar usually does not list any ongoing send/receive tasks until I manually start one, but sometimes it does. This makes me think it is not directly related to the actual mailserver-communication.


Version-Release number of selected component (if applicable):
evolution-help-2.26.1-2.fc11.noarch
evolution-data-server-2.26.1-1.fc11.x86_64
evolution-2.26.1-2.fc11.x86_64


How reproducible:
Always happens during a night. Sometimes happens during lunch or when I am away from the computer for a while.


Steps to Reproduce:
1. Leave evolution open overnight
  
Actual results:
No new mails received, and evolution does not work properly when used afterwards.

Expected results:
New mails should download, evolution should work.

Additional info:
This problem existed in FC10 as well. At first, I thought it would be solved shortly after the FC10 release. Then I thought rawhide would surely have it fixed, but evidently it has not been fixed. So it looks like I might be the only one (or one of a few) to have this problem, and that suggests to me that it might be something with my settings or install. But I cannot think of what it might be.

I don't run spamassassin or any other filters.
I have tried disabling most plugins that I don't need (such as calendar and contacts stuff).
Images are not auto shown.
One mail account, using pop3(PASSWORD)/smtp(LOGIN)

Only thing I can think of that might be "special" about me is that I run the RadeonHD driver instead of the default "ati" (Radeon) driver.
xorg-x11-drv-radeonhd-1.2.5-2.8.20090411git.fc11.x86_64

I also run a pretty default gnome without compiz effects. Along with firefox, pidgin, xchat, and a bunch of gterms.

I did not find any other bugreports with the same problems. Unlike most of the others, my evolution never actually crashes. It just hangs.

Any suggestions for things to try are welcome.

Comment 1 Milan Crha 2009-05-04 11:22:26 UTC
Thanks for the bug report. It seems it lost connection to some resource and tries to reconnect in some strange way. Could you install debug info packages for evolution, evolution-data-server, gtkhtml3 and when this happens again, obtain Process ID (PID) for running evolution and evolution-data-server processes (say with: ps -A | grep evo) and run this command for both of them:
   $ gdb --batch --ex "t a a bt" -pid=PID &>bt.log
and attach those two files here, please? It'll show what evolution is doing at the time of the hang, and we should be able to investigate the issue further. Thanks in advance.
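
For anyone who wants to script the collection step above, here is a minimal Python sketch. It assumes gdb and ps are on the PATH and that you are allowed to attach to the processes. The helper names and the bt-<pid>.log file naming are made up for illustration, and "thread apply all bt" is simply the long form of "t a a bt".

```python
import subprocess


def matching_pids(ps_output, needle):
    """Return PIDs whose command name contains `needle`.

    Expects "PID COMMAND" lines, as printed by `ps -A -o pid= -o comm=`.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            pids.append(int(parts[0]))
    return pids


def gdb_backtrace_command(pid):
    """Build the batch-mode gdb command that dumps all thread backtraces."""
    return ["gdb", "--batch", "--ex", "thread apply all bt", f"--pid={pid}"]


def collect_backtraces(needle="evo"):
    """Write one bt-<pid>.log per matching process (hypothetical naming)."""
    ps = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    for pid in matching_pids(ps.stdout, needle):
        with open(f"bt-{pid}.log", "w") as log:
            # Send both stdout and stderr to the log, like `&>bt.log`.
            subprocess.run(gdb_backtrace_command(pid),
                           stdout=log, stderr=subprocess.STDOUT)
```

Calling collect_backtraces() while the hang is in progress should leave one log file per evolution-related process, ready to attach.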

Comment 2 Hans Kristian 2009-05-05 10:35:06 UTC
Created attachment 342431 [details]
gdb output of evolution process

Comment 3 Hans Kristian 2009-05-05 10:35:56 UTC
Created attachment 342432 [details]
gdb output on evolution-data-server

Comment 4 Hans Kristian 2009-05-05 10:41:26 UTC
Evolution had once again stopped working during the night. It stopped at approx 00:30, about 7-8 hours after I stopped using the computer. The computer does not have any sleep stuff enabled, not even screensavers, though the monitors were powered off manually when I left.

Visually I can only see that in the bottom status bar it says "Fetching mail (...)", and it has likely done that since approx midnight when it stopped fetching mail.

For your convenience, I installed a couple more of the debuginfo packages relevant to the stacktraces.

Tell me if there is anything more you want to know. Since I am working remotely today, I do not have to kill evolution until tomorrow.

Comment 5 Milan Crha 2009-05-05 11:39:25 UTC
Seems like some issue with your gnome keyring:
Thread 1 (Thread 0x7f60712837f0 (LWP 31526)):
#0  0x0000003045c0debb in read () from /lib64/libpthread.so.0
#1  0x000000305a20bdca in gnome_keyring_socket_read_all () from /usr/lib64/libgnome-keyring.so.0
#2  0x000000305a20be57 in gnome_keyring_socket_read_buffer () from /usr/lib64/libgnome-keyring.so.0
#3  0x000000305a206a22 in ?? () from /usr/lib64/libgnome-keyring.so.0
#4  0x000000305a206e87 in gnome_keyring_find_items_sync () from /usr/lib64/libgnome-keyring.so.0
#5  0x00007f6073596d20 in ep_keyring_lookup_passwords (user=0x5d801a0 "hk", server=0x3482a30 "pop3.isphuset.no", protocol=0x30cc2a0 "pop", error=0x7fff7d1d76a8)
    at e-passwords.c:408
#6  0x00007f60735974e2 in ep_get_password_keyring (msg=<value optimized out>) at e-passwords.c:842
#7  ep_get_password (msg=<value optimized out>) at e-passwords.c:989
#8  0x00007f6073596622 in ep_idle_dispatch (data=<value optimized out>) at e-passwords.c:464
#9  0x0000003046c3818e in g_main_dispatch (context=<value optimized out>) at gmain.c:1814


Is your CPU usage at 100%? There is some keyring bug, but it was supposed to be fixed. I do not remember the details; it may also be related to the number of keyrings. Having both a 'default' and a 'login' keyring caused something similar, and deleting one of them (I think the 'default' one) fixed it.

But I do not remember exactly. Matt, do you?

Comment 6 Hans Kristian 2009-05-05 11:49:12 UTC
cpu usage is not at 100%, in fact the machine is pretty much idle except for audacity playing music :)

I know nothing about keyring, where should I look?

Comment 7 Matthew Barnes 2009-05-05 14:32:16 UTC
I've seen this stack trace time and again, though I thought the gnome-keyring folks had fixed whatever was causing this.  I need to dig up some bug #'s.

The fault lies with Evolution as much as the keyring, though.  We ask the gnome-keyring-daemon process for account passwords synchronously, -and- in the user interface thread (Thread 1).  So if the daemon stops responding or our connection to it gets dropped for whatever reason, not only do we get stuck waiting forever for a password, we also block the UI from redrawing itself or responding to user input, etc.  It's been on my "to do" list for some time to redesign how that works and make it more fault tolerant.
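
To illustrate the point about the synchronous lookup: the sketch below is not Evolution's or libgnome-keyring's code, only a minimal Python model under the assumption that the password lookup is a request/reply exchange over a local socket. The function name lookup_with_timeout and the request bytes are invented for the example. It shows how a bounded wait hands control back to the caller when the peer never answers, where an unbounded read would block indefinitely.

```python
import socket


def lookup_with_timeout(sock, request, timeout_s):
    """Send a request and wait at most `timeout_s` seconds for a reply.

    Returns the reply bytes, or None if the peer did not answer in time.
    """
    sock.settimeout(timeout_s)
    sock.sendall(request)
    try:
        return sock.recv(4096)
    except socket.timeout:
        # The peer is alive enough to hold the connection but never replied.
        return None


if __name__ == "__main__":
    # socketpair() gives two connected endpoints; the "daemon" end stays silent.
    client, daemon = socket.socketpair()
    reply = lookup_with_timeout(client, b"find-password", 0.2)
    print("no reply, giving up" if reply is None else reply)
    client.close()
    daemon.close()
```

A timeout alone would not fix the underlying daemon problem, and doing the wait outside the user interface thread is a separate design question, but it would let the caller report an error instead of freezing.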

Hans, do I understand you correctly that this issue only happens when Evolution is left unattended?  Never when you're actively using it?

I'm curious to see what the gnome-keyring-daemon process is doing when these hangs occur.  Perhaps it's crashing and being respawned automatically, leaving Evolution with a dead connection.

If you could, please try the following procedure and leave it running until the next hang occurs:

   1) Obtain the Process ID for gnome-keyring-daemon.

   2) Attach GDB to it: gdb --pid <process-id-for-gkd>

   3) At the (gdb) prompt, issue a "continue" command.

   4) Wait for Evolution to hang again.

   5) If the daemon has stopped with a SIGSEGV or SIGABRT message, obtain
      a stack trace with the "t a a bt" command.

      -or-

      If the daemon appears to still be running while Evolution is hung,
      press Ctrl-C to halt it and then obtain a stack trace the same way.

Comment 8 Hans Kristian 2009-05-05 16:31:54 UTC
Well, assuming this night's hang can be used as poor man's statistics, the probability of evolution hanging on any single mail check would be approx 1/180 (it hung after approx 15 hours, with 5-minute mail-checking intervals).
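
The 1/180 figure is just the reciprocal of the number of mail checks that ran before the hang. A small Python sketch of that arithmetic, using only the 15 hours and 5-minute interval quoted above (the function names are made up, and one observed hang is of course a very rough estimate):

```python
from fractions import Fraction


def checks_before_hang(hours, interval_minutes):
    """Number of scheduled mail checks in the given time span."""
    return Fraction(hours * 60, interval_minutes)


def per_check_hang_estimate(hours, interval_minutes):
    """One hang observed over that many checks."""
    return 1 / checks_before_hang(hours, interval_minutes)


print(checks_before_hang(15, 5))       # 180
print(per_check_hang_estimate(15, 5))  # 1/180
```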

And I have not seen it hang when doing a manual send/receive, but I don't really do that very often. I do remember it has previously stopped working during the day while I was using the computer, but leaving evolution on during the night is 100% reliable; it will hang.

I had a look at my process list, and it does not seem like gnome-keyring-daemon has crashed and/or been respawned. It still has PID 2466, while gdm-password has 2447 and gnome-session has 2481. Had it been respawned, it would not likely have been in that range.

Normally I just kill -9 the evolution process, and then restart evolution and it will work. So in other words gnome-keyring-daemon probably did not crash, but somehow it and/or evolution got into a deadlock or something.

So now I tried to kill gnome-keyring-daemon instead of killing evolution, and lo-and-behold; evolution sprang to life again giving the following prompt. "Enter password for default keyring to unlock. The application evolution wants access to the default keyring, but it is locked." I've never seen this prompt before.

My user password was accepted and evolution tells me fetching mail failed during sending password (likely the server didn't want to wait around for 18 hours). Pressing send/receive again works just fine. So Evolution seems to work nicely again without restarting it.

I also see that gnome-keyring-daemon is now running with different parameters, probably because it was respawned rather than started normally. That again suggests it had not crashed and been respawned before.

Before:
 2466 ?        Sl     0:10 /usr/bin/gnome-keyring-daemon --daemonize --login
After:
13849 ?        SLl    0:00 /usr/bin/gnome-keyring-daemon --start --foreground --components=keyring

I've set evolution to check for new emails every 1 min now, and gdb is attached to gnome-keyring-daemon. Hopefully it will hang sometime before I go to bed.

Comment 9 Hans Kristian 2009-05-05 17:05:47 UTC
Well, that was fast..  it hung already.

gdb does not really say much interesting:
[New Thread 0x7f96e78d4910 (LWP 14197)]
[New Thread 0x7f96e6ed3910 (LWP 14198)]
[Thread 0x7f96e78d4910 (LWP 14197) exited]
[Thread 0x7f96e6ed3910 (LWP 14198) exited]
[New Thread 0x7f96e78d4910 (LWP 14204)]
[Thread 0x7f96e78d4910 (LWP 14204) exited]
[New Thread 0x7f96e78d4910 (LWP 14205)]
[Thread 0x7f96e78d4910 (LWP 14205) exited]
[New Thread 0x7f96e78d4910 (LWP 14206)]
[New Thread 0x7f96e6ed3910 (LWP 14207)]
[Thread 0x7f96e78d4910 (LWP 14206) exited]
 (a lot more like that above, but I figured it would likely not be interesting)


ctrl-c and "t a a bt" gives the following:
[Thread 0x7f96e78d4910 (LWP 14206) exited]
^C
Program received signal SIGINT, Interrupt.
0x00000030454d7043 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
87        int result = INLINE_SYSCALL (poll, 3, CHECK_N (fds, nfds), nfds, timeout);
Current language:  auto; currently minimal
(gdb) t a a bt

Thread 75 (Thread 0x7f96e6ed3910 (LWP 14207)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x0000003045c09075 in _L_lock_949 () from /lib64/libpthread.so.0
#2  0x0000003045c08e98 in __pthread_mutex_lock (mutex=0x11bd4b0) at pthread_mutex_lock.c:61
#3  0x000000000046d787 in gkr_async_end_concurrent () at gkr-async.c:519
#4  0x000000000040e1fb in yield_and_read_all (fd=19, buf=0x7f96e850c230 "", len=<value optimized out>) at gkr-daemon-io.c:130
#5  0x000000000040e2ac in read_packet_with_size (client=0x11d4cf0) at gkr-daemon-io.c:192
#6  0x000000000040e4ac in client_worker_main (user_data=<value optimized out>) at gkr-daemon-io.c:274
#7  0x000000000046e3b8 in async_worker_thread (data=<value optimized out>) at gkr-async.c:269
#8  0x0000003046c616e4 in g_thread_create_proxy (data=0x11d4da0) at gthread.c:635
#9  0x0000003045c0687a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#10 0x00000030454e04cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f96ed5cf790 (LWP 13849)):
#0  0x00000030454d7043 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x000000000046de5a in async_poll_func (ufds=0x11cc870, nfsd=5, timeout=-1) at gkr-async.c:113
#2  0x0000003046c3b6e2 in g_main_context_poll (n_fds=<value optimized out>, fds=<value optimized out>, priority=<value optimized out>, timeout=<value optimized out>,
    context=<value optimized out>) at gmain.c:2761
#3  g_main_context_iterate (n_fds=<value optimized out>, fds=<value optimized out>, priority=<value optimized out>, timeout=<value optimized out>,
    context=<value optimized out>) at gmain.c:2443
#4  0x0000003046c3bd85 in IA__g_main_loop_run (loop=0x11bd3b0) at gmain.c:2656
#5  0x000000000040ce97 in main (argc=1, argv=0x7ffff55f5fc8) at gkr-daemon.c:765
Current language:  auto; currently asm
Current language:  auto; currently minimal
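
When a "t a a bt" dump has dozens of threads, it can help to reduce it to the innermost frame of each thread. The following is a rough Python sketch for that; the regular expressions assume the gdb output layout seen in this report ("Thread N (" headers followed by "#0" frame lines), and the function name top_frames is made up.

```python
import re

# "Thread 75 (Thread 0x7f96e6ed3910 (LWP 14207)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# "#0  0x00000030454d7043 in *__GI___poll (...)" or "#0  __lll_lock_wait () at ..."
FRAME0_RE = re.compile(r"^#0\s+(?:0x[0-9a-f]+ in )?(\S+) \(")


def top_frames(backtrace_text):
    """Map each thread number to the function name in its frame #0."""
    result = {}
    current = None
    for line in backtrace_text.splitlines():
        match = THREAD_RE.match(line)
        if match:
            current = int(match.group(1))
            continue
        match = FRAME0_RE.match(line)
        if match and current is not None:
            result[current] = match.group(1)
    return result
```

Fed the two threads shown above, this would report thread 75 in __lll_lock_wait and thread 1 in *__GI___poll, that is, a worker waiting on a mutex while the main loop sits in poll.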

Comment 10 Matthew Barnes 2009-05-05 17:12:59 UTC
Awesome.  That's valuable data there.  Looks like perhaps the keyring daemon went and got itself deadlocked.  I'll research this further.

What's your exact version of gnome-keyring, btw?  (rpm -q gnome-keyring)

Comment 11 Hans Kristian 2009-05-05 17:26:30 UTC
gnome-keyring-2.26.1-1.fc11.x86_64

Comment 12 Joshua Rosen 2009-05-17 10:43:26 UTC
gnome-keyring-2.22.3-1.fc9.i386
gnome-keyring-2.22.3-1.fc9.x86_64

Comment 13 Steve Bergman 2009-06-08 21:21:35 UTC
I am still seeing this behavior in evolution 2.26, complete with the gnome-keyring-daemon (also 2.26) association as reported by gdb. I'm running Ubuntu 9.04 but happened onto this bug in your bugzilla and figured I'd mention it.

Comment 14 Bug Zapper 2009-06-09 15:06:25 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Steve Bergman 2009-06-10 18:12:48 UTC
I probably should have left more information about my 2.26.1 experience. Keeping in mind that this is Ubuntu and not Fedora... when this happens to my users, gnome-keyring-daemon appears idle. Exact versions are gnome-keyring 2.26.1-0ubuntu1 and Evolution 2.26.1-0ubuntu2.

gdb backtraces and strace output to follow.

Comment 16 Steve Bergman 2009-06-10 18:15:08 UTC
Created attachment 347269 [details]
gdm information from evolution process

Comment 17 Steve Bergman 2009-06-10 18:16:05 UTC
Created attachment 347272 [details]
gdb information from gnome-keyring-daemon process

Comment 18 Steve Bergman 2009-06-10 18:17:07 UTC
Created attachment 347273 [details]
Output when I dynamically attach strace to evolution

Comment 19 Steve Bergman 2009-06-10 18:17:59 UTC
Created attachment 347274 [details]
Output when I dynamically attach strace to gnome-keyring-daemon

Comment 20 Hans Kristian 2009-07-08 07:03:41 UTC
evolution-2.26.3-1.fc11.x86_64
gnome-keyring-2.26.3-1.fc11.x86_64

from updates-testing still hang.

Comment 21 Steve Bergman 2009-07-08 16:30:42 UTC
Since nobody here, on Ubuntu launchpad, or bugs.gnome.org seems to care... it seems like maybe we should consider just dropping gnome-keyring entirely.

Remove gnome-keyring, and evolution functions normally. You don't even lose the ability to remember passwords. The only other notable user of gnome-keyring is network manager. Epiphany does not use it.

Comment 22 Milan Crha 2009-07-10 14:05:20 UTC
*** Bug 510671 has been marked as a duplicate of this bug. ***

Comment 23 Gilboa Davara 2009-11-26 03:41:33 UTC
I wonder if this and #418731 shouldn't be merged.

Adding my voice to the me too crowd.
Fully updated F11, x86_64, KDE 4.3.3.

$ rpm -qa | egrep 'evolution|keyring' | sort
evolution-2.26.3-1.fc11.x86_64
evolution-conduits-2.26.3-1.fc11.x86_64
evolution-data-server-2.26.3-2.fc11.x86_64
evolution-data-server-devel-2.26.3-2.fc11.x86_64
evolution-data-server-doc-2.26.3-2.fc11.noarch
evolution-help-2.26.3-1.fc11.noarch
evolution-perl-2.26.3-1.fc11.x86_64
gnome-keyring-2.26.3-1.fc11.x86_64
gnome-keyring-devel-2.26.3-1.fc11.x86_64
gnome-keyring-pam-2.26.3-1.fc11.x86_64
gnome-python2-gnomekeyring-2.26.0-3.fc11.x86_64

- Gilboa

Comment 24 Gilboa Davara 2009-11-26 03:50:59 UTC
Sorry, I meant https://bugzilla.redhat.com/show_bug.cgi?id=354041

I've seen a large number of reports in GNOME BZ about keyring killing evolution. Apparently, we are not alone.
Can I somehow disable g-k-d until this issue is resolved? This problem has been lingering since F9...

- Gilboa

Comment 25 Christopher Beland 2010-02-23 08:22:06 UTC

*** This bug has been marked as a duplicate of bug 354041 ***