Bug 181805 - Evolution hangs during startup while accessing imap server
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: evolution
Version: rawhide
Hardware: All Linux
Priority: medium  Severity: medium
Assigned To: Matthew Barnes
Depends On:
Blocks: FC5Target
Reported: 2006-02-16 13:56 EST by Stephen Tweedie
Modified: 2007-11-30 17:11 EST

Doc Type: Bug Fix
Last Closed: 2006-12-16 18:11:23 EST


Attachments
GDB backtrace of all running threads (5.21 KB, text/plain)
2006-02-16 13:58 EST, Stephen Tweedie
pstack of all running threads (4.15 KB, text/plain)
2006-02-16 13:58 EST, Stephen Tweedie
ps uHx -L output for evo process (688 bytes, text/plain)
2006-02-16 14:02 EST, Stephen Tweedie
strace output (4.34 KB, text/plain)
2006-02-16 14:04 EST, Stephen Tweedie
pstack series (3.14 KB, text/plain)
2006-02-16 14:10 EST, Stephen Tweedie
Workaround to disable imap LIST until it can be handled sanely. (632 bytes, patch)
2006-02-16 17:54 EST, Stephen Tweedie


External Trackers
Tracker        ID      Priority  Status  Summary  Last Updated
GNOME Desktop  331479  None      None    None     Never
Description Stephen Tweedie 2006-02-16 13:56:22 EST
Description of problem:
I'm trying, repeatedly, to run evolution on rawhide; and failing.  For the past
couple of weeks, with and without SELinux on, on i386 and on x86_64, and using
either a new .evolution, or one inherited from either FC-4 or RHEL-4, the
symptoms are exactly the same: evo tries to download my imap data, gets about
100MB of network download done, and then goes into a 100% CPU spin performing no
network traffic and making no further progress.  The GUI remains responsive but
the imap server is never accessible and no further network IO is performed.

Version-Release number of selected component (if applicable):
evolution-2.5.90-2.1
evolution-data-server-1.5.90-2.2

How reproducible:
100%

Steps to Reproduce:
1. Run evolution.
2. Wait.  And wait.  And wait.
  
Actual results:
None.

Expected results:
My email.  :-)

Additional info:
In all cases, I have noticed that the status line has popped up
Pinging IMAP server $myserver (...)
when the CPU hang has occurred.  I believe that the actual timing of this popup
is related to the occurrence of the hang, although I cannot prove this.

pstack and gdb bt show both the initial mailbox load and the imap ping threads
trying to access the imap stream when the problem occurs.  I suspect this is not
an accident; we really should not be trying to ping a server when we *know* we
are very busy accessing it from another thread.  But even if this is the case,
the failure mode of the normal evo-data-server thread should be better than this.

The exact behaviour seen when the problem manifests will be attached below.
Comment 1 Stephen Tweedie 2006-02-16 13:58:11 EST
Created attachment 124774
GDB backtrace of all running threads
Comment 2 Stephen Tweedie 2006-02-16 13:58:51 EST
Created attachment 124775
pstack of all running threads
Comment 3 Stephen Tweedie 2006-02-16 14:02:47 EST
Created attachment 124776
ps uHx -L output for evo process

ps output showing ~90 minutes of accumulated CPU time on the affected thread
and everything else idle.
Comment 4 Stephen Tweedie 2006-02-16 14:04:01 EST
Created attachment 124777
strace output

30-second strace output showing the thread doing nothing but grow its own
memory over the interval being watched.
Comment 5 Stephen Tweedie 2006-02-16 14:10:17 EST
Created attachment 124778
pstack series

Series of 5 pstack snapshots of the imap thread, showing that _something_ is
happening in there --- it's not being captured in the same place each time ---
but it is never getting out of camel_imap_store_summary_full_name().  A dozen
pstacks in succession show this near the top of the stack each time.
Comment 6 Stephen Tweedie 2006-02-16 15:42:59 EST
OK, it's nothing to do with the server ping: disabling it by adding

--- evolution-data-server-1.5.91/camel/providers/imap/camel-imap-store.c~	2006-02-12 23:11:11.000000000 -0500
+++ evolution-data-server-1.5.91/camel/providers/imap/camel-imap-store.c	2006-02-16 15:06:58.000000000 -0500
@@ -1644,6 +1644,8 @@
        CamelImapResponse *response;
        CamelFolder *current_folder;

+       return;
+
        CAMEL_SERVICE_LOCK (imap_store, connect_lock);

        if (!camel_imap_store_connected(imap_store, ex))

to the e-d-s build results in the same symptoms but with no ping on the status
line or in the pstack output.
Comment 7 Stephen Tweedie 2006-02-16 17:51:31 EST
Found it!  On fetching the server folder list, we call get_folders_sync(), which
does:

	/* We do a LIST followed by LSUB, and merge the results.  LSUB may not be a strict
	   subset of LIST for some servers, so we can't use either or separately */
	present = g_hash_table_new(folder_hash, folder_eq);
	for (j=0;j<2;j++) {
		response = camel_imap_command (imap_store, NULL, ex,
					       "%s \"\" %G", j==1 ? "LSUB" : "LIST",

This has two problems.  First, doing a LIST when we haven't even asked for a
full folder list is hideously expensive if you are running an imap server that
serves out of your homedir and you have a lot of files there (like, for example,
several exploded kernel trees!)

Second, the merging of these lists is O(N^2), as we call:

parse_list_response_as_folder_info
->camel_imap_store_summary_add_from_full
  ->camel_imap_store_summary_full_name

and this last routine, called for each folder added, compares the new name to
all previous ones by exhaustive search.  Whoops.
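
To make the cost difference concrete, here is a minimal, self-contained sketch.
It is not the Camel code: merge_linear() and merge_hashed() are hypothetical
names, and the folder list is reduced to a plain array of strings.  The first
function checks each new name against every name already stored, which is the
pattern described above; the second keeps a small hash table so each lookup is
roughly constant time.  In the real code, GLib's GHashTable (already used for
"present" in the snippet above) would be the natural choice; the table is
hand-rolled here only so that the sketch compiles without GLib.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a style string hash; any reasonable string hash would do here. */
static unsigned long hash_name(const char *s)
{
    unsigned long h = 2166136261UL;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619UL;
    }
    return h;
}

/* Exhaustive search: every new name is compared against all names kept
 * so far, so n distinct names cost n*(n-1)/2 string compares. */
size_t merge_linear(const char **in, size_t n, const char **out,
                    size_t *compares)
{
    size_t unique = 0, i, j;

    *compares = 0;
    for (i = 0; i < n; i++) {
        for (j = 0; j < unique; j++) {
            (*compares)++;
            if (strcmp(out[j], in[i]) == 0)
                break;              /* already present */
        }
        if (j == unique)
            out[unique++] = in[i];
    }
    return unique;
}

/* Same result, but duplicates are detected through an open-addressing
 * hash table kept at most half full, so a lookup only compares against
 * names that collide with the new one.  Returns 0 if allocation fails. */
size_t merge_hashed(const char **in, size_t n, const char **out,
                    size_t *compares)
{
    size_t cap = 8, unique = 0, i;
    const char **table;

    *compares = 0;
    while (cap < 2 * n)
        cap *= 2;                   /* power of two, load factor <= 0.5 */
    table = calloc(cap, sizeof *table);
    if (table == NULL)
        return 0;

    for (i = 0; i < n; i++) {
        size_t slot = hash_name(in[i]) & (cap - 1);

        while (table[slot] != NULL) {
            (*compares)++;
            if (strcmp(table[slot], in[i]) == 0)
                break;              /* already present */
            slot = (slot + 1) & (cap - 1);
        }
        if (table[slot] == NULL) {
            table[slot] = in[i];
            out[unique++] = in[i];
        }
    }
    free(table);
    return unique;
}
```

Both functions write the de-duplicated names to out in first-seen order and
report how many string compares they needed, so the two strategies can be
compared directly on a large folder list.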

As a simple workaround, I changed the
	for (j=0;j<2;j++) {
in get_folders_sync() to
	for (j=1;j<2;j++) {
to disable the LIST and simply use LSUB to populate the folder list; this works
perfectly for accessing my existing subscribed folders and allows me access to
my email again.  Without this change, I simply cannot open my account.
Comment 8 Stephen Tweedie 2006-02-16 17:54:26 EST
Created attachment 124793
Workaround to disable imap LIST until it can be handled sanely.
Comment 9 Dave Malcolm 2006-02-16 18:05:09 EST
Did you change it to
 (j=1;j<2;j++) {
rather than 
 (j=0;j<2;j++) {
?
Comment 10 Stephen Tweedie 2006-02-16 18:35:36 EST
Yes. :)  The attached patch has it right.
Comment 11 Stephen Tweedie 2006-02-16 18:37:32 EST
Filed upstream as

http://bugzilla.gnome.org/show_bug.cgi?id=331479
Comment 13 Michel Alexandre Salim 2006-02-25 21:12:28 EST
Could an interim release be pushed that has this change applied? I am in the
same situation (IMAP served out of home directory) and I can't even get a folder
listing (though CPU usage did not skyrocket).
Comment 14 David Woodhouse 2006-02-28 11:26:21 EST
Please don't push an interim release with this precise change -- it'll break
imap for anyone who views all folders, not only subscribed folders. At least
make it conditional on the use_lsub configuration.
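
One possible shape for such a conditional is sketched below.  It is an
illustration only, not a proposed patch: folder_list_commands() is a
hypothetical helper, and use_lsub is modelled as a plain int on the assumption
that the setting means "show only subscribed folders".  The loop keeps the
convention of the loop in get_folders_sync(), where pass 0 issues LIST and
pass 1 issues LSUB.

```c
#include <stddef.h>

/* Pass 0 issues LIST and pass 1 issues LSUB, as in get_folders_sync(). */
static const char *const pass_command[2] = { "LIST", "LSUB" };

/* Hypothetical helper: fills out[] with the commands that would be sent
 * and returns how many there are.  use_lsub is an assumed boolean that
 * stands in for the account's use_lsub setting. */
size_t folder_list_commands(int use_lsub, const char *out[2])
{
    size_t n = 0;
    int j;

    /* Skip the expensive LIST pass only when the user has asked for
     * subscribed folders only; otherwise do LIST followed by LSUB. */
    for (j = use_lsub ? 1 : 0; j < 2; j++)
        out[n++] = pass_command[j];

    return n;
}
```

With use_lsub set this yields only LSUB, which matches the workaround; with it
unset both LIST and LSUB are issued, so users who view all folders are not
affected.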
Comment 15 David Woodhouse 2006-03-15 09:36:02 EST
Note that there's no real need to be doing this separate LIST and LSUB on any
server which supports LISTEXT -- you can just use LIST (SUBSCRIBED) instead.
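
A rough sketch of that choice follows.  The helper name
subscribed_folder_queries(), the has_listext flag, and the CMD_MAX buffer size
are all assumptions made for illustration; how the capability is detected and
which mailbox pattern is sent are left to the caller.

```c
#include <stdio.h>

#define CMD_MAX 64

/* Hypothetical helper: writes the command(s) needed to learn about
 * subscribed folders into cmds[] and returns how many were written.
 * has_listext is an assumed flag meaning the server advertised the
 * LISTEXT capability.  The pattern is copied verbatim, so it must not
 * contain quotes or backslashes. */
size_t subscribed_folder_queries(int has_listext, const char *pattern,
                                 char cmds[2][CMD_MAX])
{
    if (has_listext) {
        /* One round trip: the server filters to subscribed folders. */
        snprintf(cmds[0], CMD_MAX, "LIST (SUBSCRIBED) \"\" \"%s\"", pattern);
        return 1;
    }

    /* Fallback: separate LIST and LSUB, merged by the client. */
    snprintf(cmds[0], CMD_MAX, "LIST \"\" \"%s\"", pattern);
    snprintf(cmds[1], CMD_MAX, "LSUB \"\" \"%s\"", pattern);
    return 2;
}
```

On a server with the extension this produces a single command, so there are no
two lists for the client to merge; on other servers it falls back to the
existing LIST plus LSUB pair.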
Comment 16 Matthew Barnes 2006-09-21 15:20:09 EDT
This bug just got reassigned to me.

Can someone give me an update on the status of this in Evolution 2.8?
Comment 17 Michel Alexandre Salim 2006-11-01 21:21:09 EST
My mail account got moved to a dedicated server just before I started reusing
Evolution, unfortunately. 
Comment 18 Matthew Barnes 2006-12-16 18:11:23 EST
Resolving this bug as UPSTREAM since the problem has been filed upstream (see
comment #11) and there's been no confirmation that the bug still exists in the
current Rawhide release.

Please refer to the upstream bug report to continue tracking this issue.
