Bug 181805

Summary: Evolution hangs during startup while accessing imap server
Product: [Fedora] Fedora
Reporter: Stephen Tweedie <sct>
Component: evolution
Assignee: Matthew Barnes <mbarnes>
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium
Version: rawhide
CC: dwmw2
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-12-16 23:11:23 UTC
Bug Blocks: 150221    
Attachments:
GDB backtrace of all running threads (flags: none)
pstack of all running threads (flags: none)
ps uHx -L output for evo process (flags: none)
strace output (flags: none)
pstack series (flags: none)
Workaround to disable imap LIST until it can be handled sanely. (flags: none)

Description Stephen Tweedie 2006-02-16 18:56:22 UTC
Description of problem:
I'm trying, repeatedly, to run evolution on rawhide, and failing.  For the past
couple of weeks, with and without SELinux on, on i386 and on x86_64, and using
either a new .evolution, or one inherited from either FC-4 or RHEL-4, the
symptoms are exactly the same: evo tries to download my imap data, gets about
100MB of network download done, and then goes into a 100% CPU spin performing no
network traffic and making no further progress.  The GUI remains responsive but
the imap server is never accessible and no further network IO is performed.

Version-Release number of selected component (if applicable):
evolution-2.5.90-2.1
evolution-data-server-1.5.90-2.2

How reproducible:
100%

Steps to Reproduce:
1. Run evolution.
2. Wait.  And wait.  And wait.
  
Actual results:
None.

Expected results:
My email.  :-)

Additional info:
In all cases, I have noticed that the status line has popped up
Pinging IMAP server $myserver (...)
when the CPU hang has occurred.  I believe that the actual timing of this popup
is related to the occurrence of the hang, although I cannot prove this.

pstack and gdb bt show both the initial mailbox load and the imap ping threads
trying to access the imap stream when the problem occurs.  I suspect this is not
an accident; we really should not be trying to ping a server when we *know* we
are very busy accessing it from another thread.  But even if this is the case,
the failure mode of the normal evo-data-server thread should be better than this.

The exact behaviour seen when the problem manifests will be attached below.

Comment 1 Stephen Tweedie 2006-02-16 18:58:11 UTC
Created attachment 124774
GDB backtrace of all running threads

Comment 2 Stephen Tweedie 2006-02-16 18:58:51 UTC
Created attachment 124775
pstack of all running threads

Comment 3 Stephen Tweedie 2006-02-16 19:02:47 UTC
Created attachment 124776
ps uHx -L output for evo process

ps output showing ~90 minutes of accumulated CPU time on the affected thread
and everything else idle.

Comment 4 Stephen Tweedie 2006-02-16 19:04:01 UTC
Created attachment 124777
strace output

30-second strace output showing the thread doing nothing but grow its own
memory over the interval being watched.

Comment 5 Stephen Tweedie 2006-02-16 19:10:17 UTC
Created attachment 124778
pstack series

Series of 5 pstack snapshots of the imap thread, showing that _something_ is
happening in there --- it's not being captured in the same place each time ---
but it is never getting out of camel_imap_store_summary_full_name().  A dozen
pstacks in succession show this near the top of the stack each time.

Comment 6 Stephen Tweedie 2006-02-16 20:42:59 UTC
OK, it's nothing to do with the server ping: disabling it by adding

--- evolution-data-server-1.5.91/camel/providers/imap/camel-imap-store.c~	2006-02-12 23:11:11.000000000 -0500
+++ evolution-data-server-1.5.91/camel/providers/imap/camel-imap-store.c	2006-02-16 15:06:58.000000000 -0500
@@ -1644,6 +1644,8 @@
        CamelImapResponse *response;
        CamelFolder *current_folder;

+       return;
+
        CAMEL_SERVICE_LOCK (imap_store, connect_lock);

        if (!camel_imap_store_connected(imap_store, ex))

to the e-d-s build results in the same symptoms but with no ping on the status
line or in the pstack output.

Comment 7 Stephen Tweedie 2006-02-16 22:51:31 UTC
Found it!  On fetching the server folder list, we call get_folders_sync(), which
does:

	/* We do a LIST followed by LSUB, and merge the results.  LSUB may not be a strict
	   subset of LIST for some servers, so we can't use either or separately */
	present = g_hash_table_new(folder_hash, folder_eq);
	for (j=0;j<2;j++) {
		response = camel_imap_command (imap_store, NULL, ex,
					       "%s \"\" %G", j==1 ? "LSUB" : "LIST",

This has two problems.  First, doing a LIST when we haven't even asked for a
full folder list is hideously expensive if you are running an imap server that
serves out of your homedir and you have a lot of files there (like, for example,
several exploded kernel trees!)

Second, the merging of these lists is O(N^2), as we call:

parse_list_response_as_folder_info
->camel_imap_store_summary_add_from_full
  ->camel_imap_store_summary_full_name

and this last routine, called for each folder added, compares the new name to
all previous ones by exhaustive search.  Whoops.
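
One way to avoid the exhaustive search described above would be to keep a hash
index of full names next to the summary, so that each added folder costs one
lookup on average instead of a scan of everything added so far.  What follows is
only a minimal, self-contained sketch of that idea and not the Camel code:
NameSet, hash_name() and name_set_add() are hypothetical names, and the
fixed-size table is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a name index kept alongside the store summary.
 * Open addressing with linear probing; TABLE_SIZE must be a power of two. */
#define TABLE_SIZE 1024

typedef struct {
    const char *slots[TABLE_SIZE]; /* borrowed pointers, not copies */
    size_t count;                  /* number of distinct names stored */
} NameSet;

/* djb2 string hash. */
static unsigned long hash_name(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Adds `name` if it is not already present.
 * Returns 1 if the name was newly added, 0 if it was a duplicate. */
static int name_set_add(NameSet *set, const char *name)
{
    size_t i = hash_name(name) & (TABLE_SIZE - 1);

    /* Sketch-only limit: a real index would grow instead of asserting. */
    assert(set->count < TABLE_SIZE - 1);

    while (set->slots[i] != NULL) {
        if (strcmp(set->slots[i], name) == 0)
            return 0;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    set->slots[i] = name;
    set->count++;
    return 1;
}
```

With an index like this, merging the LIST and LSUB results is one
name_set_add() call per returned name; a name seen in both responses is
detected as a duplicate without walking the whole summary.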

As a simple workaround, I changed the
	for (j=0;j<2;j++) {
in get_folders_sync() to
	for (j=0;j<2;j++) {
to disable the LIST and simply use LSUB to populate the folder list; this works
perfectly for accessing my existing subscribed folders and allows me access to
my email again.  Without this change, I simply cannot open my account.

Comment 8 Stephen Tweedie 2006-02-16 22:54:26 UTC
Created attachment 124793
Workaround to disable imap LIST until it can be handled sanely.

Comment 9 Dave Malcolm 2006-02-16 23:05:09 UTC
Did you change it to
 (j=1;j<2;j++) {
rather than 
 (j=0;j<2;j++) {
?


Comment 10 Stephen Tweedie 2006-02-16 23:35:36 UTC
Yes. :)  The attached patch has it right.


Comment 11 Stephen Tweedie 2006-02-16 23:37:32 UTC
Filed upstream as

http://bugzilla.gnome.org/show_bug.cgi?id=331479

Comment 13 Michel Alexandre Salim 2006-02-26 02:12:28 UTC
Could an interim release be pushed that has this change applied? I am in the
same situation (IMAP served out of home directory) and I can't even get a folder
listing (though CPU usage did not skyrocket).

Comment 14 David Woodhouse 2006-02-28 16:26:21 UTC
Please don't push an interim release with this precise change -- it'll break
imap for anyone who views all folders, not only subscribed folders. At least
make it conditional on the use_lsub configuration.
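
A rough sketch of what "conditional on the use_lsub configuration" could look
like is below.  It assumes a boolean that mirrors that setting and keeps the
j==0 (LIST) / j==1 (LSUB) convention of the loop quoted in comment 7;
first_folder_pass(), folder_pass_command() and folder_commands() are
hypothetical helpers, not functions from evolution-data-server.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Index at which the j-loop should start: skip the LIST pass (j == 0)
 * only when the account is configured to show subscribed folders only. */
static int first_folder_pass(bool subscribed_only)
{
    return subscribed_only ? 1 : 0;
}

/* Same mapping as the quoted loop: pass 0 is LIST, pass 1 is LSUB. */
static const char *folder_pass_command(int j)
{
    assert(j == 0 || j == 1);
    return j == 1 ? "LSUB" : "LIST";
}

/* Fills `out` with the commands that would be issued, in order,
 * and returns how many there are (1 or 2). */
static int folder_commands(bool subscribed_only, const char *out[2])
{
    int n = 0;
    for (int j = first_folder_pass(subscribed_only); j < 2; j++)
        out[n++] = folder_pass_command(j);
    return n;
}
```

Under these assumptions, users who view all folders still get both LIST and
LSUB, while subscribed-only users get the behaviour of the attached workaround.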

Comment 15 David Woodhouse 2006-03-15 14:36:02 UTC
Note that there's no real need to be doing this separate LIST and LSUB on any
server which supports LISTEXT -- you can just use LIST (SUBSCRIBED) instead.
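
As a hedged illustration of that suggestion, the sketch below checks a
space-separated CAPABILITY string and picks a single command when the extension
is advertised.  Several things here are assumptions: has_capability() and
single_pass_list_command() are hypothetical helpers, the "*" pattern is a
placeholder for whatever pattern the store actually sends, the comparison is
case-sensitive for brevity, and both "LISTEXT" and the later "LIST-EXTENDED"
capability name are accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the space-separated capability list contains `name` as a
 * whole token (so "XLISTEXT" does not match "LISTEXT"). */
static bool has_capability(const char *caps, const char *name)
{
    assert(caps != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = caps;
    while ((p = strstr(p, name)) != NULL) {
        bool starts = (p == caps) || (p[-1] == ' ');
        bool ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return true;
        p += len;
    }
    return false;
}

/* Returns the single command to use when the server advertises the
 * extension, or NULL to fall back to the separate LIST and LSUB passes. */
static const char *single_pass_list_command(const char *caps)
{
    if (has_capability(caps, "LISTEXT") ||
        has_capability(caps, "LIST-EXTENDED"))
        return "LIST (SUBSCRIBED) \"\" \"*\"";
    return NULL;
}
```

A NULL return keeps the existing two-command path for servers that do not
advertise the extension.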

Comment 16 Matthew Barnes 2006-09-21 19:20:09 UTC
This bug just got reassigned to me.

Can someone give me an update on the status of this in Evolution 2.8?

Comment 17 Michel Alexandre Salim 2006-11-02 02:21:09 UTC
My mail account got moved to a dedicated server just before I started reusing
Evolution, unfortunately. 

Comment 18 Matthew Barnes 2006-12-16 23:11:23 UTC
Resolving this bug as UPSTREAM since the problem has been filed upstream (see
comment #11) and there's been no confirmation that the bug still exists in the
current Rawhide release.

Please refer to the upstream bug report to continue tracking this issue.