Description of problem:
I'm trying, repeatedly, to run evolution on rawhide, and failing. For the past couple of weeks, with and without SELinux on, on i386 and on x86_64, and using either a new .evolution or one inherited from FC-4 or RHEL-4, the symptoms are exactly the same: evo tries to download my imap data, gets about 100MB of network download done, and then goes into a 100% CPU spin, performing no network traffic and making no further progress. The GUI remains responsive, but the imap server is never accessible and no further network IO is performed.

Version-Release number of selected component (if applicable):
evolution-2.5.90-2.1
evolution-data-server-1.5.90-2.2

How reproducible:
100%

Steps to Reproduce:
1. Run evolution.
2. Wait. And wait. And wait.

Actual results:
None.

Expected results:
My email. :-)

Additional info:
In all cases, I have noticed that the status line has popped up

    Pinging IMAP server $myserver (...)

when the CPU hang has occurred. I believe that the timing of this popup is related to the occurrence of the hang, although I cannot prove this. pstack and gdb bt show both the initial mailbox load and the imap ping threads trying to access the imap stream when the problem occurs. I suspect this is not an accident; we really should not be trying to ping a server when we *know* we are very busy accessing it from another thread. But even if this is the case, the failure mode of the normal evo-data-server thread should be better than this.

The exact behaviour seen when the problem manifests will be attached below.
Created attachment 124774: GDB backtrace of all running threads
Created attachment 124775: pstack of all running threads
Created attachment 124776: ps uHx -L output for evo process
ps output showing ~90 minutes of accumulated CPU time on the affected thread and everything else idle.
Created attachment 124777: strace output
30-second strace output showing the thread doing nothing but grow its own memory over the interval being watched.
Created attachment 124778: pstack series
Series of 5 pstack snapshots of the imap thread, showing that _something_ is happening in there --- it's not being captured in the same place each time --- but it is never getting out of camel_imap_store_summary_full_name(). A dozen pstacks in succession show this near the top of the stack each time.
OK, it's nothing to do with the server ping: disabling it by adding

--- evolution-data-server-1.5.91/camel/providers/imap/camel-imap-store.c~ 2006-02-12 23:11:11.000000000 -0500
+++ evolution-data-server-1.5.91/camel/providers/imap/camel-imap-store.c 2006-02-16 15:06:58.000000000 -0500
@@ -1644,6 +1644,8 @@
        CamelImapResponse *response;
        CamelFolder *current_folder;
 
+       return;
+
        CAMEL_SERVICE_LOCK (imap_store, connect_lock);
 
        if (!camel_imap_store_connected(imap_store, ex))

to the e-d-s build results in the same symptoms but with no ping on the status line or in the pstack output.
Found it!

On fetching the server folder list, we call get_folders_sync(), which does:

        /* We do a LIST followed by LSUB, and merge the results. LSUB may not be a strict
           subset of LIST for some servers, so we can't use either or separately */
        present = g_hash_table_new(folder_hash, folder_eq);
        for (j=0;j<2;j++) {
                response = camel_imap_command (imap_store, NULL, ex,
                                               "%s \"\" %G", j==1 ? "LSUB" : "LIST",

This has two problems. First, doing a LIST when we haven't even asked for a full folder list is hideously expensive if you are running an imap server that serves out of your homedir and you have a lot of files there (like, for example, several exploded kernel trees!)

Second, the merging of these lists is O(N^2), as we call:

        parse_list_response_as_folder_info
          ->camel_imap_store_summary_add_from_full
            ->camel_imap_store_summary_full_name

and this last routine, called for each folder added, compares the new name to all previous ones by exhaustive search. Whoops.

As a simple workaround, I changed the

        for (j=0;j<2;j++) {

in get_folders_sync() to

        for (j=0;j<2;j++) {

to disable the LIST and simply use LSUB to populate the folder list; this works perfectly for accessing my existing subscribed folders and allows me access to my email again. Without this change, I simply cannot open my account.
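To make the quadratic cost concrete, here is a minimal, self-contained sketch. It is not Camel code: `struct summary`, `summary_add`, and `count_comparisons` are hypothetical names, and a flat array merely stands in for the store summary. It only assumes what the comment describes, namely that each added folder name is checked against every name already present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_FOLDERS 4096

/* Hypothetical stand-in for the store summary: a flat array of names. */
struct summary {
    const char *names[MAX_FOLDERS];
    size_t count;
    unsigned long comparisons;   /* number of strcmp() calls made so far */
};

/* Exhaustive search: compares full_name against every stored name. */
static int summary_contains(struct summary *s, const char *full_name)
{
    for (size_t i = 0; i < s->count; i++) {
        s->comparisons++;
        if (strcmp(s->names[i], full_name) == 0)
            return 1;
    }
    return 0;
}

/* Add a name if it is not already present; costs O(count) per call. */
static void summary_add(struct summary *s, const char *full_name)
{
    assert(s->count < MAX_FOLDERS);
    if (!summary_contains(s, full_name))
        s->names[s->count++] = full_name;
}

/* Add n distinct names and report how many comparisons that took. */
static unsigned long count_comparisons(size_t n)
{
    static char storage[MAX_FOLDERS][16];
    struct summary s = { .count = 0, .comparisons = 0 };

    assert(n <= MAX_FOLDERS);
    for (size_t i = 0; i < n; i++) {
        snprintf(storage[i], sizeof storage[i], "folder%zu", i);
        summary_add(&s, storage[i]);
    }
    return s.comparisons;
}
```

Under these assumptions, adding n distinct names costs n(n-1)/2 comparisons, so `count_comparisons(1000)` returns 499500. Keying the lookup on the full name in a hash table would make each add roughly constant-time instead.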
Created attachment 124793: Workaround to disable imap LIST until it can be handled sanely.
Did you change it to (j=1;j<2;j++) { rather than (j=0;j<2;j++) { ?
Yes. :) The attached patch has it right.
Filed upstream as http://bugzilla.gnome.org/show_bug.cgi?id=331479
Could an interim release be pushed that has this change applied? I am in the same situation (IMAP served out of my home directory) and I can't even get a folder listing (though CPU usage did not skyrocket).
Please don't push an interim release with this precise change -- it'll break imap for anyone who views all folders, not only subscribed folders. At least make it conditional on the use_lsub configuration.
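A rough sketch of what "conditional" could look like, based on the loop quoted earlier in this report (j == 0 issues LIST, j == 1 issues LSUB). The `subscribed_only` flag is a hypothetical stand-in for however the use_lsub setting is actually read, and `first_pass` and `commands_to_issue` are illustrative names, not Camel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Passes of the loop in get_folders_sync(): j == 0 is LIST, j == 1 is LSUB. */
enum { PASS_LIST = 0, PASS_LSUB = 1, PASS_END = 2 };

/* Skip the LIST pass only when the user wants subscribed folders only. */
static int first_pass(int subscribed_only)
{
    return subscribed_only ? PASS_LSUB : PASS_LIST;
}

/* Fill out[] with the commands the loop would issue; returns how many. */
static size_t commands_to_issue(int subscribed_only, const char *out[2])
{
    size_t n = 0;

    assert(out != NULL);
    for (int j = first_pass(subscribed_only); j < PASS_END; j++)
        out[n++] = (j == PASS_LSUB) ? "LSUB" : "LIST";
    return n;
}
```

With `subscribed_only` set, only LSUB is issued; otherwise both LIST and LSUB are issued as before, so users who view all folders keep the current behaviour.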
Note that there's no real need to be doing this separate LIST and LSUB on any server which supports LISTEXT -- you can just use LIST (SUBSCRIBED) instead.
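For illustration only, a sketch of choosing the command set by capability. The `has_list_extended` flag and the `build_subscribed_listing` helper are hypothetical; the empty reference name and "*" pattern are placeholder arguments, not what Camel sends; and how the capability would be detected is not shown.

```c
#include <assert.h>
#include <stdio.h>

#define CMD_LEN 64

/*
 * Write the listing command(s) into cmds[] and return how many were written.
 * With the extension, one LIST (SUBSCRIBED) replaces the LIST + LSUB pair.
 */
static int build_subscribed_listing(int has_list_extended, char cmds[2][CMD_LEN])
{
    assert(cmds != NULL);
    if (has_list_extended) {
        snprintf(cmds[0], CMD_LEN, "LIST (SUBSCRIBED) \"\" \"*\"");
        return 1;
    }
    /* Fallback: the two-command approach described in this report. */
    snprintf(cmds[0], CMD_LEN, "LIST \"\" \"*\"");
    snprintf(cmds[1], CMD_LEN, "LSUB \"\" \"*\"");
    return 2;
}
```

The point of the single-command form is that the server does the subscription filtering, so the client has no second list to merge.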
This bug just got reassigned to me. Can someone give me an update on the status of this in Evolution 2.8?
My mail account got moved to a dedicated server just before I started reusing Evolution, unfortunately.
Resolving this bug as UPSTREAM since the problem has been filed upstream (see comment #11) and there's been no confirmation that the bug still exists in the current Rawhide release. Please refer to the upstream bug report to continue tracking this issue.