Bug 1023758

Summary:

evolution-3.8.5-2.fc19.x86_64 Uses 100% CPU utilisation when interacting with mh folder

Product:

[Fedora] Fedora

Reporter:

J <buggy>

Component:

evolution

Assignee:

Matthew Barnes <mbarnes>

Status:

CLOSED UPSTREAM

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

CC:

lucilanga, mbarnes, mcrha

Target Milestone:

---

Keywords:

Reopened

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2013-11-18 11:42:23 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
BT 1	none
BT 2	none

Description J 2013-10-27 16:28:36 UTC

Description of problem:
Evolution consistently runs to 100% CPU when dealing with claws-mail mh folders.

Version-Release number of selected component (if applicable):
evolution-3.8.5-2.fc19.x86_64

How reproducible:
Very

Steps to Reproduce:
1. have mh folder from claws-mail (~ 512 MB in size)
2. setup mh folder using evolution add new account
3. startup evolution move a message between another account and mh folder

Actual results:
Evolution hangs runs CPU up.

Expected results:
Mail message moved to mh folder

Additional info:
This was not present in Fedora 17 and appears to have only recently become problematic. The only way I can now work with evolution is to disable the mh folder.

For a while I thought I had resolved it by using claws-mail to clean up the folder then restarting evolution but the issue has returned.

I have to evolution --force-shutdown to resolve.

Currently downgrading evolution to see if that resolved the issue.

Comment 1 J 2013-10-27 16:38:46 UTC

evolution-3.8.3-2.fc19.x86_64 appears to not share the issue with handling mh folder and on adding an existing mh folder (after purging the *.ibex.* and *.cmeta files works fine.

Emails can readily be moved between accounts again.

Attempts to remove the above files and add the existing mh folder in evolution-3.8.5-2.fc19.x86_64 result in significant import times and CPU usage followed by instability in the evolution.

Comment 2 J 2013-11-08 08:53:43 UTC

Reattempted installation of 3.8.5-2.fc19.x86_64 had one complete rebuild of the mh folder shortly after. So far working ok. Will continue to monitor.

Is there any logging information I can provide to help?

Comment 3 Milan Crha 2013-11-08 09:29:52 UTC

Thanks for the bug report and update. All the mail background code resides in evolution-data-server, thus I would check rather its versions, than those in evolution. Nonetheless, I do not recall any recent changes in the mh code around 3.8.5 release of both evolution and evolution-data-server.

With respect of logging, if I recall correctly there is no specific logging available for mh mail provider, and the overall logging is too chatty, thus I'd say try without logging. High CPU usage is better to chase by backtraces, to see what evolution does in the time of the high CPU usage - at least as a starting point.

As this works for you now, I'm closing this, but please feel free to reopen or update the bug when you find anything new.

Comment 4 J 2013-11-08 10:34:51 UTC

Sorry to do this. But the actual moving of the bugzilla email from my imap to my mh account triggered this:

top - 10:22:12 up  1:10,  3 users,  load average: 1.11, 1.01, 0.65
Tasks: 203 total,   2 running, 201 sleeping,   0 stopped,   0 zombie
%Cpu0  : 10.6 us,  4.6 sy,  0.0 ni, 83.8 id,  0.0 wa,  0.3 hi,  0.7 si,  0.0 st
%Cpu1  :  6.3 us,  2.7 sy,  0.0 ni, 88.3 id,  2.3 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8176712 total,  4869332 used,  3307380 free,   529668 buffers
KiB Swap: 16779856 total,        0 used, 16779856 free,  2797556 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22757 removed      20   0 4183880 107084  57400 S 105.7  1.3   4:26.98 evolution
21064 root      20   0  161256  35772  21528 S  10.3  0.4   1:48.13 Xorg
21413 removed      20   0 2040236 302032  72216 S   7.0  3.7   0:57.83 gnome-shell
22921 removed      20   0  676828  25036  14688 S   0.7  0.3   0:00.48 gnome-terminal-
23030 removed      20   0  123660   1680   1148 R   0.3  0.0   0:00.04 top

This is less than 24 hours after updated the evolution rpms. Something must have altered.

Packages altered on downgrade are:
evolution.x86_64 0:3.8.5-2.fc19
evolution-bogofilter.x86_64 0:3.8.5-2.fc19
evolution-perl.x86_64 0:3.8.5-2.fc19

Perhaps it is related to one of the other packages such as bog and perl? I can safely remove perl I believe.

Plan of action:
1) Downgrade
2) Confirm stability of downgrade
3) Update/upgrade
4) Check stability
5) Remove bogofilter
6) Check stability

Comment 5 J 2013-11-08 10:55:25 UTC

Removing and readding the mh folder in Evolution evolution-3.8.3-2.fc19.x86_64

Produces this level of activity:
top - 10:48:27 up  1:36,  4 users,  load average: 2.20, 1.40, 1.17
Tasks: 205 total,   2 running, 203 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.7 us,  1.2 sy,  0.0 ni, 46.0 id, 45.9 wa,  0.1 hi,  0.2 si,  0.0 st
KiB Mem:   8176712 total,  5718336 used,  2458376 free,   208992 buffers
KiB Swap: 16779856 total,        0 used, 16779856 free,  3455680 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24847 REMOVD    20   0 4211232 130472  55800 S  14.3  1.6   0:50.45 evolution
21064 root      20   0  182228  44284  26088 S   9.6  0.5   5:41.42 Xorg
21413 REMOVD    20   0 2206532 383168  76548 S   5.0  4.7   2:02.92 gnome-shell

Which is significantly less than when evolution is playing up. After the folder was indexed I was then able to move the bugzilla email from imap to mh with no incident.

Example of evolution cmdline output:
evolution
No bp log location saved, using default.
[000:000] Cpu: 6.23.10, x4, 3000Mhz, 7985MB
[000:000] Computer model: Not available
[000:000] Browser XEmbed support present: 1
[000:000] Browser toolkit is Gtk2.
[000:006] Using Gtk2 toolkit
No bp log location saved, using default.
[000:000] Cpu: 6.23.10, x4, 3000Mhz, 7985MB
[000:000] Computer model: Not available
RSS Plugin enabled (evolution 3.8, evolution-rss 0.3.93)
(evolution:24847): Gtk-CRITICAL **: gtk_toggle_button_set_active: assertion `GTK_IS_TOGGLE_BUTTON (toggle_button)' failed
(evolution:24847): GLib-GObject-WARNING **: invalid (NULL) pointer instance
(evolution:24847): GLib-GObject-CRITICAL **: g_signal_connect_data: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed
feed University Directory | Physical Sciences degree feed
feed Academic jobs in North West England
feed (IUCr) Jobs
(evolution:24847): Gtk-CRITICAL **: gtk_entry_set_text: assertion `text != NULL' failed
(evolution:24847): Gtk-CRITICAL **: gtk_entry_set_text: assertion `text != NULL' failed
(evolution:24847): Gtk-CRITICAL **: gtk_entry_set_text: assertion `text != NULL' failed
(evolution:24847): Gtk-CRITICAL **: gtk_entry_set_text: assertion `text != NULL' failed
(evolution:24847): Gtk-CRITICAL **: gtk_entry_set_text: assertion `text != NULL' failed
(evolution:24847): camel-WARNING **: Could not open converter for 'unicode-1-1-utf-7' to 'UTF-8' charset

Comment 6 Milan Crha 2013-11-08 11:06:11 UTC

No problem with reopen, I'd like to figure this out too, rather than blindly closing the bug.

Could you install evolution-3.8.5-2 and evolution-data-server-3.8.5-6 (the later is part of updates testing), and use it for some time, please? These are the latest 3.8.5 versions.

Comment 7 Milan Crha 2013-11-08 11:09:51 UTC

(I'm sorry, I accidentally submitted my comment before actually finishing it)

Could you get updated backtrace, from the updated versions (please make sure the debug info packages will have exactly the same version as the binary packages), to see it with current code line numbers, and if anything changed, please?

Comment 8 J 2013-11-08 11:21:14 UTC

Not a problem.

Plan

1) remove mh folder from evolution
2) clear all ibex/cmate files
3) upgrade evolution including testing repo
4) start evolution run as normal
5) add mh folder back into evolution
6) test for "some time" if problems arise post backtrace

Will update as required.

Thanks for your time and effort.

Comment 9 Milan Crha 2013-11-08 12:56:25 UTC

(In reply to J from comment #8)
> ... add mh folder ...

I have couple questions, to get an overview of the underlying data:
a) does the folder have subfolders?
b) how many messages are in the folder (and each subfolder, if any)

Comment 10 J 2013-11-08 14:41:28 UTC

6 folders with no subfolder below them:

Mail
./Spam      0
./draft     3
./trash     0
./queue     0
./sent:   959
./inbox: 7809

I had initially thought it was related to a subfolder I had created using evolution in the inbox but after its removal and the messages relocated to the inbox (via claws-mail) I found there was no difference to Evolutions behaviour. It seemed to be linked to its update (which also included the bogofilter and perl updates). Perl has now been removed and I will try removing bogo as well if I encounter further CPU hogging as well as the above backtrace. Unfortunately the backtrace requires 1.1GB of files to be downloaded so I really need to wait until night before doing that.

Comment 11 Milan Crha 2013-11-11 08:32:22 UTC

(In reply to J from comment #10)
> Mail
> ./Spam      0
> ./draft     3
> ./trash     0
> ./queue     0
> ./sent:   959
> ./inbox: 7809

It's not that many. I will try to reproduce it here too by the end of this week (maybe today, but I do not want to promise something what i would not accomplish).

> Unfortunately the backtrace requires 1.1GB of files to be
> downloaded so I really need to wait until night before doing that.

Do not install everything, just evolution-data-server and evolution is the most interesting debuginfo in this stage.

Comment 12 J 2013-11-11 09:09:40 UTC

Hmmm I've already uninstalled perl-evolution-plugin.

I must admit so far I'm not seeing any big CPU usage. Famous last words.

Comment 13 Milan Crha 2013-11-11 16:53:55 UTC

I tried to reproduce this, with about 3 folder, each with more that 10K messages in a MH account, and even I see higher CPU usage when I Send/Receive this account, it's nothing significant, it lasts only for a fraction of a second.

I'm wondering whether SQLite's folders.db file doesn't play its role here too. Mine is surely different, because I just filled the account, and I do not use it actively, thus there are basically no changes, but that yours can be different.

The
   ~/.local/share/evolution/mail/<MH account UID>/
folder contains folders.db file. Could you check its size, please? Could you create a backup copy of it and run this command on it:
   $ sqlite3 folders.db "vacuum;"
and see how much the size will differ?

Comment 14 J 2013-11-11 17:22:00 UTC

Hmmm this is going to be difficult my folder has 33 number.number two of which show writes for today.

One shows recent access and is 4mb in size. The others are around 20/12k. Interestingly the trash is 16mb which is odd since all the Waste baskets have been purged?

Is there any easy way to confirm which folder is the account I actually want?

Actually scrub that I think by watching the time stamp on the folder and correlating that with me clicking on the inbox of the mh account I have figured it out.

vacuuming the db
Before:
4.1M	folders.db

After:
3.7M	folders.db

I did not earlier today that evolution was pulling about 25% cpu and reindexing the mh folder and that seemed to happen shortly after I had spawned a new evolution by accessing the calendar in Gnome - not certain if that is related.

Also odd the trash folder is so large.

Comment 15 J 2013-11-11 17:33:13 UTC

Yeap I just triggered evolution to start reindexing the mh folder by running a second instance of evolution spawned from the calendar in Gnome - clock.

The CPU usage is not pushing as hard as it was when I first noticed this issue. As it pretty much made evolution unusable. So it may be related but not the same issue? Or perhaps due to the updated backend server?

Quiting the second instance has frozen both "evolutions" and the cpu continues to run between 10 and 25%.

A fresh evolution can still be spawned and is useable. I can see the folder message index rising. Oddly it is showing them as unread but I previously marked all as read in the inbox folder.

Comment 16 J 2013-11-11 17:36:59 UTC

Looking at the cmdline output seeing lots of:

(evolution:10518): camel-WARNING **: Could not find key entry for word 'please': No such file or directory
(evolution:10518): camel-WARNING **: Block 9fc800: Invalid index or content: index 70 used 0
(evolution:10518): camel-WARNING **: Could not find key entry for word 'subject': No such file or directory
(evolution:10518): camel-WARNING **: Block 9fc800: Invalid index or content: index 70 used 0
(evolution:10518): camel-WARNING **: Could not find key entry for word 'subject': No such file or directory
(evolution:10518): camel-WARNING **: Block 92f400: Invalid index or content: index 8 used 0

and now

(evolution:10518): camel-WARNING **: Block '248832' still referenced
(evolution:10518): camel-WARNING **: Block '243712' still referenced
(evolution:10518): camel-WARNING **: Block '231424' still referenced
(evolution:10518): camel-WARNING **: Block '227328' still referenced
(evolution:10518): camel-WARNING **: Block '226304' still referenced
(evolution:10518): camel-WARNING **: Block '216064' still referenced
(evolution:10518): camel-WARNING **: Block '221184' still referenced

Comment 17 J 2013-11-11 17:48:00 UTC

Sorry to spam you like this.

In the directory containing the folder.db previously vacuumed I now have:

4.4M	folders.db
12K	folders.db-journal

after all that?

Starting evolution up the db-journal file has been removed.

However the database is now:
4.4M	folders.db

Running a vacuum then reduces the folder to:
4.2M	folders.db

Hope this helps.

Comment 18 J 2013-11-12 14:23:09 UTC

Hi

I've taken my laptop out of "sleep mode" this has triggered evolution to rebuild the mh directory. It is definitely using less cpu than the 100% previously somewhere in the order of 10 to 25%.

If the mh 100% cpu usage was linked to rebuilding/reindexing the mh folder then something must have changed to reduce the load since I first reported it. I'll try and generate a backtrace using gdb and evolution. Then trigger the rebuild either via having two evolutions open via the calendar issue reporting above or by coming out of "sleep mode" - opening laptop lid.

Comment 19 J 2013-11-12 17:00:49 UTC

So far my backtrace is looking like:
[New Thread 0x7ffff0ff4700 (LWP 2351)]
..................
[New Thread 0x7fff3b13c700 (LWP 2895)]
[Thread 0x7fff3b13c700 (LWP 2895) exited]


After triggering the mh folder to be reindexed by having two evolutions open at once. One in gdb and the second via accessing the calendar via the clock in Gnome.

It is certainly more usable than before but it is a pain it reindexing the entire mh and marking them as (new messages) when you have previously marked all as read.

I am currently continuing to install debug files. Lots of them!

I will also try going back to the older versions if that would help?

Comment 20 J 2013-11-12 17:26:14 UTC

To be honest I'm not entirely certain what I am looking for as none of the threads appear to trigger the 100% CPU utilisation now. If I bt on the threads 

thread apply all bt full

I get things like:
Thread 20 (Thread 0x7fffa97fa700 (LWP 3449)):
---Type <return> to continue, or q <return> to quit---
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x00000035e4c6f536 in WTF::TCMalloc_PageHeap::scavengerThread (
    this=0x35e4f72580 <WTF::pageheap_memory>)
    at Source/WTF/wtf/FastMalloc.cpp:2763
No locals.
#2  0x00000035e4c6f569 in WTF::TCMalloc_PageHeap::runScavengerThread (
    context=<optimized out>) at Source/WTF/wtf/FastMalloc.cpp:1933
No locals.
#3  0x00000035b9207c53 in start_thread (arg=0x7fffa97fa700)
    at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7fffa97fa700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140736037103360, 
                5704227897636589898, 1, 140736037104064, 140736037103360, 
                231474814368, -5704319154018807478, 5711144096764694858}, 
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, 
            data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
Or

bt full
#0  0x00000035b8aeb7fd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x00000035bc2480f4 in g_main_context_poll (priority=2147483647, n_fds=5, 
    fds=0xb73ba0, timeout=33, context=0x694440) at gmain.c:3995
        poll_func = 0x35bc256d50 <g_poll>
#2  g_main_context_iterate (context=0x694440, block=block@entry=1, 
    dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3696
        max_priority = 2147483647
        timeout = 33
        some_ready = <optimized out>
        nfds = 5
        allocated_nfds = 5
        fds = 0xb73ba0
#3  0x00000035bc24855a in g_main_loop_run (loop=0xee9330) at gmain.c:3895
        __PRETTY_FUNCTION__ = "g_main_loop_run"
#4  0x00000037b478a52d in gtk_main () at gtkmain.c:1156
        loop = 0xee9330
#5  0x00000000004038b9 in main (argc=1, argv=0x7fffffffd5c8) at main.c:707
        shell = 0x836000
        settings = <optimized out>
        error = 0x0

Comment 21 Milan Crha 2013-11-13 10:02:43 UTC

Oh, I did not write it here yet, my fault. Just install debugionfo for evolution-data-server nd evolution, then run evolution, trigger the index rebuild (I agree it should not rebuild it that often, neither lost your unread counts - it makes me wonder whether the corresponding folders.db file is not broken for some reason, I would just move it away and let evolution recreate it from scratch), and then get backtraces for all threads with this command:
   $ gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt

Make sure you'll search for passwords and other private things in the backtrace, before attaching it here, because these can be there (I usually search for "pass" (quotes for clarity only) in the file).

Comment 22 J 2013-11-14 16:28:25 UTC

Created attachment 824032 [details]
BT 1

Attached is a backtrace using instructions posted previously. With folder.db removed. Evolution allowed to rebuild mh directory structure and then closed.

Comment 23 J 2013-11-14 16:31:20 UTC

Created attachment 824033 [details]
BT 2

Attached is a backtrace using instructions posted previously. With newly built folder.db left in place. Evolution allowed to start and shortly after a second evolution spawned from the calendar function via Gnome triggering a rebuild of the mh directory structure and then the calendar was closed and evolution closed after directory finished rebuilding.

Note that the rebuild was significantly quicker in the last two cases than previously and was about 2 mb smaller on disk.

Comment 24 Milan Crha 2013-11-18 11:42:23 UTC

(In reply to J from comment #22)
> BT 1

This one looks clean. There is some activity on your IMAP accounts, but otherwise the other threads are idle. At least they were in time of the backtrace capture.

(In reply to J from comment #23)
> BT 2

Nice, this shows the mh folder refresh.

I moved this upstream as [1], please see it for any further updates. The backtrace you uploaded will help in investigation, thanks for it. If possible, please CC yourself there, in case upstream developers will have additional questions.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=712578

Comment 25 Milan Crha 2013-11-22 14:51:47 UTC

I tried to reproduce, and, of course, it works fine here. The login in the code is pretty much simple, the file is (re-)added to the summary only if it was not added to it previously, or if a reindexing was required. Is it possible that some other software renamed the files, which technically invalidated the evolution's summary, causing all messages look like new? Evolution relies on message names here.

Comment 26 J 2013-11-22 17:00:40 UTC

The only programs I can think of are "evolution" when a second instance is called this seems to trigger the reindexing, gnome's "tracker"/"desktop search", recoll or clamav.

claws-mail access is not correlated with issues since it has never been running or been running when a reindexing has been caused/observed. Where evolution has been. Tracker, recoll and clamav all run in the background but should not alter a file.

I can reproducibly trigger reindexing by accessing evolution from the gnome-calendar when an existing instance of evolution is running in mail view.

It does appear that reindexing of the mh folder now however is significantly less intrusive than it was in previous versions.

I don't know if this could be related but I have the mh account receiving settings set to "NOT" check for new mail. That is to say the check for new mail tick box is not ticked. Could this combined with a second evolution instance trigger some initial start up check?

Thanks for all your work and help on this. It is already many, many times more usable than it was previously.

Comment 27 Milan Crha 2013-11-27 13:33:07 UTC

Thanks for your help with the investigation. I could find how to reproduce this, but I finally managed to reproduce this by invoking Refresh context menu item above the MH folder. This happens only for messages which were added after initial reindexing on the folder, which might be all for your folder, but they are almost none for me. I'm going to investigate further what's going wrong here.

Comment 28 Milan Crha 2013-11-27 15:30:38 UTC

I'm not sure I hit the same issue as you have, but I committed a fix upstream. I built a test package [1] for you, just download your version from [1] and update it on your machine. Then run evolution. It may do the reindex for the first time, but the second+ time should be fine. Before you install the test package, try to reproduce by invoking Refresh popup menu option above an MH folder, for an easier reproducer.

[1] http://koji.fedoraproject.org/koji/taskinfo?taskID=6232099