Bug 124512

Summary: Evolution hangs, consuming excessive memory
Product: Red Hat Enterprise Linux 4 Reporter: Phil Schaffner <philip.r.schaffner>
Component: gtkhtml3 Assignee: Matthew Barnes <mbarnes>
Status: CLOSED WONTFIX QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: 4.0 CC: fortran, jan.iven, jturner, lordmorgul, philip.r.schaffner, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard: RHEL4U3NAK
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-09-05 20:07:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
full gdb backtrace on looping evolution none
gdb backtrace none
gdb backtrace for comment28 none
gdb backtraces while eating memory none

Description Phil Schaffner 2004-05-27 03:57:10 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040510

Description of problem:
Seem to have encountered an evolution memory leak or other excessive
memory/swap usage.  Seemingly "froze" while editing a message, X
became VERY sluggish, evolution windows appeared blank after having
been covered by other windows, did not update or respond.  Top showed:

top - 23:13:59 up  2:27,  8 users,  load average: 1.66, 0.96, 0.45
Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.6% us,  3.0% sy,  0.0% ni,  0.0% id, 91.1% wa,  0.3% hi,  0.0% si
Mem:    777184k total,   772976k used,     4208k free,    37480k buffers
Swap:  2582792k total,  1063096k used,  1519696k free,    12972k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3786 prs       15   0 1095m 642m  34m D  5.6 84.6   4:21.82 evolution
    9 root      15   0     0    0    0 S  0.7  0.0   0:03.69 kswapd0
 3469 root      15   0 77516 3504  62m S  0.7  0.5   4:03.59 X
 5265 root      17   0  2972  936 1620 R  0.7  0.1   0:00.07 top
 3685 prs       15   0 28536 3096  25m S  0.3  0.4   0:07.86 kdeinit
 3690 prs       16   0 15580 2976  13m S  0.3  0.4   0:10.37 jpilot
    1 root      16   0  2448  248 1316 S  0.0  0.0   0:05.44 init
    2 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    3 root       5 -10     0    0    0 S  0.0  0.0   0:00.11 events/0
    4 root       5 -10     0    0    0 S  0.0  0.0   0:00.00 kblockd/0
    6 root       6 -10     0    0    0 S  0.0  0.0   0:00.01 khelper
    5 root      15   0     0    0    0 S  0.0  0.0   0:00.05 khubd
    7 root      15   0     0    0    0 S  0.0  0.0   0:00.10 pdflush
    8 root      15   0     0    0    0 S  0.0  0.0   0:02.34 pdflush
   10 root      13 -10     0    0    0 S  0.0  0.0   0:00.00 aio/0
  130 root      16   0     0    0    0 S  0.0  0.0   0:00.00 kseriod
  170 root      25   0     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_0
  171 root      15   0     0    0    0 S  0.0  0.0   0:00.00 ahc_dv_0
  187 root      15   0     0    0    0 S  0.0  0.0   0:00.20 kjournald

Disk activity was (naturally) very high.  After about 15 minutes, did
killev and restarted:

top - 23:29:51 up  2:43,  8 users,  load average: 0.00, 0.18, 0.36
Tasks: 122 total,   1 running, 121 sleeping,   0 stopped,   0 zombie
Cpu(s): 17.2% us,  2.3% sy,  0.0% ni, 80.5% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:    777184k total,   206788k used,   570396k free,    42572k buffers
Swap:  2582792k total,    74144k used,  2508648k free,    54412k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3469 root      16   0 77516 8880  62m S  9.9  1.1   4:19.52 X
 5292 prs       15   0  158m  28m  28m S  5.6  3.8   0:13.24 evolution
 3673 prs       15   0 29628 5300  26m S  0.7  0.7   0:11.04 kdeinit
 5347 root      16   0  2744  936 1620 R  0.7  0.1   0:00.17 top
 3550 prs       15   0 26688 2328  24m S  0.3  0.3   0:01.29 kdeinit
 3664 prs       15   0 26600 4248  24m S  0.3  0.5   0:06.85 kdeinit
 3668 prs       16   0 25708 1980  23m S  0.3  0.3   0:00.21 kdeinit
 3685 prs       15   0 28536 3060  25m S  0.3  0.4   0:10.32 kdeinit
 3687 prs       16   0 29128 4404  26m S  0.3  0.6   0:00.74 kdeinit
 3689 prs       15   0 29216 4176  26m S  0.3  0.5   0:05.20 kdeinit
    1 root      16   0  2448  240 1316 S  0.0  0.0   0:05.44 init
    2 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    3 root       5 -10     0    0    0 S  0.0  0.0   0:00.12 events/0
    4 root       5 -10     0    0    0 S  0.0  0.0   0:00.01 kblockd/0
    6 root       6 -10     0    0    0 S  0.0  0.0   0:00.01 khelper
    5 root      15   0     0    0    0 S  0.0  0.0   0:00.05 khubd
    7 root      15   0     0    0    0 S  0.0  0.0   0:00.10 pdflush
    8 root      15   0     0    0    0 S  0.0  0.0   0:02.35 pdflush
   10 root      13 -10     0    0    0 S  0.0  0.0   0:00.00 aio/0

Working OK after restart.  Recovered message with minor loss of input.
Have seen similar problems before, but usually recovered after a
short period.


Version-Release number of selected component (if applicable):
evolution-1.5.8-2

How reproducible:
Sometimes

Steps to Reproduce:
1. Run evolution
2. Edit message

Actual Results:  Heavy memory/swap usage.

Expected Results:  Run in a reasonable address space.

Additional info:

Comment 1 Phil Schaffner 2004-05-28 19:05:22 UTC
Problem is much worse (as might be expected) on my notebook with 256M
of RAM than on the home system with 768M that was the source of the
original report.  Have not seen it on main workstation system with 3G.

Notebook system is virtually unusable, freezing every few minutes. 
Forgot to note in the text that I have only noticed the problem when
editing a message, although that is covered in "Steps to Reproduce".


Comment 2 Andrew Farris 2004-05-31 09:26:15 UTC
I had this same issue when the first 1.5.7 rpms were built (when Dave
Malcolm took over building them) and got nowhere in debugging it.
The problem happened around 10 times and then seemed to disappear (I
assumed a newer gtkhtml3 fixed it, but I could never reproduce it again
after I noticed it was gone).  I posted this bug but, as the comment
there says, it doesn't help.

http://bugzilla.ximian.com/show_bug.cgi?id=57930

Comment 3 Phil Schaffner 2004-06-21 19:38:37 UTC
Well, happening right now on the 3GB system with evolution-1.5.9.1-2

top - 15:38:18 up 3 days,  7:05,  6 users,  load average: 2.02, 1.66, 0.85
Tasks: 152 total,   1 running, 150 sleeping,   0 stopped,   1 zombie
Cpu(s):  1.2% us,  2.6% sy,  0.0% ni, 47.5% id, 48.2% wa,  0.0% hi,  0.5% si
Mem:   3116308k total,  3107116k used,     9192k free,    26852k buffers
Swap:  2096440k total,  1465444k used,   630996k free,   172172k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   41 root      15   0     0    0    0 S  3.6  0.0   0:08.10 kswapd0
24356 root      15   0 89032 6840  76m S  1.3  0.2   6:53.66 X
24566 prs       15   0 30012 4664  26m S  1.0  0.1   0:03.01 kdeinit
 2447 prs       15   0 3789m 2.6g  32m D  1.0 87.8   1:54.01 evolution
12123 prs       16   0  2744  960 1620 R  1.0  0.0   0:00.16 top
 2305 root      15   0 10116 1784 7596 S  0.3  0.1  10:13.37 nmbd
24588 prs       16   0 16704 3832  13m S  0.3  0.1   0:34.67 jpilot
    1 root      16   0  1916  436 1316 S  0.0  0.0   0:01.27 init


Comment 4 Dave Malcolm 2004-09-22 18:44:51 UTC
Can you reproduce the problem with the latest evolution packages?
Evolution 2.0.0 is now in Rawhide.

Comment 5 Ricardo Veguilla 2004-09-23 15:33:25 UTC
I have a similar problem with evolution-2.0.0-2 from Rawhide. I have
several IMAP accounts and the system mail (/var/spool/mail/).
Everything works fine except that when I try to read any mail from the
system mail folder, CPU usage goes to 100% and evolution freezes. I'll
try to look for more info and possibly file a separate bug.

Comment 6 Phil Schaffner 2004-09-30 20:02:04 UTC
Well, but the bullet and upgraded my workstation to FC3T2.  On the
second day of use, again encountered this evolution but.  Evolution
(evolution-2.0.0-2) and the system became unresponsive.  Top shows:

top - 14:20:35 up 1 day, 23:31,  9 users,  load average: 0.70, 0.35, 0.18
Tasks: 161 total,   2 running, 159 sleeping,   0 stopped,   0 zombie
Cpu(s): 37.7% us, 37.9% sy,  0.3% ni, 22.4% id,  1.0% wa,  0.0% hi,  0.7% si
Mem:   3112984k total,  3111048k used,     1936k free,    52228k buffers
Swap:  4128652k total,   293332k used,  3835320k free,    69476k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6792 prs       18   0 2804m 2.6g  33m D 91.7 88.0   5:20.65 evolution
   42 root      16   0     0    0    0 R 53.1  0.0   0:08.61 kswapd0
 2919 root      15   0  9052  764 7576 S  1.3  0.0   9:59.57 nmbd
18985 root      15   0  117m  32m  81m S  1.3  1.1  42:34.51 X

Problem again happened while editing a message.  Killed evolution,
restarted, and things seem to be working again - until the next time
if past experience is a guide.  Tends to bite once a day or more.


Comment 7 Phil Schaffner 2004-09-30 20:05:00 UTC
Can't type today.  Probably obvious, but the above should read: "Well,
bit the bullet and upgraded my workstation to FC3T2.  On the
second day of use, again encountered this evolution bug."

Comment 8 Dave Malcolm 2004-10-01 19:33:42 UTC
Thanks for the reports. Looks like there's a memory leak somewhere; am
I right in thinking it happens every now and then when you start
composing an email?  I've had a go at tracking this down by running
Evolution under valgrind but couldn't reproduce it.

What version of gtkhtml3 are you using?

Does it happen as you are typing in the main composer window, or when
editing the To/CC fields, or does other activity trigger it?  Or is it
a gradual thing that builds up?   If you set up the system monitor to
view memory usage, can you identify when the memory usage increases,
and get an idea of what might be causing it that way?

Have you got autocompletion turned on? (there used to be a memory leak
in the LDAP autocompletion code, though I believe that's been fixed now)
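
The memory monitoring suggested above could also be scripted rather than watched by eye. The following is a minimal sketch, assuming a Linux /proc filesystem and Python 3.6 or later; the function names, the one-second interval and the threshold are illustrative choices, not anything taken from Evolution or gtkhtml3. It samples the resident set size of a process and reports the moments at which it jumped, which can then be matched against what was being done in the composer at the time.

```python
import time


def read_rss_kb(pid):
    """Return the resident set size of `pid` in kB, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def find_jumps(samples, threshold_kb):
    """Return (timestamp, growth_kb) for each sample that grew by more than threshold_kb."""
    jumps = []
    for (_, prev_rss), (when, rss) in zip(samples, samples[1:]):
        growth = rss - prev_rss
        if growth > threshold_kb:
            jumps.append((when, growth))
    return jumps


def monitor(pid, interval_s=1.0, count=60):
    """Sample the RSS of `pid` `count` times, `interval_s` seconds apart."""
    samples = []
    for _ in range(count):
        rss = read_rss_kb(pid)
        if rss is not None:
            samples.append((time.time(), rss))
        time.sleep(interval_s)
    return samples
```

For example, find_jumps(monitor(pid), 50000) would list every one-second step in which the process grew by more than roughly 50 MB, where pid is the process ID of the running evolution process. If the process exits while being monitored, open() raises FileNotFoundError and the sketch simply stops.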

Comment 9 Phil Schaffner 2004-10-04 13:18:47 UTC
Generally seems to happen after working on a message for some time
in the compose window, not at the start of composing; although, if it
has something to do with the process of composing, doing it longer
presumably gives more chance of encountering the bug.  Seems to build up
rapidly, not gradually; however, will try monitoring memory usage.

$ rpm -q gtkhtml3
gtkhtml3-3.3.2-3

Autocompletion is on for the Personal folder only.


Comment 10 Andrew Farris 2004-10-19 09:18:08 UTC
The 'triggers' you describe as setting off the bug are identical to those of
the 1.5.7 bug I was experiencing.  It always began while editing an email,
normally after I had had it open for a period of time.  I do not think it
depended strictly on the size of the email: several times I had a new or reply
email open with only a few lines typed in the body.  I'm fairly certain it never
happened when editing any fields other than the body; however, it is hard to say
when it really started.

I'm surprised I haven't come across it again.  Adding to CC and I'll see if I
can trigger it again.

Comment 11 Phil Schaffner 2004-11-30 15:32:09 UTC
Have been running evolution-2.0.2-3 (FC3 version + dependencies yum
pulled in from FC3) successfully on FC2 for some weeks.  Seems to be
fixed.  Closing bug.

Comment 12 Dave Malcolm 2004-11-30 19:38:49 UTC
BTW, to what extent were you using calendars?  Were you using the
Exchange connector?

Comment 13 Andrew Farris 2004-11-30 21:24:36 UTC
In my experience with the bug I was never using connector, and hardly
ever used the calendar at all.

Comment 14 Phil Schaffner 2004-12-01 16:09:30 UTC
Not using calendar nor Exchange connector.

Comment 15 Phil Schaffner 2005-02-23 19:49:35 UTC
It's back.  Apparently closed prematurely.  Have had several cases of
excessive CPU/memory usage hanging the system this week.  FC3 on
x86_64 evolution-2.0.2-3 kernel-smp-2.6.10-1.766_FC3 

Comment 16 Phil Schaffner 2005-02-23 19:53:00 UTC
Should add - again, always while editing a message.


Comment 17 Phil Schaffner 2005-06-22 14:24:09 UTC
Bug still present in evolution-2.0.2-16.x86_64 on EL4, kernel-2.6.9-11.ELsmp. 
System (with 4GB RAM) became so sluggish due to swapping that the GUI was
unresponsive.  Had to log in remotely via ssh to kill evolution before I could
regain control.  Just finished upgrading to evolution-2.0.4-4  - rebuilt (along
with requires/depends) from FC3 SRPMS - in an attempt to escape this major
annoyance.

Comment 18 Jan Iven 2005-10-04 08:11:47 UTC
Created attachment 119579 [details]
full gdb backtrace on looping evolution

Comment 19 Jan Iven 2005-10-04 08:12:56 UTC
Me too: evolution sometimes starts consuming memory quickly while editing a new
mail message, typically triggered by typing too fast. gdb shows it is looping
somewhere around html_link_dup() (see attachment).

A breakpoint on malloc shows:

#0  0x00a05b36 in malloc () from /lib/tls/libc.so.6
#1  0x00d47963 in g_malloc (n_bytes=8) at gmem.c:136
#2  0x059dc35f in html_text_free_attrs () from /usr/lib/libgtkhtml-3.1.so.11
#3  0x059dc48d in html_link_dup () from /usr/lib/libgtkhtml-3.1.so.11
#4  0x059ca3cb in html_object_copy () from /usr/lib/libgtkhtml-3.1.so.11
#5  0x059ca403 in html_object_dup () from /usr/lib/libgtkhtml-3.1.so.11
#6  0x059dde25 in html_text_convert_nbsp () from /usr/lib/libgtkhtml-3.1.so.11
#7  0x059c9d72 in html_object_split () from /usr/lib/libgtkhtml-3.1.so.11
#8  0x059a70fd in html_engine_reset_blinking_cursor ()

evo keeps coming back to the breakpoint with this backtrace, so it looks like
something is allocating 8 bytes in a quick loop.
Versions:
gtkhtml3-3.3.2-3
evolution-2.0.2-16.3
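
Checking that the breakpoint keeps hitting the same stack can be done mechanically when many backtraces have been captured. This is a minimal sketch, assuming the backtraces were saved as plain text in gdb's usual "#N  0xADDR in function (args)" layout; the helper names are hypothetical and not part of gdb or gtkhtml3. It reduces each backtrace to its list of function names so that two captures can be compared regardless of addresses or argument values.

```python
import re

# Matches gdb frame lines such as:
#   #3  0x059dc48d in html_link_dup () from /usr/lib/libgtkhtml-3.1.so.11
# The address part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def frame_functions(backtrace_text):
    """Return the function names of a gdb backtrace, innermost frame first."""
    names = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def same_stack(first, second):
    """True if two backtraces pass through the same functions in the same order."""
    return frame_functions(first) == frame_functions(second)
```

Applied to the backtrace above, frame_functions would return the names from malloc down to html_engine_reset_blinking_cursor; same_stack on two successive captures then confirms (or refutes) that the loop always allocates from the same call path.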


Comment 20 Dave Malcolm 2005-10-04 16:28:41 UTC
Thanks for the updated information, and for these backtraces: it looks like they
isolate a cause (the cause?) of the problem.

I've changed the "Product" field from Fedora Core to Red Hat Enterprise Linux
(although it affects FC3), and I've marked this bug for consideration in a
future update; I'm keen to get this one fixed (in both EL4 and FC3).

Comment 24 Dave Malcolm 2005-11-17 02:04:42 UTC
Moving from "evolution" to "gtkhtml3" as I'm fairly sure this is a bug in the
latter package.

A couple of thoughts based on Comment 19: 
(i) Looks like the cursor flash might be introducing a time-dependent element to
this bug, which is why it's been so hard to come up with a reliable way of
reproducing it.  
(ii) Comment 19 also suggests you have to be typing fast to trigger the problem.

I've tried patching gtkhtml3 to greatly speed up the cursor flash and to add
various debug instrumentation, and then tried various torture tests on the code,
but I haven't come up with a good reproducer yet.  

If anyone can come up with more backtraces, please attach them to this bug,
specifying the version of evolution and gtkhtml3 in use.  If possible, please
generate them using "t a a bt" in gdb, and try to get all of the output.
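
Because the desktop may be too sluggish to drive gdb interactively once the problem starts, the capture could be scripted and run from a remote shell. This is a minimal sketch, assuming gdb is installed and the caller is allowed to attach to the process; the function names and the output path are illustrative only. It runs gdb in batch mode with "thread apply all bt" (the long form of "t a a bt") and writes the complete output to a file for attaching to this bug.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps all thread backtraces of `pid`."""
    return [
        "gdb",
        "-batch",                      # exit once the commands have run
        "-p", str(pid),                # attach to the running process
        "-ex", "thread apply all bt",  # long form of "t a a bt"
    ]


def save_backtraces(pid, out_path):
    """Run gdb against `pid` and write its complete output to `out_path`."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep gdb warnings together with the backtraces
        universal_newlines=True,
    )
    with open(out_path, "w") as out:
        out.write(result.stdout)
    return result.returncode
```

A call such as save_backtraces(pid, "evolution-backtrace.txt"), with pid being the process ID of the runaway evolution process, would leave the whole output in one file, which avoids losing part of it to terminal scrollback.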

What GTK input methods are you using when you see the bug? (If you right-click
in the composer, have a look in the context menu's Input Methods submenu: are you
using Default?)

I'm trying typing quickly and slowly, pasting in large and small HTML fragments
(especially those containing nbsp entities and URLs), and generally abusing the
email composer.

Further suggestions/hints/hunches on ways to get the bug to manifest reliably
would be most welcome; I'm continuing the investigation and will try some more
angles on this tomorrow.

Thanks.


Comment 26 Jan Iven 2006-01-09 09:54:42 UTC
I have not been able to reproduce the memory leak with 
gtkhtml3-3.3.2-6.EL
evolution-2.0.2-22

Possibly related to your gtkhtml changes?
After cutting and pasting a largish HTML chunk (30 paragraphs, 2888 words, 19725 bytes
of Lorem Ipsum, from http://www.lipsum.com), I noticed that typing more text
into the HTML block became rather sluggish, with X taking 50% of CPU. The cursor
blinked frantically, and it became rather easy to type faster than the output made
it to the screen (so after I stopped hitting keys, a few lines of text would
still appear by themselves). If this does not sound related, please disregard.


Comment 27 Jan Iven 2006-01-25 16:53:12 UTC
Created attachment 123679 [details]
gdb backtrace

backtrace of gdb attached to rampant "evolution" process

Comment 28 Jan Iven 2006-02-06 09:28:06 UTC
Created attachment 124247 [details]
gdb backtrace for comment28

Comment 29 Jan Iven 2006-02-06 09:29:41 UTC
One more memleak occurrence, again while quickly typing a reply.

 PID TTY      STAT   TIME  MAJFL   TRS   DRS  RSS %MEM COMMAND
4070 ?        Tsl    1:42   2295   134 671237 169224 33.4 evolution-2.0
--sm-config-prefix /evolution-2.0-Ff3xdu/ --sm-client-id
117f000001000113898709100000037100020 --screen 0

gdb backtrace attached, see above.

I am using the "Default" input method.

Comment 31 Jan Iven 2006-03-22 10:30:01 UTC
Created attachment 126463 [details]
gdb backtraces while eating memory

Occurred while trying to delete (mark with mouse, hit "Del") a snippet of HTML
previously cut and pasted from a different mail.

Comment 32 Jan Iven 2006-03-22 10:32:43 UTC
Still happens with RHEL4U3:
evolution-2.0.2-27
gtkhtml3-3.3.2-6.EL

Could somebody please change the Status? NEEDINFO_REPORTER no longer applies, and I
don't know what else to report.

Comment 36 Phil Schaffner 2006-08-22 14:49:17 UTC
Neither do I.  Tried changing status to "NEEDINFO/ASSIGNEE" but was prevented
by BZ despite being the reporter.  Still experiencing the bug with some
regularity.  Some meaningful change in status or indication of attention to the
problem would be nice.   Maybe in RHEL4U5?

Comment 37 RHEL Program Management 2006-09-05 19:57:54 UTC
The component this request has been filed against is not planned for inclusion
in the next update. The decision is based on weighting the priority and number
of requests for a component as well as the impact on the Red Hat Enterprise
Linux user-base: other components are considered having higher priority and the
number of changes we intend to include in update cycles is limited.

Comment 38 RHEL Program Management 2006-09-05 20:07:05 UTC
Product Management has reviewed and declined this request.  You may appeal this
decision by reopening this request.