Description of problem:
The number of child processes that Thunderbird has grows over time, until the user reaches the maximum number of processes and no new processes can be forked by the user at all. This prevents the user from unlocking the desktop or logging in over ssh.

Version-Release number of selected component (if applicable):
thunderbird-10.0.1-2.fc16.x86_64
thunderbird-lightning-1.2.1-1.fc16.x86_64

Steps to Reproduce:
1. Start thunderbird.
2. Read and send email.
3. Watch the number of child processes increase over time.

Expected results:
Thunderbird's child processes should exit once they have stopped doing useful work, leaving no more than a few alive at a time.
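For reference, the growth described above can be watched with a small script. This is only an illustrative sketch, assuming Linux: it counts the entries under /proc/&lt;pid&gt;/task, where each thread (LWP) of Thunderbird appears; some tools report these LWPs as separate processes, which is why the count looks like "child processes".

```python
import os
import time

def thread_count(pid: int) -> int:
    """Count the threads (LWPs) of a process via /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))

def watch(pid: int, interval: float = 60.0) -> None:
    """Print the thread count periodically so growth over time is visible."""
    while True:
        print(time.strftime("%H:%M:%S"), thread_count(pid))
        time.sleep(interval)
```

Calling `watch()` with Thunderbird's pid (found e.g. via pidof, not included here) would log one sample per minute.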
Do you encounter this problem with latest thunderbird-11.0.1 ?
Yes, the issue is still happening with the latest packages:
thunderbird-11.0.1-1.fc16.x86_64
thunderbird-lightning-1.3-3.fc16.x86_64
Please look at the debugging instructions for an application freeze and attach a stack trace: https://fedoraproject.org/wiki/Debugging_guidelines_for_Mozilla_products#Application_freeze Also check which addons and plugins aside from lightning you have installed. If there are any, follow: https://fedoraproject.org/wiki/Debugging_guidelines_for_Mozilla_products#Reporting_addons_and_plugins_issues Thanks.
Jan, the application does not hang; it simply keeps forking child processes until the system runs out of processes that can be forked, which leaves the entire system stuck! I have one extension enabled in Thunderbird (lightning) and one plugin (GNOME Shell integration). I have disabled all the other extensions and plugins.
By the way, "Test Pilot for Thunderbird" was enabled before, as were a bunch of web plugins (which I never enabled! Does Thunderbird automatically pick those up from Mozilla?). I have disabled all of those now and will let you know if the problem happens again. If it goes away, Thunderbird needs to be fixed to not automatically enable all the Firefox plugins for itself.
OK, with only lightning and gnome shell integration still enabled, thunderbird is still leaking threads. After running for just a few hours, reading emails and having lightning fetch calendar info, the number of thunderbird threads has reached 35. This is a fairly typical number and I expect it to increase over time (two hours ago there were 26 threads).
What I meant is to take a stack trace when Thunderbird has that many threads. This should show us which threads are running and what started them. The steps are similar to the 'Application freeze' steps, which is why I mentioned them. Sorry for the confusion. Test Pilot is okay; Mozilla ships it too. Plugins are shared with Firefox (e.g. /usr/lib64/mozilla/plugins/), but that shouldn't be the problem.
After almost a week, thunderbird now has 294 threads. Unfortunately the debuginfo for that package no longer seems to be on the server, so I am upgrading to a new thunderbird. I'll try to get backtraces from the extraneous threads this afternoon.
It looks like many of the thunderbird threads are stuck in the NSPR code. I wonder if this is a result of the self-signed certificates used on Red Hat's servers? It should be easy to reproduce this by pointing thunderbird + lightning at our internal servers for email and calendar.

(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f350b115390 in PR_WaitCondVar (cvar=0x7f34d9131a40, timeout=4294967295) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:417
#2  0x00007f350b115696 in PR_Wait (mon=0x7f34dd03c4a0, timeout=<optimized out>) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:614
#3  0x00007f350f7c5c75 in Wait (this=0x7f34e23426f8, interval=4294967295) at ../../dist/include/mozilla/ReentrantMonitor.h:121
#4  Wait (interval=4294967295, this=<synthetic pointer>) at ../../dist/include/mozilla/ReentrantMonitor.h:224
#5  nsEventQueue::GetEvent (this=0x7f34e23426f8, mayWait=true, result=0x7f349c5f9de8) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsEventQueue.cpp:83
#6  0x00007f350f7c6cd8 in GetEvent (event=<optimized out>, mayWait=<optimized out>, this=<optimized out>) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsThread.h:113
#7  nsThread::ProcessNextEvent (this=0x7f34e2342690, mayWait=true, result=0x7f349c5f9e3f) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsThread.cpp:641
#8  0x00007f350f79b89f in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=true) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/build/nsThreadUtils.cpp:245
#9  0x00007f350f7c6634 in nsThread::ThreadFunc (arg=0x7f34e2342690) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsThread.cpp:292
#10 0x00007f350b11a753 in _pt_root (arg=0x7f34e8821e10) at ../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:187
#11 0x00007f35126a9d90 in start_thread (arg=0x7f349c5fa700) at pthread_create.c:309
#12 0x00007f3510c50f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
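For what it's worth, timeout=4294967295 in frame #1 is 0xffffffff, i.e. NSPR's PR_INTERVAL_NO_TIMEOUT, so each of these threads is blocked indefinitely waiting for the next event. A rough Python analogy of that event-loop shape (not Thunderbird code, just an illustration) is a worker that blocks on a queue with no timeout and can therefore only exit when it is explicitly sent a shutdown event:

```python
import queue
import threading

def worker(events: "queue.Queue") -> None:
    # Analogous to nsThread::ProcessNextEvent: block forever for the next event.
    while True:
        event = events.get()      # no timeout, like PR_INTERVAL_NO_TIMEOUT
        if event is None:         # only an explicit shutdown event ends the loop
            break
        event()

events = queue.Queue()
t = threading.Thread(target=worker, args=(events,))
t.start()

events.put(lambda: print("handled one event"))
events.put(None)                  # without this, the thread lives forever
t.join()
```

If nothing ever enqueues the shutdown event, such a thread stays alive and parked in the condition-variable wait, which matches the state shown in the backtrace.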
We need a backtrace of all threads; please run 'thread apply all bt full' in gdb.
Created attachment 585303 [details]
thread apply all bt full

Output of "thread apply all bt full" after running thunderbird for a few hours.
Jan, is there any additional info you need to get this bug going?
Well, I don't see anything unusual in the backtrace. There are 34 threads in it, which is quite normal for Thunderbird. Some of them are worker threads; some are waiting for an operation to proceed. If it went over a hundred threads, something would be wrong. Please provide us a backtrace with a lot of threads (for example after a few hours of running).
Created attachment 603844 [details] thread apply all bt full (272 threads)
Looking at these backtraces, I see a fair number of threads stuck in this trace:

PR_WaitCondVar
PR_Wait
Wait
Wait
nsEventQueue::GetEvent
nsThread::ProcessNextEvent
NS_ProcessNextEvent_P
nsThread::ThreadFunc

In other words, there appear to be a lot of idle threads which have not registered themselves as idle with the thread pool manager, using nsThreadPool::PutEvent or similar... Because of that, the thread pool manager keeps spawning more and more threads instead of reusing the available ones. This is the first time I am looking at the Thunderbird code base, so I have no idea how to fix this (or whether my analysis is correct; I may have overlooked something).
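To make the suspected failure mode concrete, here is a hypothetical sketch (not the actual XPCOM code; all names are invented for illustration) of a dispatcher whose workers never report themselves back as idle. Since the idle count stays at zero, every submission spawns a fresh thread, and finished workers park forever on the event queue:

```python
import queue
import threading

class LeakyPool:
    """Sketch of the suspected bug: workers that finish a task but never
    register themselves as idle, so dispatch always spawns a new thread."""

    def __init__(self):
        self.events = queue.Queue()
        self.idle = 0                 # count of threads waiting for work
        self.threads = []
        self.lock = threading.Lock()

    def put_event(self, fn):
        with self.lock:
            if self.idle == 0:        # no idle worker registered -> spawn one
                t = threading.Thread(target=self._worker, daemon=True)
                self.threads.append(t)
                t.start()
        self.events.put(fn)

    def _worker(self):
        while True:
            fn = self.events.get()    # park here between tasks, forever
            fn()
            # Bug analogue: `self.idle += 1` is missing here, so every
            # subsequent put_event() believes no worker is available.
```

Submitting five trivial tasks to this pool creates five threads instead of reusing one, mirroring how the real pool's thread count would climb with every batch of work.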
Thanks for the backtrace. After some investigation I've filed an upstream bug here: https://bugzilla.mozilla.org/show_bug.cgi?id=782645
A question about the upstream bug you filed: why do you think these are mozStorage threads? I see nothing in the backtrace that points at storage...
Hmm, the trunk version of Thunderbird now uses named threads; with 'info threads' in gdb one can see the purpose of each thread. I've found a couple of mozStorage threads which have the same backtrace as the most frequent thread in your case. These threads should IMO already be finished (according to the source code), but there are still some references to them, so they keep running. I hope Mozilla tells me if I'm wrong about finalized mozStorage threads still surviving. It could still be something else, because nsEventQueue::GetEvent is quite general. Anyway, besides lightning, do you have any other addon installed, multiple accounts, global indexing enabled/disabled? What about a fresh profile, is it still the same? I'm trying to reproduce on my own machine but still no luck.
The only installed addon is lightning. I have two accounts enabled. I see no option for global indexing... I am seeing this issue on two systems, which were configured separately from each other.
This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Ditto with recent Fedora 18, i.e. with thunderbird-17.0.5-1.fc18.x86_64.
This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life. Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.