Description of problem:
The number of child processes that Thunderbird has grows over time, until the user reaches the maximum number of processes and no new processes can be forked by the user at all. This prevents the user from unlocking the desktop or logging in over ssh.

Version-Release number of selected component (if applicable):
thunderbird-10.0.1-2.fc16.x86_64
thunderbird-lightning-1.2.1-1.fc16.x86_64

Steps to Reproduce:
1. Start thunderbird.
2. Read and send email.
3. Watch the number of child processes increase over time.

Expected results:
Thunderbird's child processes should exit once they have stopped doing useful work, leaving no more than a few alive at a time.
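For reference, the growth described above can be watched with a small script. This is only an illustrative sketch, assuming Linux: it counts the entries under /proc/&lt;pid&gt;/task, where each thread (LWP) of Thunderbird appears; some tools report these LWPs as separate processes, which is why the count looks like "child processes".

```python
import os
import time

def thread_count(pid: int) -> int:
    """Count the threads (LWPs) of a process via /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))

def watch(pid: int, interval: float = 60.0) -> None:
    """Print the thread count periodically so growth over time is visible."""
    while True:
        print(time.strftime("%H:%M:%S"), thread_count(pid))
        time.sleep(interval)
```

Calling `watch()` with Thunderbird's pid (found e.g. via pidof, not included here) would log one sample per minute.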
Do you encounter this problem with latest thunderbird-11.0.1 ?
Yes, the issue is still happening with the latest packages:
thunderbird-11.0.1-1.fc16.x86_64
thunderbird-lightning-1.3-3.fc16.x86_64
Please look at the debugging instructions for an application freeze and attach a stack trace: https://fedoraproject.org/wiki/Debugging_guidelines_for_Mozilla_products#Application_freeze Also check which addons and plugins aside from lightning you have installed. If there are any, follow: https://fedoraproject.org/wiki/Debugging_guidelines_for_Mozilla_products#Reporting_addons_and_plugins_issues Thanks.
Jan, the application does not hang; it simply keeps forking child processes until the system runs out of processes that can be forked, which leaves the entire system stuck! I have one extension enabled in Thunderbird (lightning) and one plugin (GNOME Shell integration). I have disabled all the other extensions and plugins.
By the way, "Test Pilot for Thunderbird" was enabled before, as were a bunch of web plugins (which I never enabled! Does Thunderbird automatically pick those up from Mozilla?). I have disabled all of those now and will let you know if the problem happens again. If it goes away, Thunderbird needs to be fixed to not automatically enable all the Firefox plugins for itself.
OK, with only lightning and gnome shell integration still enabled, thunderbird is still leaking threads. After running for just a few hours, reading emails and having lightning fetch calendar info, the number of thunderbird threads has reached 35. This is a fairly typical number and I expect it to increase over time (two hours ago there were 26 threads).
What I meant is to take a stack trace when Thunderbird has that many threads. This should show us which threads are running and what started them. The steps are similar to the 'Application freeze' steps, which is why I mentioned them. Sorry for the confusion. Test Pilot is okay; Mozilla ships it too. Plugins are shared with Firefox (e.g. /usr/lib64/mozilla/plugins/), but that shouldn't be the problem.
After almost a week, thunderbird now has 294 threads. Unfortunately the debuginfo for that package no longer seems to be on the server, so I am upgrading to a new thunderbird. I'll try to get backtraces from the extraneous threads this afternoon.
It looks like many of the thunderbird threads are stuck in the NSPR code. I wonder if this is a result of the self-signed certificates used on Red Hat's servers? It should be easy to reproduce this by pointing thunderbird + lightning at our internal servers for email and calendar.

(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f350b115390 in PR_WaitCondVar (cvar=0x7f34d9131a40, timeout=4294967295) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:417
#2  0x00007f350b115696 in PR_Wait (mon=0x7f34dd03c4a0, timeout=<optimized out>) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:614
#3  0x00007f350f7c5c75 in Wait (this=0x7f34e23426f8, interval=4294967295) at ../../dist/include/mozilla/ReentrantMonitor.h:121
#4  Wait (interval=4294967295, this=<synthetic pointer>) at ../../dist/include/mozilla/ReentrantMonitor.h:224
#5  nsEventQueue::GetEvent (this=0x7f34e23426f8, mayWait=true, result=0x7f349c5f9de8) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsEventQueue.cpp:83
#6  0x00007f350f7c6cd8 in GetEvent (event=<optimized out>, mayWait=<optimized out>, this=<optimized out>) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsThread.h:113
#7  nsThread::ProcessNextEvent (this=0x7f34e2342690, mayWait=true, result=0x7f349c5f9e3f) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsThread.cpp:641
#8  0x00007f350f79b89f in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=true) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/build/nsThreadUtils.cpp:245
#9  0x00007f350f7c6634 in nsThread::ThreadFunc (arg=0x7f34e2342690) at /usr/src/debug/thunderbird-11.0.1/comm-release/mozilla/xpcom/threads/nsThread.cpp:292
#10 0x00007f350b11a753 in _pt_root (arg=0x7f34e8821e10) at ../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:187
#11 0x00007f35126a9d90 in start_thread (arg=0x7f349c5fa700) at pthread_create.c:309
#12 0x00007f3510c50f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
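For what it's worth, timeout=4294967295 in frame #1 is 0xffffffff, i.e. NSPR's PR_INTERVAL_NO_TIMEOUT, so each of these threads is blocked indefinitely waiting for the next event. A rough Python analogy of that event-loop shape (not Thunderbird code, just an illustration) is a worker that blocks on a queue with no timeout and can therefore only exit when it is explicitly sent a shutdown event:

```python
import queue
import threading

def worker(events: "queue.Queue") -> None:
    # Analogous to nsThread::ProcessNextEvent: block forever for the next event.
    while True:
        event = events.get()      # no timeout, like PR_INTERVAL_NO_TIMEOUT
        if event is None:         # only an explicit shutdown event ends the loop
            break
        event()

events = queue.Queue()
t = threading.Thread(target=worker, args=(events,))
t.start()

events.put(lambda: print("handled one event"))
events.put(None)                  # without this, the thread lives forever
t.join()
```

If nothing ever enqueues the shutdown event, such a thread stays alive and parked in the condition-variable wait, which matches the state shown in the backtrace.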
We need a backtrace of all threads; please run 'thread apply all bt full' in gdb.
Created attachment 585303 [details]
thread apply all bt full

Output of "thread apply all bt full" after running thunderbird for a few hours.
Jan, is there any additional info you need to get this bug going?
Well, I don't see anything unusual in the backtrace. There are 34 threads in it, which is quite normal for Thunderbird. Some of them are worker threads; some are waiting for an operation to proceed. If it went over a hundred threads, something would be wrong. Please provide us a backtrace with a lot of threads (for example after a few hours of running).
Created attachment 603844 [details] thread apply all bt full (272 threads)
Looking at these backtraces, I see a fair number of threads stuck in this trace:

PR_WaitCondVar
PR_Wait
Wait
Wait
nsEventQueue::GetEvent
nsThread::ProcessNextEvent
NS_ProcessNextEvent_P
nsThread::ThreadFunc

In other words, there appear to be a lot of idle threads which have not registered themselves as idle with the thread pool manager, using nsThreadPool::PutEvent or similar... Because of that, the thread pool manager keeps spawning more and more threads instead of reusing the available ones. This is the first time I am looking at the Thunderbird code base, so I have no idea how to fix this (or whether my analysis is correct; I may have overlooked something).
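To make the suspected failure mode concrete, here is a hypothetical sketch (not the actual XPCOM code; all names are invented for illustration) of a dispatcher whose workers never report themselves back as idle. Since the idle count stays at zero, every submission spawns a fresh thread, and finished workers park forever on the event queue:

```python
import queue
import threading

class LeakyPool:
    """Sketch of the suspected bug: workers that finish a task but never
    register themselves as idle, so dispatch always spawns a new thread."""

    def __init__(self):
        self.events = queue.Queue()
        self.idle = 0                 # count of threads waiting for work
        self.threads = []
        self.lock = threading.Lock()

    def put_event(self, fn):
        with self.lock:
            if self.idle == 0:        # no idle worker registered -> spawn one
                t = threading.Thread(target=self._worker, daemon=True)
                self.threads.append(t)
                t.start()
        self.events.put(fn)

    def _worker(self):
        while True:
            fn = self.events.get()    # park here between tasks, forever
            fn()
            # Bug analogue: `self.idle += 1` is missing here, so every
            # subsequent put_event() believes no worker is available.
```

Submitting five trivial tasks to this pool creates five threads instead of reusing one, mirroring how the real pool's thread count would climb with every batch of work.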
Thanks for the backtrace. After some investigation I've filed an upstream bug here: https://bugzilla.mozilla.org/show_bug.cgi?id=782645
A question about the upstream bug you filed: why do you think these are mozStorage threads? I see nothing in the backtrace that points at storage...
Hmm, the trunk version of Thunderbird now uses named threads; with 'info threads' in gdb one can see the purpose of each thread. I've found a couple of mozStorage threads which have the same backtrace as the most frequent thread in your case. These threads should IMO already be finished (according to the source code), but there are still some references to them, so they keep running. I hope Mozilla tells me if I'm wrong about finalized mozStorage threads still surviving. It could still be something else, because nsEventQueue::GetEvent is quite general. Anyway, besides lightning, do you have any other addon installed, multiple accounts, global indexing enabled/disabled? What about a fresh profile, is it still the same? I'm trying to reproduce on my own machine but still no luck.
The only installed addon is lightning. I have two accounts enabled. I see no option for global indexing... I am seeing this issue on two systems, which were configured separately from each other.
This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Ditto with recent Fedora 18, i.e. with thunderbird-17.0.5-1.fc18.x86_64.
This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life. Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.