Created attachment 949511 [details]
Sample 1 of huge number of threads

For at least a year I have had several problems with evolution whenever I do something that needs interaction with the evolution-data-server. The problem does not happen all the time, but it looks like it occurs more often when the available network bandwidth is low.

My current environment: I have configured 3 Google e-mail accounts via GOA, including contacts and calendars.

These are the scenarios in which I faced the problem:
1. When I reply to a mail and it looks like the evolution server does a lookup of the contacts;
2. When I create a new mail and it looks like the evolution server does a lookup of the contacts;
3. When I accept an appointment via the mail component;
4. When I create and save an appointment in the calendar component;
5. When I change an appointment in the calendar component.

Until today the evolution-data-server sometimes crashed, as in bug https://bugzilla.redhat.com/show_bug.cgi?id=1051770

After some investigation I realized that the crashes were caused by the limit on allowable threads, see:
https://bugzilla.gnome.org/show_bug.cgi?id=731554 and https://bugzilla.redhat.com/show_bug.cgi?id=432903

I changed the thread limit following this guide: http://rudametw.github.io/blog/posts/2014.04.10/not-enough-threads.html and set the limit to 8K.

The following occurred when I changed an appointment while the bandwidth was limited:
- The evolution application froze for 5 minutes.
- CPU utilization of the evolution-data-server became very high, more than 100%.
- The count of active threads increased to 250. Most of the threads were created by the evolution-data-server.

So I think that many bugs are related to thread creation by the evolution-data-server when the bandwidth is poor, in the case of Google accounts and maybe also other external calendar and contacts accounts.

Version-Release number of selected component (if applicable):

How reproducible:
I cannot guarantee that this also happens anywhere else.

Steps to Reproduce:
1. Add more than one Google account to evolution.
2. Reduce the available bandwidth.
3. Change an appointment in the evolution calendar component.

Actual results:
- The evolution application freezes for more than 5 minutes.
- CPU utilization of the evolution-data-server becomes very high, more than 100%.
- The count of active threads increases to 250. Most of the threads are created by the evolution-data-server, see attachment.
- After 5 minutes evolution is available again and the CPU utilization drops, as does the number of active threads.

Expected results:
The appointment is saved and synchronized with the Google calendar.

Additional info:
Created attachment 949515 [details] Another sample of active threads
Adding the relevant component versions as well:
evolution-3.10.4-4.fc20.x86_64
evolution-data-server-devel-3.10.4-6.fc20.x86_64
evolution-data-server-3.10.4-6.fc20.x86_64
Created attachment 949532 [details]
A scenario with more than 2400 threads

After saving a calendar item evolution freezes again and the thread count increases to more than 2400 threads.
Created attachment 949533 [details] Total of threads increased to more than 4550
Created attachment 949534 [details] Extract from the journalctl with evolution errors
Thanks for the bug report. These thread lists are not that useful; they do not show what the application does in those threads. As you mentioned "Reduced available bandwidth", I guess that they are trying to resolve the address of a Google server, but that's only a guess. The list of critical warnings is interesting; they surely should not be there.

I think you face something similar to bug #1148247. There were other bug reports upstream [1] where users claimed that setting up the GMail accounts directly in evolution, instead of in GOA, helped them with some false password prompts. I do not know whether this issue belongs in the same basket; maybe it doesn't.

Could you get a backtrace of the running misbehaving process, please? You can get the backtrace with a command like this:
   $ gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt

Please check bt.txt for any private information, like passwords, email addresses, server addresses,... I usually search for "pass" at least (quotes for clarity only). Also make sure that you have the debuginfo package for evolution-data-server installed, in the same version as the binary package. That may show what the threads are doing (or waiting for).

[1] https://bugzilla.gnome.org/show_bug.cgi?id=728496
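The check for private information can be partly automated. The following is only a sketch: the helper name scan_private and the extra patterns ("@" for email addresses, "://" for server URLs) are assumptions added for illustration, on top of the "pass" search mentioned above. It does not replace reading the file before attaching it.

```shell
# Hypothetical helper: list lines of a backtrace file that may hold private data.
# Usage: scan_private FILE
scan_private() {
    # -n prints line numbers, -i ignores case, -E enables the alternation.
    # "pass" comes from the advice above; "@" and "://" are assumed extras.
    grep -n -i -E 'pass|@|://' "$1"
}
```

Running `scan_private bt.txt` prints every suspicious line with its line number; grep exits with a non-zero status when nothing matches.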
(In reply to Milan Crha from comment #6)
> $ gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt

Oops, of course replace 'pidof evolution' in the above command with the process ID of the right running executable.

(In reply to Bart Ratgers from comment #0)
> 2. Reduce the available bandwidth.

I forgot to ask, how do you do that? I'd like to be able to reproduce it here, so that I can investigate the root cause.
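To avoid attaching to the wrong process, the command can be wrapped in a small helper that takes the process ID explicitly. This is a sketch under assumptions: the name dump_bt is hypothetical, and the redirection is written in the portable form instead of the bash-only "&>".

```shell
# Hypothetical helper: dump backtraces of all threads of one process to a file.
# Usage: dump_bt PID OUTFILE
dump_bt() {
    pid=$1
    out=$2
    # Refuse to run when PID does not name a live process we may signal.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # "thread apply all bt" is the long form of "t a a bt".
    gdb --batch --ex "thread apply all bt" --pid="$pid" >"$out" 2>&1
}
```

For example, `dump_bt 4486 bt-evolution-calendar.txt` would write the backtrace of process 4486, where the PID is taken from top or ps for the process that is actually misbehaving.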
What I actually did was update the agenda while a huge Dropbox synchronization was running in the background. The Dropbox client consumed almost the complete bandwidth uploading files from my laptop to the Dropbox server. I think you can reproduce this by starting any upload that consumes the full bandwidth. I will also try to reproduce it and capture a backtrace.
(In reply to Milan Crha from comment #6)
> Thanks for a bug report. These thread lists are not that useful, they do not
> show what the application does in those threads.

I agree that the list doesn't show what the program actually does, but it does show that the number of threads is growing enormously.
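The growth in thread count can also be logged over time instead of captured as one-off lists. The sketch below assumes a Linux /proc filesystem; the helper names thread_count and watch_threads are hypothetical.

```shell
# Hypothetical helper: print the current number of threads of one process.
# Usage: thread_count PID
thread_count() {
    # The "Threads:" field of /proc/PID/status holds the thread count.
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Hypothetical helper: print "time count" for PID every INTERVAL seconds,
# at most COUNT times, stopping early if the process goes away.
# Usage: watch_threads PID INTERVAL COUNT
watch_threads() {
    pid=$1
    interval=$2
    n=$3
    while [ "$n" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        printf '%s %s\n' "$(date +%T)" "$(thread_count "$pid")"
        n=$((n - 1))
        if [ "$n" -gt 0 ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `watch_threads 4486 5 60` would sample process 4486 every 5 seconds for about 5 minutes, which covers the length of the freezes described in this report.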
Okay. I again started a huge upload that consumes almost the complete bandwidth, and played a little bit by adding and deleting appointments. After some time evolution froze and a segfault occurred, see journalctl-evolution-1.log

I started evolution again and played a little bit with appointments again. Evolution froze again, and now the following happens:

Timestamp 1:
=================================================================================
Evolution freezes and the CPU load of evolution grows, see (top):

  PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
 8332 bart 20  0 4248320 667760 83900 R 107,7  8,3 6:21.32 evolution
 4486 bart 20  0 2710800 231424 20144 S  26,6  2,9 2:14.03 evolution-calen

I create a backtrace of the evolution process with:
gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-1.txt

Timestamp 2:
=================================================================================
Evolution is still frozen and the load of the evolution-calendar server increases, as does the number of threads: > 500

  PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
 4486 bart 20  0 3114880 263368 20144 S 109,9  3,3 3:24.25 evolution-calen
 8332 bart 20  0 4315920 689464 83904 R 104,3  8,5 8:28.94 evolution

I create the following backtraces:
gdb --batch --ex "t a a bt" -pid=4486 &>bt-evolution-calendar-2.txt
gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-2.txt

Timestamp 3:
=================================================================================
Evolution is still frozen and the number of threads is still increasing: > 1500

I create the following backtraces:
gdb --batch --ex "t a a bt" -pid=4486 &>bt-evolution-calendar-3.txt
gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-3.txt

I kill the evolution process.
Created attachment 949869 [details]
bt-evolution-1.txt --> Timestamp 1

The result of: gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-1.txt
Created attachment 949870 [details]
bt-evolution-2.txt --> Timestamp 2

The result of: gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-2.txt
Created attachment 949873 [details]
bt-evolution-calendar-2.txt --> Timestamp 2

The result of: gdb --batch --ex "t a a bt" -pid=4486 &>bt-evolution-calendar-2.txt
Created attachment 949874 [details]
bt-evolution-3.txt --> Timestamp 3

The result of: gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-3.txt
Created attachment 949875 [details]
bt-evolution-calendar-3.txt --> Timestamp 3

The result of: gdb --batch --ex "t a a bt" -pid=4486 &>bt-evolution-calendar-3.txt
Just another temporary freeze (approximately 2 minutes) while opening a new e-mail window, after pushing the 'new E-mail' button:

  PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
  543 bart 20  0 4143736 286312 80276 R 112,5  3,5 1:52.94 evolution
 4486 bart 20  0 3090896 238912 20028 S 106,2  3,0 8:29.33 evolution-calen

Backtrace:
=================================================================================
gdb --batch --ex "t a a bt" -pid=4486 &>bt-evolution-calendar-4.txt
gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-4.txt
Created attachment 949948 [details] bt-evolution-4.txt --> Timestamp 4 gdb --batch --ex "t a a bt" -pid=8332 &>bt-evolution-4.txt
Created attachment 949954 [details] bt-evolution-calendar-4.txt --> Timestamp 4 gdb --batch --ex "t a a bt" -pid=4486 &>bt-evolution-calendar-4.txt
Thanks for the update. I see in the calendar factory that some CalDAV calendars (I cannot tell from it how many you have) are starting new views, the same as some file backend (On This Computer calendar); that is what all those threads are about. How many calendars of each type do you have configured, please? It can partly influence the behaviour.

The evolution backtrace doesn't show why it began the flood, but I'm sure this is fixed for 3.14.0 (which will be released in the spring of next year). The change cannot be "backported" into 3.10 or even 3.12, because it was a huge change in the way of dealing with calendars in the Calendar view(s, including Memos and Tasks).
Hello Milan,

Thank you for the quick response. I'm glad to see that a possible fix will be released next year. For your information, I have 4 Google calendars configured. All of those calendars use the same Google account.

Is there a way that I can investigate the origin of the flood?
(In reply to Bart Ratgers from comment #20)
> Is there a way that I can investigate the origin of the flood?

I think it's caused by changes in the left tree of the Calendar view, like when you enable/disable a calendar. The current code basically stops all views and starts new ones for all enabled calendars. This was causing unnecessary workload on the backend and client side and is fixed in the current development version. As I cannot backport the whole change, I will try to fix at least this particular part, to limit the workload on both sides, for 3.12.x (Fedora 21).

Thinking of it, it is better to deal with it upstream, for better visibility. Please see [1] for any further updates.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=739106
Just as a follow-up, I checked the 3.12.7 code and tried to reproduce it, but had no luck there, so I consider this [2] to be fixed in 3.12.x.

[2] I mean the issue with full rebuilds of all calendar views when anything changes in the source selector; that is not what is happening in 3.12.7.