Red Hat Bugzilla – Full Text Bug Listing
|Summary:||gam_server causes an incredibly high number of context switches/sec|
|Product:||[Fedora] Fedora||Reporter:||Arjan van de Ven <arjan>|
|Component:||gamin||Assignee:||Alexander Larsson <alexl>|
|Status:||CLOSED RAWHIDE||QA Contact:|
|Version:||rawhide||CC:||dwmw2, john, mingo, veillard|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2007-03-07 06:12:14 EST||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Bug Depends On:|
Description Arjan van de Ven 2006-09-01 05:01:24 EDT
Description of problem: We're working with RH engineers on a tickless idle kernel, which has the goal of reducing power and hypervisor loads when the system is idle. This is done by removing the regular ticking clock when the system is idle, so that in theory long sleep periods are possible for the hardware (or hypervisor). The kernel portion of this works great, however when using a gnome desktop there are many timers going off all the time for userspace, so many that the actual savings are not so great (about 250 events per second). By far the biggest offenders of this is gam_server. This daemon seems to have a 30 Hz timer and several 1 Hz timers. While the 30Hz timer is the most serious problem obviously, even the 1Hz timers form a system issue (since many processes do 1Hz timers, the aggregate of these is still significant). On first sight the 30Hz timer has no need to fire continuously and could be made to fire only when actual file modifications are happening, which would solve the biggest issue that we currently have with gam_server. You can check the effectiveness via two methods: 1) vmstat 10 and check the "cs" column. On an idle runlevel 3 this should be between 10 and 20, on an idle (fixed) gnome system this is about 35 to 40 for me. (I hacked the 30Hz timer to a 1Hz timer to get there, as well as about a dozen other fixes to gnome that are in various bugzillas) 2) strace -p <pid of fam_server> the strace output should 1) not have more than poll line per second on idle 2) show only poll delays (last parameter to the call) greater than 1000
Comment 1 Alexander Larsson 2006-09-01 09:03:39 EDT
I don't really know the gamin design all that well. It was mainly written by daniel veillard. Daniel, do you know why it needs a 30Hz timer?
Comment 2 Daniel Veillard 2006-09-01 09:38:10 EDT
Honnestly I don't know ! The only timer I really need and I think I register is one every second to check for file being watched and which didn't exist. Well I wrote the code for dnotify, I just scanned the sources and found: gam_inotify.c:#define SCAN_MISSING_TIME 1000 /* 1 Hz */ gam_inotify.c:#define SCAN_LINKS_TIME 1000 /* 1 Hz */ gam_inotify.c:#define PROCESS_EVENTS_TIME 33 /* 30 Hz */ Novell contributed the inotify back-end, no idea why the need a 30Hz timer I suggest asking Robert Love or John McCutchan paphio:~/gamin/server -> grep g_timeout_add *.c gam_connection.c: ret->eq_source = g_timeout_add (100 /* ms */, gam_connection_eq_flush, ret); gam_connection.c: * This function can be called periodically by e.g. g_timeout_add and gam_inotify.c: g_timeout_add (SCAN_MISSING_TIME, gam_inotify_scan_missing, NULL); gam_inotify.c: g_timeout_add (SCAN_LINKS_TIME, gam_inotify_scan_links, NULL);gam_inotify.c: g_timeout_add (PROCESS_EVENTS_TIME, gam_inotify_process_event_queue, NULL); gam_kqueue.c: poller->timeout_id = g_timeout_add(poller->interval, (GSourceFunc) gam_kqueue_poller_timeout_cb, poller); gam_poll_basic.c: g_timeout_add(1000, gam_poll_basic_scan_callback, NULL);gam_poll_dnotify.c: g_timeout_add(1000, gam_poll_dnotify_scan_callback, NULL); gam_server.c: g_timeout_add(1000, (GSourceFunc) gam_connections_check, NULL); gam_server.c: g_timeout_add(1000, (GSourceFunc) gam_error_check, NULL); paphio:~/gamin/server -> my dnotify back-end only use 1 second timers on normal processing, the kqueue.c one should affect only BSD's, I guess you hit the inotify back-end code, and that I have no clue about, sorry ! Daniel
Comment 3 Alexander Larsson 2006-09-01 10:04:04 EDT
Hmmm. That stuff was fixed by john in the inotify support in gnome-vfs.
Comment 4 Matthias Clasen 2006-09-01 12:39:23 EDT
Alex, are you looking at doing similar fixes for gam_server, or have you asked McClutchan to look into it ?
Comment 5 Alexander Larsson 2006-09-02 03:03:31 EDT
I pinged John about it.
Comment 6 Matthias Clasen 2006-09-03 00:24:58 EDT
Created attachment 135450 [details] a patch Here is a patch that does basically the same thing that was done in gnome-vfs. Can you try this one, Arjan ? I also note that gnome-vfs has the missing timeout with 1/4 Hz instead of 1 Hz
Comment 7 Arjan van de Ven 2006-09-03 03:20:11 EDT
Ok this part works for me; however there is a second timer, a 10 Hz one which also spoils the fun somewhat; it's the gam_connection_eq_flush timer. (I'd like to leave the missing timeout to be 1 second so that we can switch it to the g_timeout_add_seconds() api later) If the gam_connection_eq_flush timer could be 1 Hz instead I'd be a lot happier, that way it could also use g_timeout_add_seconds()
Comment 8 Arjan van de Ven 2006-09-03 03:23:53 EDT
with the patch, and the gam_connection_eq_flush timer set to 1Hz, all timers are in seconds, and when I switched gam_server to the g_timeout_add_seconds() API, there is now exactly one wakeup per second. That'll have to do :)
Comment 9 Arjan van de Ven 2006-09-03 13:49:52 EDT
Created attachment 135463 [details] Patch to make the 100msec timer an on-demand timer as well The 10Hz timer can undergo the same treatment as your other patch, just arm it on demand when there's actual work and stop firing when the queues are empty
Comment 10 Arjan van de Ven 2006-09-03 13:51:07 EDT
Question for Daniel: why is gam_connections_check a 1Hz timer? Can it be a 1/10th Hz timer instead without any bad behavior?
Comment 11 Arjan van de Ven 2006-09-03 14:04:37 EDT
Created attachment 135464 [details] patch to kill another useless timer The inotify backend called the initialization of the basic poll backend (which sets up a 1Hz timer) but then overwrites all methods that could call this backend, so this 1Hz timer is utterly pointless. It looks like a bug in the inotify initialization, it appears they ment to call gam_poll_generic_init not gam_poll_basic_init....
Comment 12 John McCutchan 2006-09-03 23:56:11 EDT
I have ported the gnome-vfs inotify backend over to gamin and will be committing it to CVS early this week. This should help immensely.
Comment 13 Matthias Clasen 2006-09-04 00:09:27 EDT
Also compare older patches in http://bugzilla.gnome.org/show_bug.cgi?id=348974
Comment 14 John McCutchan 2006-09-04 11:33:26 EDT
Patches attached to Comment #1 and #4 of http://bugzilla.gnome.org/show_bug.cgi?id=348974 are not needed after I merge the new backend. But the rest are applicable to gamin core and still usefull.
Comment 15 John McCutchan 2006-09-04 20:55:06 EDT
I just committed the ported backend. That should solve the inotify side of the problem.
Comment 16 Daniel Veillard 2006-09-04 22:38:07 EDT
Hi Arjan, basically the delay for the 1s timer is the time it will take to discover that a missing watched file was created. On very specific use case, a 10s timer might just fine. Potentially that could be a setting in the config file, and then set up in the global config file. For patch 1 from #9, I also had to change some include files which were missing from the patch. Unfortunately I'm completely unable to test this version, all regression tests seems to be broken if I run "make tests", still I commited it on top of John's patch. Daniel
Comment 17 Daniel Veillard 2006-09-04 22:44:06 EDT
Hi John, thanks for the patch in #14 and #15. However it seems to have affected the regression tests, even the basic non-python ones seems to fail completely now. Did you tried "make tests" before commiting ? Because here, even the most basic ones fail immedistely, maybe I would need to restart with the new version on the desktop to really test it, but I'm on the road with just the laptop and testing is a bit hard for me right now... Daniel
Comment 18 John McCutchan 2006-09-04 23:02:27 EDT
Daniel I ran most of the test suite on my system before committing. All the non-dnotify tests work fine on my system. Just to be explicit the following passed: basic.py basic2.py basic3.py basic4.py basic5.py basic6.py bigfile.py level.py multiple.py multiple2.py multiple3.py noexists.py 1-11.tst in the non-python test suite dnotify*.py and flood*.py rely on the dnotify code so I can't test them.
Comment 19 Alexander Larsson 2006-09-05 06:16:46 EDT
Created attachment 135533 [details] Fix for cvs version The current version in cvs doesn't give me any events at all without this patch. With this patch some of the tests pass, but some don't.
Comment 20 Alexander Larsson 2006-09-05 06:34:50 EDT
+ gam_server_install_kernel_hooks (GAMIN_K_INOTIFY, That should probably be GAMIN_K_INOTIFY2.
Comment 21 Alexander Larsson 2006-09-05 06:57:20 EDT
Ok, i just built gamin-0.1.7-5.fc6 in rawhide. It has the new inotify backend from cvs (with the fix in comment #19 and #20) and the 100msec timer an on-demand timer from comment #9 (with fixed header). It seems to work fine, and only wakes up once a second. (Although it still doesn't pass all tests in the testcase, haven't looked into why.) I consider this fixed for now.
Comment 22 Daniel Veillard 2006-09-05 09:47:48 EDT
Alex, if you could just commit the extra patches to CVS, that would be great, I will try to make a new release when I'm back in a couple of weeks, thanks ! Daniel
Comment 23 Alexander Larsson 2006-09-05 10:14:22 EDT
0.1.7-6 has even less timers, now only the 4 sec one.
Comment 24 Alexander Larsson 2006-09-05 10:29:21 EDT
I commited the patches upstream too.
Comment 25 Arjan van de Ven 2007-02-06 08:37:53 EST
in rawhide it's back to once per second... can we go back to once per 4 seconds?
Comment 26 Matthias Clasen 2007-02-06 09:55:21 EST
Looking at the current gam_server code, the inotify backend initializes the basic poll backend, and that installs a 1Hz timout that is always running, even if there is nothing to do (gam_poll_basic_scan_callback). That should be triggered by gam_poll_basic_add_subscription().
Comment 27 John McCutchan 2007-02-06 10:12:58 EST
Matthias is correct, related to this commit: 2006-11-20 Alexander Larsson <email@example.com> * server/gam_inotify.c: (gam_inotify_init): Enable the basic polling code for the inotify backend so e.g. polling on NFS works.
Comment 28 John McCutchan 2007-02-06 10:14:57 EST
Created attachment 147470 [details] gam_poll_basic_conditional_callback.diff Patch is untested
Comment 29 Alexander Larsson 2007-03-07 05:54:51 EST
Created attachment 149440 [details] Nicer version I think this approach is nicer.