Red Hat Bugzilla – Bug 204906
gam_server causes an incredibly high number of context switches/sec
Last modified: 2007-11-30 17:11:41 EST
Description of problem:
We're working with RH engineers on a tickless idle kernel, which has the goal of
reducing power and hypervisor loads when the system is idle. This is done by
removing the regular ticking clock when the system is idle, so that in theory
long sleep periods are possible for the hardware (or hypervisor). The kernel
portion of this works great, however when using a gnome desktop there are many
timers going off all the time for userspace, so many that the actual savings are
not so great (about 250 events per second).
By far the biggest offenders of this is gam_server. This daemon seems to have a
30 Hz timer and several 1 Hz timers. While the 30Hz timer is the most serious
problem obviously, even the 1Hz timers form a system issue (since many processes
do 1Hz timers, the aggregate of these is still significant).
On first sight the 30Hz timer has no need to fire continuously and could be made
to fire only when actual file modifications are happening, which would solve the
biggest issue that we currently have with gam_server.
You can check the effectiveness via two methods:
and check the "cs" column. On an idle runlevel 3 this should be between 10 and
20, on an idle (fixed) gnome system this is about 35 to 40 for me. (I hacked the
30Hz timer to a 1Hz timer to get there, as well as about a dozen other fixes to
gnome that are in various bugzillas)
strace -p <pid of fam_server>
the strace output should
1) not have more than poll line per second on idle
2) show only poll delays (last parameter to the call) greater than 1000
I don't really know the gamin design all that well. It was mainly written by
daniel veillard. Daniel, do you know why it needs a 30Hz timer?
Honnestly I don't know ! The only timer I really need and I think I register
is one every second to check for file being watched and which didn't exist.
Well I wrote the code for dnotify, I just scanned the sources and found:
gam_inotify.c:#define SCAN_MISSING_TIME 1000 /* 1 Hz */
gam_inotify.c:#define SCAN_LINKS_TIME 1000 /* 1 Hz */
gam_inotify.c:#define PROCESS_EVENTS_TIME 33 /* 30 Hz */
Novell contributed the inotify back-end, no idea why the need a 30Hz timer
I suggest asking Robert Love or John McCutchan
paphio:~/gamin/server -> grep g_timeout_add *.c
gam_connection.c: ret->eq_source = g_timeout_add (100 /* ms */,
gam_connection.c: * This function can be called periodically by e.g.
gam_inotify.c: g_timeout_add (SCAN_MISSING_TIME, gam_inotify_scan_missing, NULL);
gam_inotify.c: g_timeout_add (SCAN_LINKS_TIME, gam_inotify_scan_links,
NULL);gam_inotify.c: g_timeout_add (PROCESS_EVENTS_TIME,
gam_kqueue.c: poller->timeout_id = g_timeout_add(poller->interval,
(GSourceFunc) gam_kqueue_poller_timeout_cb, poller);
gam_poll_basic.c: g_timeout_add(1000, gam_poll_basic_scan_callback,
gam_server.c: g_timeout_add(1000, (GSourceFunc) gam_connections_check, NULL);
gam_server.c: g_timeout_add(1000, (GSourceFunc) gam_error_check, NULL);
my dnotify back-end only use 1 second timers on normal processing,
the kqueue.c one should affect only BSD's, I guess you hit the inotify
back-end code, and that I have no clue about, sorry !
Hmmm. That stuff was fixed by john in the inotify support in gnome-vfs.
Alex, are you looking at doing similar fixes for gam_server, or have you
asked McClutchan to look into it ?
I pinged John about it.
Created attachment 135450 [details]
Here is a patch that does basically the same thing that was done in gnome-vfs.
Can you try this one, Arjan ?
I also note that gnome-vfs has the missing timeout with 1/4 Hz instead of 1 Hz
Ok this part works for me; however there is a second timer, a 10 Hz one which
also spoils the fun somewhat; it's the gam_connection_eq_flush timer.
(I'd like to leave the missing timeout to be 1 second so that we can switch it
to the g_timeout_add_seconds() api later)
If the gam_connection_eq_flush timer could be 1 Hz instead I'd be a lot happier,
that way it could also use g_timeout_add_seconds()
with the patch, and the gam_connection_eq_flush timer set to 1Hz, all timers are
in seconds, and when I switched gam_server to the g_timeout_add_seconds() API,
there is now exactly one wakeup per second. That'll have to do :)
Created attachment 135463 [details]
Patch to make the 100msec timer an on-demand timer as well
The 10Hz timer can undergo the same treatment as your other patch, just arm it
on demand when there's actual work and stop firing when the queues are empty
Question for Daniel: why is gam_connections_check a 1Hz timer? Can it be a
1/10th Hz timer instead without any bad behavior?
Created attachment 135464 [details]
patch to kill another useless timer
The inotify backend called the initialization of the basic poll backend (which
sets up a 1Hz timer) but then overwrites all methods that could call this
backend, so this 1Hz timer is utterly pointless.
It looks like a bug in the inotify initialization, it appears they ment to call
gam_poll_generic_init not gam_poll_basic_init....
I have ported the gnome-vfs inotify backend over to gamin and will be committing
it to CVS early this week. This should help immensely.
Also compare older patches in
Patches attached to Comment #1 and #4 of
http://bugzilla.gnome.org/show_bug.cgi?id=348974 are not needed after I merge
the new backend. But the rest are applicable to gamin core and still usefull.
I just committed the ported backend. That should solve the inotify side of the
basically the delay for the 1s timer is the time it will take to discover that
a missing watched file was created. On very specific use case, a 10s timer might
just fine. Potentially that could be a setting in the config file, and then set
up in the global config file.
For patch 1 from #9, I also had to change some include files which were missing
from the patch.
Unfortunately I'm completely unable to test this version, all regression tests
seems to be broken if I run "make tests", still I commited it on top of John's
thanks for the patch in #14 and #15. However it seems to have affected
the regression tests, even the basic non-python ones seems to fail completely
now. Did you tried "make tests" before commiting ? Because here, even the
most basic ones fail immedistely, maybe I would need to restart with the new
version on the desktop to really test it, but I'm on the road with just
the laptop and testing is a bit hard for me right now...
Daniel I ran most of the test suite on my system before committing. All the
non-dnotify tests work fine on my system. Just to be explicit the following passed:
1-11.tst in the non-python test suite
dnotify*.py and flood*.py rely on the dnotify code so I can't test them.
Created attachment 135533 [details]
Fix for cvs version
The current version in cvs doesn't give me any events at all without this
patch. With this patch some of the tests pass, but some don't.
+ gam_server_install_kernel_hooks (GAMIN_K_INOTIFY,
That should probably be GAMIN_K_INOTIFY2.
Ok, i just built gamin-0.1.7-5.fc6 in rawhide. It has the new inotify backend
from cvs (with the fix in comment #19 and #20) and the 100msec timer an
on-demand timer from comment #9 (with fixed header).
It seems to work fine, and only wakes up once a second. (Although it still
doesn't pass all tests in the testcase, haven't looked into why.)
I consider this fixed for now.
Alex, if you could just commit the extra patches to CVS, that would be great,
I will try to make a new release when I'm back in a couple of weeks,
0.1.7-6 has even less timers, now only the 4 sec one.
I commited the patches upstream too.
in rawhide it's back to once per second... can we go back to once per 4 seconds?
Looking at the current gam_server code,
the inotify backend initializes the basic poll backend, and
that installs a 1Hz timout that is always running, even if there is
nothing to do (gam_poll_basic_scan_callback). That should be triggered
Matthias is correct, related to this commit:
2006-11-20 Alexander Larsson <email@example.com>
* server/gam_inotify.c: (gam_inotify_init):
Enable the basic polling code for the inotify
backend so e.g. polling on NFS works.
Created attachment 147470 [details]
Patch is untested
Created attachment 149440 [details]
I think this approach is nicer.
* Wed Mar 7 2007 Alexander Larsson <firstname.lastname@example.org> - 0.1.8-4
- Add patch to fix #204906