Bug 253831 - fast-user-switch-applet bug, consumes 100% cpu
Summary: fast-user-switch-applet bug, consumes 100% cpu
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: fast-user-switch-applet
Version: 7
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Matthias Clasen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-08-22 09:17 UTC by Christian Mandery
Modified: 2007-11-30 22:12 UTC

Fixed In Version: 2.17.4-5.fc7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-03 21:15:40 UTC
Type: ---
Embargoed:


Attachments
Limit users fetched from getpwent to 200. (543 bytes, text/x-patch)
2007-09-06 09:18 UTC, Christian Mandery
Diff for F7 stable RPM spec to include max users patch. (661 bytes, text/x-patch)
2007-09-06 09:18 UTC, Christian Mandery
New RPM spec diff applied to Fedora 7 RPM (994 bytes, patch)
2007-09-06 09:35 UTC, Christian Mandery
alternative patch (10.00 KB, patch)
2007-09-17 06:14 UTC, Matthias Clasen

Description Christian Mandery 2007-08-22 09:17:57 UTC
Description of problem:
For Gnome users fast-user-switch-applet starts using 100% CPU time immediately
after logging in.

Excerpt from ps aux:
d048868  23820 88.6  0.5  45016 21092 ?        R    11:14   0:09 /usr/libexec/fast-user-switch-applet --oaf-activate-iid=OAFIID:GNOME_FastUserSwitchApplet_Factory --oaf-ior-fd=39

It does not terminate (I've seen some running for hours) and it does not stop
using the whole free CPU time. At the moment the only "fix" is to kill the process.

Version-Release number of selected component (if applicable):
Fedora 7, latest updates, i.e. fast-user-switch-applet-2.17.4-4.fc7

How reproducible:
Log in to the Gnome desktop environment.

Comment 1 Helge Deller 2007-09-03 14:32:11 UTC
I think this is rather a high-priority bug.

We use this server as a terminal server for other users. Our network hosts
more than 40,000 Windows/ADS users, against which we run winbindd so that the
users can log in to the Linux box.
I assume that the gnome-fast-user-switch-applet collects all users from
NIS/winbindd (>= 40,000), then sorts them, and then prepares this list for the
drop-down box. This consumes a lot of network bandwidth, memory and CPU time
(for sorting). In our tests this applet ran for more than 3 hours, until we
just killed it.

Again, I think this bug is critical, since this behaviour makes it impossible
for us to activate the Gnome desktop for our users. That is the reason why KDE
is the default here for now.

There are probably many ways to fix this bug. Maybe just stop collecting users
once there are more than, e.g., 200 users available?

This


Comment 2 Helge Deller 2007-09-03 21:27:20 UTC
Even worse: there seems to be no way to disable loading of the
fast-user-switch-applet, nor is it possible to uninstall this applet without
removing gnome-panel as well:

[root@ls3163 gnome]# rpm -e fast-user-switch-applet-2.17.4-4.fc7
error: Failed dependencies:
        fast-user-switch-applet is needed by (installed) gnome-panel-2.18.3-1.fc7.i386

Is there an easy way for us to disable the loading of this panel applet in the 
meantime ?

Comment 3 Helge Deller 2007-09-05 14:40:06 UTC
an ltrace attached to this process shows millions of:

gtk_menu_attach(0x8a20028, 0x90dfd38, 97, 98, 62) = 1
g_type_check_instance_cast(0x8a20028, 0x8a15c80, 97, 98, 62) = 0x8a20028
gtk_menu_reorder_child(0x8a20028, 0x90dfd38, 6852, 98, 62) = 0
g_type_check_instance_cast(0x94ed938, 0x89c75e0, 6852, 98, 62)

and millions of:

g_utf8_collate(0x8c9d118, 0x8da4c00, 0x8a07db0, 0xbfc7f84f, 0xbfc7f8bc) = -18
g_utf8_collate(0x8d88068, 0x8da4c00, 0x8a07db0, 0xbfc7f84f, 0xbfc7f8bc) = -18
g_utf8_collate(0x8c7f350, 0x8da4c00, 0x8a07db0, 0xbfc7f84f, 0xbfc7f8bc) = -18
g_utf8_collate(0x8d5f740, 0x8da4c00, 0x8a07db0, 0xbfc7f84f, 0xbfc7f8bc) = -6
g_utf8_collate(0x8ce9778, 0x8de7b30, 0x8a07db0, 687, 0) = -10
g_utf8_collate(0x8dd6db8, 0x8ca1180, 0x8a07db0, 687, 0) = -4
g_utf8_collate(0x8dd6db8, 0x8ce9778, 0x8a07db0, 687, 426) = -4
g_utf8_collate(0x8ca1180, 0x8ce9778, 0x8a07db0, 687, 426) = -10
g_utf8_collate(0x8cfeac0, 0x8d50010, 0x8a07db0, 0x614cf7f, 0x99d10f0) = -8
g_utf8_collate(0x8c55cd0, 0x8cbdf10, 0x8a07db0, 0x614cf7f, 0x99d10f0) = -6
g_utf8_collate(0x8c55cd0, 0x8cfeac0, 0x8a07db0, 687, 426) = -9
g_utf8_collate(0x8cbdf10, 0x8cfeac0, 0x8a07db0, 687, 426) = -3
g_utf8_collate(0x8c55cd0, 0x8dd6db8, 0x8a07db0, 156, 418) = -20
g_utf8_collate(0x8cbdf10, 0x8dd6db8, 0x8a07db0, 156, 418) = -14
g_utf8_collate(0x8cfeac0, 0x8dd6db8, 0x8a07db0, 156, 418) = -11
g_utf8_collate(0x8d50010, 0x8dd6db8, 0x8a07db0, 156, 418) = -3
...

HELP !!

Comment 4 Christian Mandery 2007-09-05 16:31:03 UTC
After thinking the problem over again, I think that it is related to sorting
the user list. As stated above, we have a very high number of users (5k - 10k
on the NIS alone).

Fetching them using getent passwd completes in one second, but I suppose
fast-user-switch-applet then sorts them with a comparison sort. Every
comparison sort has to compare pairs of objects to determine their correct
order, which means that fast-user-switch-applet calls g_utf8_collate on two
strings each time it needs to figure out which one comes first.

This would be no problem, even for 10k users, if fast-user-switch-applet
computed a key for each string once using g_utf8_collate_key and sorted those
keys. But calling g_utf8_collate for every string comparison is quite slow,
especially given that most sort algorithms need O(n log(n)) comparisons.

The GLib reference states in the description for g_utf8_collate:
<< When sorting a large number of strings, it will be significantly faster to
obtain collation keys with g_utf8_collate_key() and compare the keys with
strcmp() when sorting instead of sorting the original strings. >>

So may I suggest one of the following:
- Fix fast-user-switch-applet to call g_utf8_collate_key() once per string as
the GLib documentation recommends.
- fast-user-switch-applet should stop fetching new users when it reaches a given
limit (e.g. 100 users). [workaround]
- fast-user-switch-applet should not sort the user list if the user count
exceeds a given number. [workaround]
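
A rough illustration of the first suggestion, written in Python rather than the applet's C. Here locale.strcoll and locale.strxfrm only stand in for g_utf8_collate() and g_utf8_collate_key() (an analogy, not the GLib API), and the user names and the call counters are invented for the example. The sketch counts how often each collation routine runs while sorting the same shuffled list.

```python
import functools
import locale
import random

# Counters for how often each collation routine is invoked.
calls = {"compare": 0, "key": 0}

def counted_collate(a, b):
    """Stand-in for g_utf8_collate(): collates both strings on every call."""
    calls["compare"] += 1
    return locale.strcoll(a, b)

def counted_collate_key(s):
    """Stand-in for g_utf8_collate_key(): transforms a single string."""
    calls["key"] += 1
    return locale.strxfrm(s)

# Hypothetical user names, shuffled with a fixed seed for repeatability.
names = [f"user{i:05d}" for i in range(2000)]
random.Random(0).shuffle(names)

# Variant 1: collate inside the comparison callback (what the ltrace suggests).
by_compare = sorted(names, key=functools.cmp_to_key(counted_collate))

# Variant 2: compute one key per string, then sort on the keys.
by_key = sorted(names, key=counted_collate_key)

print("collate calls when comparing directly:", calls["compare"])
print("collate-key calls when precomputing:  ", calls["key"])
```

The key-based variant calls the transform exactly once per name, while the comparison-based variant calls the collation routine once per comparison, which grows faster than the number of names.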

Comment 5 Matthias Clasen 2007-09-05 16:46:40 UTC
Thanks for that information, that is going to be helpful

Comment 6 Helge Deller 2007-09-05 17:08:01 UTC
I have another option, which would even be our preferred one (beside speeding 
up the sorting):
- turn fast-user-switch-applet to a stand-alone RPM package with a dependency 
on gnome-panel RPM. That way we could remove that applet completely from our 
terminal server, esp. since this applet does not make sense in such a 
multi/mega-user environment.... [workaround]


Comment 7 Matthias Clasen 2007-09-05 17:47:38 UTC
it already is a separate rpm:

rpm -q fast-user-switch-applet

Comment 8 Nils Philippsen 2007-09-05 19:25:46 UTC
(In reply to comment #7)
> it already is a separate rpm:

But gnome-panel depends on it ;-) (possibly because it's in the default setup),
i.e. you can't remove it without resorting to --nodeps which would suck.




Comment 9 Christian Mandery 2007-09-06 09:17:47 UTC
I have now written a workaround that limits the number of users fetched from
getpwent to 200. With this patch, fast-user-switch-applet finishes sorting the
list in less than one second here, and since 200 users should be "enough for
everyone", this could probably be included upstream.
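
The idea of the cap can be sketched as follows. This is not the attached patch: it is a Python illustration in which fake_getpwent is a hypothetical generator standing in for repeated getpwent() calls, and MAX_USERS and fetch_users are names made up for the example.

```python
from itertools import islice

MAX_USERS = 200  # the cap proposed in this comment

def fake_getpwent(total):
    """Hypothetical stand-in for repeated getpwent() calls against NIS/winbind."""
    for uid in range(total):
        yield (f"user{uid:05d}", 1000 + uid)

def fetch_users(entries, limit=MAX_USERS):
    """Consume at most `limit` entries; the remaining ones are never requested."""
    return list(islice(entries, limit))

users = fetch_users(fake_getpwent(40000))
print(len(users))  # 200
```

Because the iterator is abandoned after the limit, the cost of fetching and sorting no longer depends on how many accounts the directory holds. The trade-off is that which 200 users end up in the list depends on enumeration order.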

But as said, this is a workaround. Since fast-user-switch-applet doesn't do the
sorting itself but calls g_slist_sort(), more than one change would be needed.

I think that adding a new field to the user structure that caches the collated
name is the right way. That field (a C string) would be initialized to NULL for
every user and set on the first call of fusa_user_collate (the comparison
function for g_slist_sort()) involving that user, by calling
g_utf8_collate_key. No further calls to g_utf8_collate would be needed.

This should give no change in behavior whatsoever but only a very very slight
increase in memory usage while sorting and speedups for everyone that has more
than two users.
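
A minimal sketch of that lazy cache, in Python instead of C. The User class, its field names, and user_collate are hypothetical; locale.strxfrm again stands in for g_utf8_collate_key, and the key_computations counter exists only to show that each key is built once.

```python
import functools
import locale

class User:
    """Hypothetical user record holding only what the sort needs."""

    def __init__(self, display_name):
        self.display_name = display_name
        self.collate_key = None      # plays the role of the NULL-initialized field
        self.key_computations = 0    # bookkeeping for this example only

    def get_collate_key(self):
        # Compute the key on first use, then reuse the cached value.
        if self.collate_key is None:
            self.collate_key = locale.strxfrm(self.display_name)
            self.key_computations += 1
        return self.collate_key

def user_collate(a, b):
    """Comparison callback in the role of fusa_user_collate()."""
    ka, kb = a.get_collate_key(), b.get_collate_key()
    return (ka > kb) - (ka < kb)

users = [User(n) for n in ["mallory", "alice", "trent", "bob", "carol"]]
users.sort(key=functools.cmp_to_key(user_collate))
print([u.display_name for u in users])
```

The comparison callback keeps its signature, so the sort call does not need to change; only the per-user record grows by one cached string.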

I think I can implement this if you tell me that I'm on the right track and
that you don't like my 200-user limit.

Anyway, I'll attach my 200 users max patch and a diff of the Fedora 7 RPM spec
file, in case you want to include it nevertheless.

Comment 10 Christian Mandery 2007-09-06 09:18:18 UTC
Created attachment 188511 [details]
Limit users fetched from getpwent to 200.

Comment 11 Christian Mandery 2007-09-06 09:18:53 UTC
Created attachment 188521 [details]
Diff for F7 stable RPM spec to include max users patch.

Comment 12 Christian Mandery 2007-09-06 09:35:39 UTC
Created attachment 188551 [details]
New RPM spec diff applied to Fedora 7 RPM

Oops, I forgot to include the changelog in the diff in my first upload.

Comment 13 Matthias Clasen 2007-09-17 06:14:11 UTC
Created attachment 197011 [details]
alternative patch

Here is the patch that I came up with; please let me know if it works for you.
It does two things:

1) the collate-key optimization you mentioned
2) make the applet respect the Include and IncludeAll gdm config keys

this allows you to configure the list of users to be shown in the same way as
you configure the user list on the gdm screen, in gdmsetup.
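
The filtering in point 2 might look roughly like the following Python sketch. The semantics are an assumption inferred from the key names (IncludeAll shows everyone, otherwise only the Include list is shown); the function and variable names are hypothetical, and other gdm settings that may influence the list are left out.

```python
def visible_users(all_users, include_all, include):
    """Return the names the menu should list.

    Assumed semantics: with include_all set, every known user is shown;
    otherwise only names that appear in the include list are shown.
    """
    if include_all:
        return list(all_users)
    wanted = set(include)
    return [name for name in all_users if name in wanted]

# Hypothetical directory with many accounts plus two local ones.
all_users = [f"user{i:05d}" for i in range(40000)] + ["alice", "bob"]

shown = visible_users(all_users, include_all=False, include=["alice", "bob", "ghost"])
print(shown)  # ['alice', 'bob']
```

With a short include list, the number of entries that have to be sorted and turned into menu items stays small regardless of the directory size.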

Comment 14 Christian Mandery 2007-09-17 09:44:33 UTC
Sorry, but your patch does not fix our problem; instead it causes startup
problems with all running Gnome Panel applets: when logging in, seven dialogs
pop up, each saying that a panel applet (e.g. OAFIID:Gnome_SystemTrayApplet)
could not be loaded and asking whether the user wants to remove this applet
from his configuration.

Do I need to upgrade some other package?

Comment 15 Matthias Clasen 2007-09-17 12:52:22 UTC
That's odd; I don't see how a fusa patch could have that effect.
Anyway, I agree that the panel -> fusa dependency is wrong. 
I'll remove that for F8; I need to investigate if it is safe to do the
same for F7.

Comment 16 Matthias Clasen 2007-09-17 14:10:43 UTC
Turns out that we cannot drop the requires from the panel right now, 
see bug 293261

Have you tried setting up a short include list of users in gdmsetup ?
Does that still not help ?

Comment 17 Christian Mandery 2007-09-18 09:13:15 UTC
Sorry, that is really strange here. Now sometimes the Gnome Panel crashes with
the old fusa (from yum), too. But I think that is unrelated to this bug report
and another issue.

If I include some users with Include= in gdm's custom.conf, fusa works fine in
terms of CPU usage. However, it does not display any users, although I
specified some with Include (I get an empty grey box of about 3x3 pixels when
I click on it), but I can live with that.

Comment 18 Christian Mandery 2007-09-19 10:44:56 UTC
Okay, it's me again.

I added
IncludeAll=false
Include=nobody
to GDM's custom.conf and after restarting GDM now everything works like a charm.
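
For reference, the two keys would sit in custom.conf roughly like this. The [greeter] section name is an assumption based on the GDM 2.x configuration layout and is not stated in this report, so check it against the local gdm defaults file.

```ini
# /etc/gdm/custom.conf (fragment)
# Assumption: in GDM 2.x these keys belong to the [greeter] section.
[greeter]
IncludeAll=false
Include=nobody
```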

Thank you for your patch!

It would be nice if it could be included in F7 stable as soon as possible and
maybe reported back upstream for the Gnome dev team, too.

Comment 19 Matthias Clasen 2007-09-19 13:46:15 UTC
Thanks for the testing.

I already filed an upstream bug. I'll put the patch into F7 soon.

Comment 20 Matthias Clasen 2007-09-19 14:53:26 UTC
Please try fast-user-switch-applet-2.17.4-5.fc7 in updates-testing

Comment 21 Christian Mandery 2007-09-24 09:11:20 UTC
Works for me. Thank you.

Comment 22 Fedora Update System 2007-09-24 18:01:00 UTC
fast-user-switch-applet-2.17.4-5.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.

Comment 23 Fedora Update System 2007-10-03 21:15:38 UTC
fast-user-switch-applet-2.17.4-5.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 24 Spencer Shepard 2007-10-11 23:03:14 UTC
I work in the computer labs at my college, but we still have problems with 100%
CPU usage despite having updated to the latest "-5" version.  We have around 600
users, so perhaps that could be part of it?  Killing the process frees up the
CPU again.

Comment 25 Matthias Clasen 2007-10-12 01:16:44 UTC
What I have done in that update is to make the fast-user-switch-applet respect
the gdm configuration for what users to show in the user list. Have you changed
the gdm configuration to not show all 600 users in the list ?

Comment 26 Helge Deller 2007-10-12 10:16:33 UTC
Spencer, did you restart gdm after installing the latest
fast-user-switch-applet rpm (e.g. "/etc/init.d/gdm restart")? This is
necessary afaik.

Comment 27 Spencer Shepard 2007-10-12 20:41:06 UTC
I sent our admin a link to this page, so he will probably reply soon.

I do know that the computers automatically update every night sometime around
4am and reboot themselves.  We have had the latest version of the applet since
whenever it was released as an update, and of course the computers have
restarted several times since then.

I don't think we have done anything special with the configuration, so if
showing all the users is the default, then it is probably showing all 600 users.
 Is /etc/gdm/custom.conf the gdm configuration?  It just looks like a long list
of the same options over and over, but set differently each time.

Comment 28 Spencer Shepard 2007-10-17 02:04:30 UTC
Hey, it looks like we figured it out.

We added "IncludeAll=false" to the gdm configuration to stop it from including
all of the users, and all seems well now.

Thanks for the help!

Spencer

