Bug 116127

Summary:	100% CPU consumption
Product:	[Fedora] Fedora	Reporter:	Konstantin Ryabitsev <icon>
Component:	gnome-vfs2	Assignee:	Alexander Larsson <alexl>
Status:	CLOSED RAWHIDE	QA Contact:
Severity:	medium	Docs Contact:
Priority:	medium
Version:	rawhide	CC:	alexl, robin.laing, wtogami
Hardware:	All
OS:	Linux
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2004-04-14 18:33:24 UTC	Type:	---
Bug Blocks:	114961

Description Konstantin Ryabitsev 2004-02-18 15:11:23 UTC

Description of problem:
Right after login, gnome-panel seems to consume 100% CPU.
If I run "killall gnome-panel", it respawns and behaves normally, so the
problem seems tied to how it is started at login. My home directory is
mounted over NFS (TCP) from a file server.

Version-Release number of selected component (if applicable):
icon@hagrid:[~]$ rpm -q gnome-panel
gnome-panel-0:2.5.3.1-3.i386

How reproducible:
Every now and again. I haven't seen any consistency yet as I've only
been using fc2-t1 for about 3-4 days, but this has happened both
yesterday and today.

Comment 1 Mark McLoughlin 2004-02-23 11:10:18 UTC

I can't reproduce this here :/

Does this happen to you regularly? Could you try strace -p
$(panelpid) and gdb /usr/bin/gnome-panel $(panelpid) to see if you can
get any info that might give me some hint as to what's going on?

Are there any other processes consuming 100% CPU? Does this happen
without an NFS mounted home dir?

Thanks for any more info you can give ..

Comment 2 Konstantin Ryabitsev 2004-02-23 15:41:11 UTC

I haven't seen this happen in the past few days (which included a
gnome-panel update to the rawhide version), so let's hope it went away
with it. I'll close this if I never see this happen again after a week
or so.

Comment 3 Konstantin Ryabitsev 2004-02-24 14:35:22 UTC

Yes, this is still happening as of gnome-panel-0:2.5.3.1-5.i386. I did
the strace -p as you asked and I get about 8MB of entries after 2
seconds of running, which is clearly not normal. I have also noticed
nautilus do the same thing today, and the stuff I see in strace is
very similar to that of gnome-panel, and it, too, generated an 8MB
file after 2 seconds of running. Both come to their senses after being
killed and respawned, so this is entirely something that happens on login.

It seems that both applications are reissuing the same system calls
over and over again, which is why the strace results compress so
well. :)

http://www.phy.duke.edu/~icon/misc/gnome-panel.strace.gz (124K)
http://www.phy.duke.edu/~icon/misc/nautilus.strace.gz (33K)
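
One way to confirm the repetition without reading megabytes of trace is
to count syscall names in the log. The following is a minimal sketch,
not something taken from this report: syscall_histogram is a
hypothetical helper name, and it assumes each log line begins with the
syscall name (no "[pid N]" prefix, which strace adds when it follows
several processes).

```shell
#!/bin/sh
# Hypothetical helper: count how often each syscall name appears in an
# strace log (file arguments, or stdin if none), most frequent first.
syscall_histogram() {
    # Keep only lines that start with "name(" and reduce them to "name";
    # anything else (wrapped or partial lines) is ignored.
    sed -n 's/^\([a-z_][a-z0-9_]*\)(.*/\1/p' "$@" | sort | uniq -c | sort -rn
}
```

For example, "gunzip -c gnome-panel.strace.gz | syscall_histogram | head -n 5"
lists the five most frequent calls. A busy loop should show up as a
handful of names with very large, nearly equal counts.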

Let me know if you need anything else.

Comment 4 Mark McLoughlin 2004-02-24 15:30:14 UTC

Konstantin: that's very helpful, and we think we've an idea what might
be causing the problem.

To help us narrow it down, could you do the strace again ... but this
time make sure to strace the panel from the very beginning ... as soon
as it starts looping with the poll/gettimeofday/ioctl/poll... you can
stop it ...

If you need help, let me know

Comment 5 Konstantin Ryabitsev 2004-02-24 16:17:42 UTC

Okay, I can't seem to get gnome-panel to heisenbug for me now that I'm
observing it, but I did manage to catch nautilus in the act. Since
they seem to have the same problem, I'm posting the strace of nautilus
from the start.

Script used to attach as soon as possible (ran before login):

icon@hagrid:[/xtmp]$ cat nau-stracer
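# Busy-wait until a nautilus process shows up, then attach strace to it
# (first match only, in case several are listed).
while true
    do GPPID=$(ps -e | grep nautilus | grep -v grep | awk '{ print $1; exit }')
    if [ -z "$GPPID" ]; then
        continue
    else
        # strace prints the trace on stderr, hence the 2>&1
        strace -p "$GPPID" > /xtmp/nautilus.strace 2>&1
        break
    fi
done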

The resulting file is 75M after about 30 seconds of staying logged in,
and here are the first 10000 lines, where you can see the same behavior
start repeating itself soon after startup.

http://www.phy.duke.edu/~icon/misc/nautilus-start.strace.gz (42K)

A little more explanation about our environment:

We have a NIS server and our home directories are automounted during
login. Mount info:

fileserv.phy.duke.edu:/export/home on /home/fileserv type nfs
(rw,rsize=16384,wsize=16384,tcp,addr=152.3.182.x)

It's possible that I can't catch gnome-panel in the act any more
because strace slows things down enough for the bug not to
manifest itself.

I'll try gnome-panel some more later, maybe I'll succeed.

Comment 6 Mark McLoughlin 2004-02-24 17:59:12 UTC

Thanks for the strace. After a bit of staring I think it's a FAM bug:

connect(17, {sa_family=AF_UNIX, path="/tmp/.famfv1M4u"}, 110) = 0

...

read(17, "", 3000)                      = 0
read(17, "", 3000)                      = 0
close(17)       

...

poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8,
events=POLLIN|POLLPRI}, {fd=10, events=POLLIN}, {fd=12,
events=POLLIN|POLLPRI}, {fd=17, events=POLLIN, revents=POLLNVAL}], 6,
0) = 1
gettimeofday({1077638359, 89109}, NULL) = 0
ioctl(3, FIONREAD, [0])                 = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8,
events=POLLIN|POLLPRI}, {fd=10, events=POLLIN}, {fd=12,
events=POLLIN|POLLPRI}, {fd=17, events=POLLIN, revents=POLLIN}], 6,
0) = 1
ioctl(3, FIONREAD, [0])                 = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8,
events=POLLIN|POLLPRI}, {fd=10, events=POLLIN}, {fd=12,
events=POLLIN|POLLPRI}, {fd=17, events=POLLIN, revents=POLLIN}], 6,
0) = 1
ioctl(3, FIONREAD, [0])                 = 0

[and on and on and on ...]

Looks like it's polling on a closed socket.
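
This diagnosis can be checked mechanically against a trace. What
follows is a rough sketch under stated assumptions, not the method used
in this report: polled_after_close is a hypothetical helper, it expects
one syscall per line as strace writes it (not re-wrapped as in the
excerpt quoted here), and it does not track a descriptor number being
reused by a later socket() or open().

```shell
#!/bin/sh
# Hypothetical helper: list descriptors that show up in poll() after a
# close() of the same number. Reads an strace log from file arguments
# or stdin and prints "fd count" pairs, one per line.
polled_after_close() {
    awk '
        # Remember every descriptor number passed to close().
        /^close\([0-9]+\)/ {
            fd = $0
            sub(/^close\(/, "", fd)
            sub(/\).*/, "", fd)
            closed[fd] = 1
            next
        }
        # For each poll() call, walk over all "fd=N" entries and count
        # the ones that were closed earlier in the trace.
        /^poll\(/ {
            line = $0
            while (match(line, /fd=[0-9]+/)) {
                fd = substr(line, RSTART + 3, RLENGTH - 3)
                if (fd in closed) hits[fd]++
                line = substr(line, RSTART + RLENGTH)
            }
        }
        END { for (fd in hits) print fd, hits[fd] }
    ' "$@" | sort -n
}
```

Any descriptor reported with a large count is a candidate for the kind
of loop seen here: poll() flags the stale entry right away instead of
waiting, so a caller that keeps it in the set spins.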

Comment 7 Alexander Larsson 2004-02-25 09:10:03 UTC

Veeery strange. I wonder why it was closed.

Comment 8 Alexander Larsson 2004-02-25 09:12:52 UTC

Well, duh, it closed because read returned 0, which means famd died or
something like that.

Comment 9 Alexander Larsson 2004-02-25 09:20:23 UTC

I found and fixed the loop in upstream cvs, but I don't know why the
fam daemon died for you.

Comment 10 Alexander Larsson 2004-02-25 09:21:16 UTC

(gnome-vfs upstream that is)

Comment 11 Alexander Larsson 2004-04-14 18:33:24 UTC

The fix is in our packages now.

Comment 12 Mark McLoughlin 2004-04-15 12:17:52 UTC

*** Bug 119443 has been marked as a duplicate of this bug. ***