211717 – nautilus uses 100% CPU during/after large rsync/ssh retrieve

Bug 211717 - nautilus uses 100% CPU during/after large rsync/ssh retrieve

Summary: nautilus uses 100% CPU during/after large rsync/ssh retrieve

Keywords:
Status:	CLOSED CANTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	nautilus
Sub Component:
Version:	6
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Tomáš Bžatek
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2006-10-21 03:01 UTC by David Timms
Modified:	2018-04-11 08:32 UTC (History)
CC List:	2 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-09-02 15:44:20 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
screen capture of gkrellm and top while this bug is occuring (146.86 KB, image/png) 2006-10-21 03:01 UTC, David Timms	no flags	Details
strace of nautilus while in this state. (8.60 KB, text/plain) 2006-10-21 03:58 UTC, David Timms	no flags	Details
a coupla runs of gdb) bt (56.46 KB, text/plain) 2006-10-23 17:02 UTC, David Timms	no flags	Details
View All

Description David Timms 2006-10-21 03:01:56 UTC

Description of problem:
From at least 60GB into a 220GB rsync over ssh retrieve, one CPU core is being
used at 99%-100%. This stays this way many hours after the big rsync has completed.

Version-Release number of selected component (if applicable):
# rpm -qa --queryformat "%{NAME}#%{ARCH}#%{VERSION}#%{RELEASE}\n"|grep -E
'kernel|gamin|nautilus|openssh|rsync'|sort
gamin#i386#0.1.7#7.fc6
gamin#x86_64#0.1.7#7.fc6
kernel#x86_64#2.6.17#1.2630.fc6
kernel#x86_64#2.6.18#1.2798.fc6
nautilus-cd-burner#i386#2.16.0#3.fc6
nautilus-cd-burner#x86_64#2.16.0#3.fc6
nautilus-extensions#i386#2.16.0#5.fc6
nautilus-extensions#x86_64#2.16.0#5.fc6
nautilus-sendto-bluetooth#x86_64#0.7#5.fc6
nautilus-sendto#x86_64#0.7#5.fc6
nautilus#x86_64#2.16.0#5.fc6
openssh-askpass#x86_64#4.3p2#10
openssh-clients#x86_64#4.3p2#10
openssh-server#x86_64#4.3p2#10
openssh#x86_64#4.3p2#10
rsync#x86_64#2.6.8#3.1
# uname -a
Linux mirror-offsite 2.6.18-1.2798.fc6 #1 SMP Mon Oct 16 14:39:22 EDT 2006
x86_64 x86_64 x86_64 GNU/Linux

How reproducible:   Only seen once.

Steps to Reproduce:
1. using vnc to the actual machine.
2. start a script: essentially
rsync --bwlimit=10000000 --progress --stats -e 'ssh -p 30722 -i
/root/sshkey-mirror-offsite' -az
--exclude-from="/root/rsync-scripts/rsync-offsite-full-exclude.lst"  server:/
/home/mirror/server/>>$logfilename
3. Wait for the bwlimited dl to complete {first time sync}
4. tail -f $logfilename
5. gkrellm open viewing cpu+disk+network activity
6. top running.
7. nautilus  {always browse} using the 
  
Actual results:
When checked 60GB into the copy, noted that one core of the AMD AM2 dual-core
processor is consuming 100% CPU. The attachment contains a screenshot of the
gkrellm + top many hours after the rsync has completed.

Expected results:
Since the process is bw limited, and the network speed is much slower than both
the hard disk and CPU speeds, CPU should be rather unutilized during the
process, and should be completetly idle after comletion.

Additional info:
a. While web searching for bug, noticed that occasionally the CPU use jumps to
the other processor.
b. The machine was a FC6T3 install the previous day onto a software raid 1 on
750GB sata drives. The raid build for some of the volumes was continuing when
this rsync was started. As of writing this md2_raid1 process had completed,
however, it had taken: {sorted by cumulative cpu use}

top - 12:48:38 up 1 day,  3:18,  4 users,  load average: 1.02, 1.06, 1.07
Tasks: 143 total,   2 running, 141 sleeping,   0 stopped,   0 zombie
Cpu0  : 17.2%us, 20.3%sy,  0.1%ni, 53.8%id,  7.5%wa,  0.3%hi,  0.7%si,  0.0%st
Cpu1  : 22.8%us, 26.6%sy,  0.1%ni, 44.7%id,  2.3%wa,  0.0%hi,  3.5%si,  0.0%st
Mem:   1024916k total,   986216k used,    38700k free,   336432k buffers
Swap:  8385880k total,      184k used,  8385696k free,   201180k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 3013 root      25   0  420m  31m  20m R  100  3.2   1215:32 nautilus           
 3494 root      10  -5     0    0    0 S    0  0.0   6:10.44 kjournald          
  416 root      10  -5     0    0    0 S    0  0.0   4:27.62 md2_raid1          
 2950 root      15   0 54868  22m 4120 S    0  2.3   3:51.37 Xvnc               
  225 root      15   0     0    0    0 S    0  0.0   2:11.83 pdflush            
  226 root      10  -5     0    0    0 S    0  0.0   2:10.89 kswapd0            
  224 root      15   0     0    0    0 S    0  0.0   2:07.82 pdflush            
c. The machine has vmware-server and server-console installed, and configured
for the running kernel, however, no virt machines have been created.
d. Trying to start a File Browser from the menu does not start it.

Possibly related:
bug 207836 : Nautilus taking 70% processor time after a large copy
  however, nautilus was not used to initiate the copy (started the rsync script
from a gnome-terminal).

bug 187970 : gamin is crazy busy when idle.
  however, gamin is not showing in top, but is running in ps:
root      3082  0.0  0.1  12988  1236 ?        S    Oct20   0:00
/usr/libexec/gam_server

Comment 1 David Timms 2006-10-21 03:01:57 UTC

Created attachment 139044 [details]
screen capture of gkrellm and top while this bug is occuring

Comment 2 David Timms 2006-10-21 03:58:03 UTC

Created attachment 139046 [details]
strace of nautilus while in this state.

# ps aux|grep naut
root	  3013 75.5  3.1 430780 32616 ?        Rs   Oct20 1239:35 nautilus
--no-default-window --sm-client-id default3
root	  6566	0.0  0.6 254520  6192 ?        S    12:18   0:00 nautilus
--no-desktop --browser
root	  7182	0.0  0.6 254520  6192 ?        S    12:57   0:00 nautilus
--no-desktop --browser
root	  7446	0.0  0.0  60264   692 pts/3    R+   13:12   0:00 grep naut
  ie: it is the desktop nautilus that is troublesome.
attachment shows cut down: strace -p 3013 -o strace-nautilus-5secs.txt
and others (ltrace etc).

To me the traces are probably not of much use. I will leave the machine in
thsis state for a while, so please request any "while faulting" information
that might be needed.

Comment 3 Alexander Larsson 2006-10-23 07:03:47 UTC

Could you get a bunch of gdb backtraces to see what nautilus is doing. 

First install nautilus-debuginfo, eel-debuginfo, gnome-vfs2-debuginfo,
gtk2-debuginfo, and glib2-debuginfo, then run "gdb attach <pid of nautilus>".

In gdb, type "thread apply all bt" to get a backtrace, then "c" to continue,
after a (short) while, press ctrl-c and get a new backtrace. Continue this a few
times and collect the backtraces, that way we'll be able to get a feel for what
the app is doing (because its not doing much i/o, so it doesn't show up in strace).

Comment 4 David Timms 2006-10-23 17:02:48 UTC

Created attachment 139146 [details]
a coupla runs of gdb) bt

Had errors with the thread bt version, so tried just bt. It seems the trace is
missing some functions, so I tried to install some more guesses of debuginfo,
but must not have been the correct ones.

Comment 5 Alexander Larsson 2006-10-25 08:18:14 UTC

Woa. Very strange that it didn't get a lot of symbols. However, all the
backtraces are from the same place, you need to do "c" (continue executing) and
ctrl-c to get a different backtrace.

I wonder what all the 
warning: .dynamic section for "/usr/lib64/libgnomevfs-2.so.0" is not at the
expected address
lines are. Maybe prelink ran causing this or something. Could be the cause of
the symbols missing in the backtrace, although some of the pointer values in the
backtrace look weird too.

Comment 6 Bug Zapper 2008-05-14 02:25:21 UTC

Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 7 David Timms 2008-08-16 06:13:34 UTC

I never saw this issue again, and I don't have access to the hardware where it presented itself, to try to confirm it with current F9. I changed the version back to F6, and was going to close it, unless someone wants it open for tracking purposes etc ?

Comment 8 Matěj Cepl 2008-09-02 15:44:20 UTC

David, thanks for helping us with this, but Fedora 6 is way after the End of Support (actually, Fedora 8 is now the last supported Fedora distro).

Closing this bug as CANTFIX

Note You need to log in before you can comment on or make changes to this bug.