Bug 71419

Summary: nautilus getting confused and not starting or not activating existing copies
Product: [Retired] Red Hat Public Beta Reporter: Chris Runge <crunge>
Component: nautilus Assignee: Havoc Pennington <hp>
Status: CLOSED CURRENTRELEASE QA Contact: Jay Turner <jturner>
Severity: medium Docs Contact:
Priority: medium    
Version: null CC: alexl, srevivo, twaugh, wg
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2002-11-09 23:35:53 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 67217    
Attachments:
Description Flags
typescript of gdb session; nautilus has symbols; MALLOC_TRACE set (and mtrace() call added), MALLOC_CHECK_ unset none
This gdb session is from a nautilus package compiled with a call to mcheck_pedantic() in main(). It's the nautilus process started on login. none
And another, same conditions. none
backtrace from tomg@io.com none

Description Chris Runge 2002-08-13 12:37:14 UTC
Description of Problem:

Nautilus doesn't always start in a GNOME session. It seems to only happen when I
start GNOME, exit, and start it again.

ps shows 1 or 2 nautilus processes, e.g.,

2695 ?        R      5:19 nautilus --no-default-window --sm-client-id default4
 2852 ?        S      0:00 nautilus --no-default-window --sm-client-id default4

Version-Release number of selected component (if applicable):

milan beta 4 / limbo beta 2 (updated via RHN)

nautilus-2.0.2-1

How Reproducible:

happens sometimes--unsure of exact pattern

Steps to Reproduce:
1. 
2. 
3. 

Actual Results:


Expected Results:


Additional Information:

Comment 1 Havoc Pennington 2002-08-13 15:39:31 UTC
Can you give this a try with bonobo-activation 1.0.3 and gnome-session 2.0.5 
whenever those wander out to rawhide?

gnome-session is somehow starting apps multiple times, and in the end not
starting them at all, it's very strange...

Comment 2 Havoc Pennington 2002-08-13 22:17:06 UTC
Need to find all the dups of this.

Comment 3 Havoc Pennington 2002-08-14 05:33:39 UTC
Ah, the default session file was all screwed up. Try 
gnome-session 2.0.5-2

Comment 4 Chris Runge 2002-08-17 12:35:04 UTC
I thought this was fixed, but I'm still having problems, now in Null-re0816.3 

$ rpm -q gnome-session
gnome-session-2.0.5-3
$ rpm -q nautilus
nautilus-2.0.4-3

I am in runlevel 3. The system crashed due to what I believe is an SMP lockup
(not sure if this matters at all or not, but I thought I'd mention it). I rebooted
the system and ran startx. Nautilus did not come up. ps -aex showed this:

1041 tty1     Z      0:00 [gnome-session <defunct>]
1042 ?        R      0:32 nautilus --no-default-window --sm-client-id default3

I then exit gnome/X

at the console I checked and nautilus process 1042 is still there

I then started X again

1042 still there and a new one started:

1042 ?        R      2:55 nautilus --no-default-window --sm-client-id default3
1204 ?        S      0:00 nautilus --no-default-window --sm-client-id default3

but Nautilus still really hasn't started (no icons on the desktop)

I then exited X again to the console and issued the command killall nautilus and
verified all of the nautilus processes were killed

I issued a startx and nautilus was working again

Comment 5 Havoc Pennington 2002-08-17 14:23:53 UTC
Can you attach ~/.gnome2/session please?

Comment 6 Chris Runge 2002-08-17 15:02:22 UTC
~/.gnome2/session doesn't exist:

$ ls -al ~/.gnome2
total 24
drwx------    5 crunge   crunge       4096 Aug 17 02:39 .
drwx------   16 crunge   crunge       4096 Aug 17 07:01 ..
drwx------    2 crunge   crunge       4096 Aug 16 20:34 accels
-rw-rw-r--    1 crunge   crunge         57 Aug 16 21:28 memprof
drwx------    3 crunge   crunge       4096 Aug 16 21:36 panel2.d
drwxr-xr-x    4 crunge   crunge       4096 Aug 16 20:34 share

Comment 7 Havoc Pennington 2002-08-17 16:00:29 UTC
OK, I've seen this too. It doesn't feel like a gnome-session issue to me; I think
it's nautilus or bonobo-activation causing multiple nautilus to start (and none
of them end up managing the desktop).

it may just be nautilus crashing or exiting at a strategic point. perhaps a
recent patch introduced that.

Comment 8 Alexander Larsson 2002-08-19 13:44:22 UTC
Do you ever get a core file from nautilus?


Comment 9 Havoc Pennington 2002-08-19 14:14:49 UTC
coredumpsize=0, but I haven't seen a crash dialog...

Comment 10 Tim Waugh 2002-08-23 11:37:01 UTC
I've seen this too.  Has anyone managed to figure out where nautilus is 
busy-looping (when it busy-loops)?

Comment 11 Havoc Pennington 2002-08-23 13:32:05 UTC
Haven't investigated too much yet. Hard to reproduce. :-/ probably involves
logging out and back in though.

I'm guessing one of the relatively recent nautilus changes broke it, I think I
started seeing it in the last few weeks, so just reviewing those may be a start.

Comment 12 Havoc Pennington 2002-08-23 16:25:53 UTC
I discovered that on the system showing this, "gconftool-2 --get
/apps/nautilus/preferences/add_to_session" printed "false"

How this happened I don't know. But it would be good for other people to check 
whether it happens. If it seems to be happening a lot we should maybe hardcode
that value as it's just a debug setting and an env variable would be more useful
for debugging anyhow.

Comment 13 Havoc Pennington 2002-08-23 19:54:01 UTC
2.0.5-2 will contain a hack to remove the add_to_session setting and always add
to session.

Also, do you have nautilus set to not render the desktop?

Comment 14 Tim Waugh 2002-08-23 19:56:39 UTC
Whenever I've seen this, nautilus has been busy-looping, and when I kill it it 
starts as normal.  So perhaps we aren't seeing the same thing after all. 
 
I have nautilus rendering the desktop, yes.

Comment 15 Havoc Pennington 2002-08-23 20:17:07 UTC
Is the value of 
gconftool-2 --get /apps/nautilus/preferences/add_to_session
"true" for you? Maybe check that when it sticks, just to be sure.

Conceivably also this is just a side-effect of some random memory corruption
also causing #72236 ...

Comment 16 Tim Waugh 2002-08-23 20:27:03 UTC
It's 'true' now, yes, and I haven't ever fiddled with that (nautilus rendering 
desktop).  I'll check it if I see the problem again.

Comment 17 Havoc Pennington 2002-08-24 20:52:43 UTC
72515 is a dup, suggests the cause is the machine crashing (abnormal nautilus exit)

Comment 18 Havoc Pennington 2002-08-25 19:24:33 UTC
Could it be the starthere-hackaround patch? I don't see how. We should remove
that patch anyway, though.

Comment 19 Chris Runge 2002-08-26 12:07:39 UTC
still having the problem under Null

my problem sounds like twaugh's rather than the other issues--the gconftool-2
query returns true

also the problem starts after the system locks up (71738)--so it may indeed be
the fact that it is caused by an improper exit

Comment 20 Alexander Larsson 2002-08-26 16:31:10 UTC
I looked through the changelog going back a month or so, and I didn't see
anything that affects startup.


Comment 21 Havoc Pennington 2002-08-27 05:08:49 UTC
If someone can get this to happen under "strace -o output -f nautilus"
that might be useful info.

Also of course a backtrace (even without full symbols) from the stuck nautilus
could help.



Comment 22 Tim Waugh 2002-08-27 08:26:34 UTC
Just happened again.  This is what gdb says: 
 
#0  0x4206fde1 in malloc_consolidate () from /lib/i686/libc.so.6 
#1  0x00000038 in ?? () 
Cannot access memory at address 0x38 
 
Strace shows no output, and neither does ltrace. 
 
'gconftool-2 --get /apps/nautilus/preferences/add_to_session' says 'true'. 
 
nautilus-2.0.5-3 
 
I'll try to keep running for a while if you like, so you can let me know what 
other things I can try.  Do we happen to have a pre-built debug package for 
this yet?

Comment 23 Alexander Larsson 2002-08-27 08:55:12 UTC
How do the other threads look?


Comment 24 Tim Waugh 2002-08-27 09:09:33 UTC
That's the only one running.

Comment 25 Alexander Larsson 2002-08-27 09:18:08 UTC
Are you sure? (milan ps and top hide threads)


Comment 26 Tim Waugh 2002-08-27 09:25:40 UTC
Oh, right, forgot that. 
 
(gdb) info threads 
  10 Thread 8201 (LWP 24108)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  9 Thread 7176 (LWP 24107)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  8 Thread 6151 (LWP 24106)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  7 Thread 5126 (LWP 24105)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  6 Thread 4101 (LWP 24104)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  5 Thread 3076 (LWP 24103)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  4 Thread 2051 (LWP 24102)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  3 Thread 1026 (LWP 24101)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  2 Thread 2049 (LWP 24100)  0x420c3a2b in poll () from /lib/i686/libc.so.6 
* 1 Thread 1024 (LWP 24092)  0x4206fdb5 in malloc_consolidate () 
   from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x4206fdb5 in malloc_consolidate () from /lib/i686/libc.so.6 
#1  0x00000038 in ?? () 
Cannot access memory at address 0x38 
(gdb) thread 2 
[Switching to thread 2 (Thread 2049 (LWP 24100))]#0  0x420c3a2b in poll () 
   from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420c3a2b in poll () from /lib/i686/libc.so.6 
#1  0x408eacce in __pthread_manager () from /lib/i686/libpthread.so.0 
(gdb) thread 3 
[Switching to thread 3 (Thread 1026 (LWP 24101))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) 
(gdb) thread 4 
[Switching to thread 4 (Thread 2051 (LWP 24102))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408e9f8b in pthread_cond_wait () from /lib/i686/libpthread.so.0 
#3  0x4075b3e6 in gnome_vfs_thread_pool_wait_for_work () 
   from /usr/lib/libgnomevfs-2.so.0 
#4  0x4075b43f in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#5  0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#6  0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) thread 5 
[Switching to thread 5 (Thread 3076 (LWP 24103))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) 
(gdb) thread 6 
[Switching to thread 6 (Thread 4101 (LWP 24104))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) thread 7 
[Switching to thread 7 (Thread 5126 (LWP 24105))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) thread 8 
[Switching to thread 8 (Thread 6151 (LWP 24106))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) thread 9 
[Switching to thread 9 (Thread 7176 (LWP 24107))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 
(gdb) thread 10 
[Switching to thread 10 (Thread 8201 (LWP 24108))]#0  0x420285a9 in sigsuspend 
    () from /lib/i686/libc.so.6 
(gdb) bt 
#0  0x420285a9 in sigsuspend () from /lib/i686/libc.so.6 
#1  0x408ecfe8 in __pthread_wait_for_restart_signal () 
   from /lib/i686/libpthread.so.0 
#2  0x408ef330 in __pthread_alt_lock () from /lib/i686/libpthread.so.0 
#3  0x408ebe97 in pthread_mutex_lock () from /lib/i686/libpthread.so.0 
#4  0x42070630 in free () from /lib/i686/libc.so.6 
#5  0x4205e1fc in fclose@@GLIBC_2.1 () from /lib/i686/libc.so.6 
#6  0x40c62210 in read_saved_cached_trash_entries () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#7  0x40c62398 in find_cached_trash_entry_for_device () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#8  0x40c62523 in find_trash_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#9  0x40c62834 in do_find_directory () 
   from /usr/lib/gnome-vfs-2.0/modules/libfile.so 
#10 0x407446dc in gnome_vfs_find_directory_cancellable () 
   from /usr/lib/libgnomevfs-2.so.0 
#11 0x4074c408 in execute_find_directory () from /usr/lib/libgnomevfs-2.so.0 
#12 0x4074c941 in gnome_vfs_job_execute () from /usr/lib/libgnomevfs-2.so.0 
#13 0x4074a269 in thread_routine () from /usr/lib/libgnomevfs-2.so.0 
#14 0x4075b44e in thread_entry () from /usr/lib/libgnomevfs-2.so.0 
#15 0x40935377 in g_thread_create_proxy () from /usr/lib/libglib-2.0.so.0 
#16 0x408eb871 in pthread_start_thread () from /lib/i686/libpthread.so.0 


Comment 27 Alexander Larsson 2002-08-27 09:49:34 UTC
This looks like an allocator deadlock. We're talking with uli on irc about it.


Comment 28 Alexander Larsson 2002-08-27 10:01:06 UTC
Of course, it could also be a random memory corruption bug.


Comment 29 Alexander Larsson 2002-08-27 10:19:04 UTC
<foo> single step a bit more and tell me the value of $ecx every time you reach
0x4206fd70?
<twaugh> foo: Er.. I killed it already.
<foo> darn
<twaugh> Sorry.
<foo> the code is iterating over a list
<foo> it's important to know how many elements are in the list
<twaugh> Next time I see it, I'll try that.
<twaugh> foo: I guess it would be a double-free or something that corrupts it.
<foo> might not be a corruption
<foo> double free: maybe
<foo> arbitrary memory corruption: unlikely
<foo> checking for double-free is easy, enable mtrace (this requires putting
some code at the beginning of main)
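
The "code at the beginning of main" mentioned above can be sketched roughly as follows. This is only an illustration, not the patch that was actually applied to nautilus: the helper names start_malloc_trace() and stop_malloc_trace() are made up for this example, while mtrace(), muntrace() and the MALLOC_TRACE environment variable are the glibc facilities being referred to.

```c
#include <assert.h>
#include <mcheck.h>   /* mtrace(), muntrace(): glibc extensions */
#include <stdlib.h>

/* Hypothetical helper: point glibc's malloc tracer at log_path and
 * switch tracing on. Intended to be the first thing main() calls.
 * Returns 0 on success, -1 if the environment could not be updated. */
int start_malloc_trace(const char *log_path)
{
    assert(log_path != NULL);

    /* mtrace() takes the name of its log file from MALLOC_TRACE. */
    if (setenv("MALLOC_TRACE", log_path, 1) != 0)
        return -1;

    mtrace();   /* start logging malloc/realloc/free calls */
    return 0;
}

/* Hypothetical helper: stop tracing, e.g. just before exit. */
void stop_malloc_trace(void)
{
    muntrace();
}
```

The resulting log is then fed to the mtrace(1) script together with the binary. Note that any other program that calls mtrace() while the same MALLOC_TRACE value is in its environment will write to the same file.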

Comment 30 Ulrich Drepper 2002-08-27 10:31:33 UTC
~        *fb = 0;
0x4206fd6a <malloc_consolidate+90>:     movl   $0x0,(%eax)
~          check_inuse_chunk(av, p);
~          size = p->size & ~(PREV_INUSE|NON_MAIN_ARENA);
0x4206fd70 <malloc_consolidate+96>:     mov    0x4(%ecx),%eax            17
~          nextp = p->fd;
0x4206fd73 <malloc_consolidate+99>:     mov    0x8(%ecx),%edi            18
~          size = p->size & ~(PREV_INUSE|NON_MAIN_ARENA);
0x4206fd76 <malloc_consolidate+102>:    mov    %eax,%esi                 19
0x4206fd78 <malloc_consolidate+104>:    mov    %edi,0x8(%esp,1)          20
0x4206fd7c <malloc_consolidate+108>:    and    $0xfffffffa,%esi          21
~          nextchunk = chunk_at_offset(p, size);
0x4206fd7f <malloc_consolidate+111>:    lea    (%esi,%ecx,1),%edi        22
~          nextsize = chunksize(nextchunk);
0x4206fd82 <malloc_consolidate+114>:    mov    0x4(%edi),%ebp            23
0x4206fd85 <malloc_consolidate+117>:    mov    %ebp,%edx                 24
0x4206fd87 <malloc_consolidate+119>:    and    $0xfffffff8,%edx          25
~          if (!prev_inuse(p)) {
0x4206fd8a <malloc_consolidate+122>:    and    $0x1,%eax                 26
0x4206fd8d <malloc_consolidate+125>:    mov    %edx,(%esp,1)             27
0x4206fd90 <malloc_consolidate+128>:
~    jne    0x4206fda4 <malloc_consolidate+148>                           28
~            prevsize = p->prev_size;
~            size += prevsize;
~            p = chunk_at_offset(p, -((long) prevsize));
~            unlink(p, bck, fwd);
0x4206fd92 <malloc_consolidate+130>:    mov    (%ecx),%eax
0x4206fd94 <malloc_consolidate+132>:    sub    %eax,%ecx
0x4206fd96 <malloc_consolidate+134>:    mov    0x8(%ecx),%edx
0x4206fd99 <malloc_consolidate+137>:    add    %eax,%esi
0x4206fd9b <malloc_consolidate+139>:    mov    0xc(%ecx),%eax
0x4206fd9e <malloc_consolidate+142>:    mov    %eax,0xc(%edx)
0x4206fda1 <malloc_consolidate+145>:    mov    %edx,0x8(%eax)

~          if (nextchunk != av->top) {
0x4206fda4 <malloc_consolidate+148>:    mov    0x28(%esp,1),%eax         29
0x4206fda8 <malloc_consolidate+152>:    cmp    0x54(%eax),%edi           30
0x4206fdab <malloc_consolidate+155>:
~    je     0x4206fe18 <malloc_consolidate+264>                           31
~            nextinuse = inuse_bit_at_offset(nextchunk, nextsize);
0x4206fdad <malloc_consolidate+157>:    mov    (%esp,1),%edx             32
~            if (!nextinuse) {
0x4206fdb0 <malloc_consolidate+160>:    testb  $0x1,0x4(%edx,%edi,1)     33
0x4206fdb5 <malloc_consolidate+165>:
~    jne    0x4206fe10 <malloc_consolidate+256>                           34
~              size += nextsize;
~              unlink(nextchunk, bck, fwd);
0x4206fdb7 <malloc_consolidate+167>:    mov    0x8(%edi),%ebp
0x4206fdba <malloc_consolidate+170>:    add    %edx,%esi
0x4206fdbc <malloc_consolidate+172>:    mov    0xc(%edi),%eax
0x4206fdbf <malloc_consolidate+175>:    mov    %eax,0xc(%ebp)
0x4206fdc2 <malloc_consolidate+178>:    mov    %ebp,0x8(%eax)
~            first_unsorted = unsorted_bin->fd;
~            unsorted_bin->fd = p;
~            first_unsorted->bk = p;

~            set_head(p, size | PREV_INUSE);
~            p->bk = unsorted_bin;
~            p->fd = first_unsorted;
~            set_foot(p, size);
0x4206fdc5 <malloc_consolidate+181>:    mov    %esi,(%esi,%ecx,1)        3
0x4206fdc8 <malloc_consolidate+184>:    mov    0x4(%esp,1),%eax          4
0x4206fdcc <malloc_consolidate+188>:    mov    %esi,%edx                 5
0x4206fdce <malloc_consolidate+190>:    or     $0x1,%edx                 6
0x4206fdd1 <malloc_consolidate+193>:    mov    0x8(%eax),%edi            7
0x4206fdd4 <malloc_consolidate+196>:    mov    %ecx,0x8(%eax)            8
0x4206fdd7 <malloc_consolidate+199>:    mov    0x4(%esp,1),%eax          9
0x4206fddb <malloc_consolidate+203>:    mov    %ecx,0xc(%edi)            10
0x4206fdde <malloc_consolidate+206>:    mov    %edx,0x4(%ecx)            11
0x4206fde1 <malloc_consolidate+209>:    mov    %eax,0xc(%ecx)            12
0x4206fde4 <malloc_consolidate+212>:    mov    %edi,0x8(%ecx)            13
~        } while ( (p = nextp) != 0);
0x4206fde7 <malloc_consolidate+215>:    mov    0x8(%esp,1),%ecx          14
0x4206fdeb <malloc_consolidate+219>:    test   %ecx,%ecx                 15
0x4206fded <malloc_consolidate+221>:
~    jne    0x4206fd70 <malloc_consolidate+96>                            16
0x4206fdef <malloc_consolidate+223>:    mov    0x10(%esp,1),%eax
0x4206fdf3 <malloc_consolidate+227>:    addl   $0x4,0x10(%esp,1)
0x4206fdf8 <malloc_consolidate+232>:    cmp    0xc(%esp,1),%eax
0x4206fdfc <malloc_consolidate+236>:
~    jne    0x4206fd5c <malloc_consolidate+76>
0x4206fe02 <malloc_consolidate+242>:    add    $0x14,%esp
0x4206fe05 <malloc_consolidate+245>:    pop    %ebx
0x4206fe06 <malloc_consolidate+246>:    pop    %esi
0x4206fe07 <malloc_consolidate+247>:    pop    %edi
0x4206fe08 <malloc_consolidate+248>:    pop    %ebp
0x4206fe09 <malloc_consolidate+249>:    ret
0x4206fe0a <malloc_consolidate+250>:    lea    0x0(%esi),%esi
~              clear_inuse_bit_at_offset(nextchunk, 0);
0x4206fe10 <malloc_consolidate+256>:    and    $0xfffffffe,%ebp          35
0x4206fe13 <malloc_consolidate+259>:    mov    %ebp,0x4(%edi)            1
0x4206fe16 <malloc_consolidate+262>:
~    jmp    0x4206fdc5 <malloc_consolidate+181>                           2

Comment 31 Alexander Larsson 2002-08-27 11:00:57 UTC
<foo> alex, twaugh: the malloc maintainer thinks that it /could/ be a double free
<foo> alex, twaugh: so, keep on trying to MALLOC_DEBUG_

Comment 32 Tim Waugh 2002-08-27 13:18:58 UTC
I have another trace.  This time, mtrace() was called in nautilus-main.c:main, 
and debugging symbols are intact in the nautilus package (that I built). 
 
(gdb) info threads 
  9 Thread 7176 (LWP 9863)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  8 Thread 6151 (LWP 9862)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  7 Thread 5126 (LWP 9861)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  6 Thread 4101 (LWP 9860)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  5 Thread 3076 (LWP 9859)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  4 Thread 2051 (LWP 9858)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  3 Thread 1026 (LWP 9857)  0x420285a9 in sigsuspend () 
   from /lib/i686/libc.so.6 
  2 Thread 2049 (LWP 9856)  0x420c3a2b in poll () from /lib/i686/libc.so.6 
* 1 Thread 1024 (LWP 9846)  0x408ebcea in pthread_mutex_trylock () 
   from /lib/i686/libpthread.so.0 
(gdb) bt 
#0  0x408ebcea in pthread_mutex_trylock () from /lib/i686/libpthread.so.0 
#1  0x302b3063 in ?? () 
#2  0x4206ee20 in malloc () from /lib/i686/libc.so.6 
#3  0x40922d39 in g_malloc () from /usr/lib/libglib-2.0.so.0 
#4  0x40932263 in g_strsplit () from /usr/lib/libglib-2.0.so.0 
#5  0x4070e530 in ltable_insert () from /usr/lib/libgconf-2.so.4 
#6  0x4070e202 in gconf_listeners_add () from /usr/lib/libgconf-2.so.4 
#7  0x4071ff45 in gconf_client_notify_add () from /usr/lib/libgconf-2.so.4 
#8  0x40118ecc in eel_gconf_notification_add () from /usr/lib/libeel-2.so.2 
#9  0x40143700 in preferences_entry_ensure_gconf_connection () 
   from /usr/lib/libeel-2.so.2 
#10 0x08094d0e in fm_directory_view_init (view=0x8236598) 
    at fm-directory-view.c:1342 
#11 0x408c943b in g_type_create_instance () from /usr/lib/libgobject-2.0.so.0 
#12 0x408b364f in g_object_constructor () from /usr/lib/libgobject-2.0.so.0 
#13 0x408b2e5e in g_object_newv () from /usr/lib/libgobject-2.0.so.0 
#14 0x408b361f in g_object_new_valist () from /usr/lib/libgobject-2.0.so.0 
#15 0x408b2c16 in g_object_new () from /usr/lib/libgobject-2.0.so.0 
#16 0x0805f9f0 in create_object (servant=0x812ec04, 
    iid=0x408dde68 "\204m\003", ev=0xbffff7a0) at nautilus-application.c:119 
#17 0x407708c1 in _ORBIT_skel_small_Bonobo_GenericFactory_createObject () 
   from /usr/lib/libbonobo-activation.so.4 
#18 0x407a3267 in ORBit_POAObject_invoke () from /usr/lib/libORBit-2.so.0 
#19 0x407a7275 in ORBit_OAObject_invoke () from /usr/lib/libORBit-2.so.0 
#20 0x40796093 in ORBit_small_invoke_adaptor () from /usr/lib/libORBit-2.so.0 
#21 0x407a3741 in ORBit_POAObject_handle_request () 
   from /usr/lib/libORBit-2.so.0 
#22 0x407a3a71 in ORBit_POA_handle_request () from /usr/lib/libORBit-2.so.0 
#23 0x407a717c in ORBit_handle_request () from /usr/lib/libORBit-2.so.0 
#24 0x40791b55 in giop_connection_handle_input () 
   from /usr/lib/libORBit-2.so.0 
#25 0x4089a8dd in linc_connection_io_handler () from /usr/lib/liblinc.so.1 
#26 0x4089c640 in linc_source_dispatch () from /usr/lib/liblinc.so.1 
#27 0x4091cf65 in g_main_dispatch () from /usr/lib/libglib-2.0.so.0 
#28 0x4091df98 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0 
#29 0x4091e2ad in g_main_context_iterate () from /usr/lib/libglib-2.0.so.0 
#30 0x4091ea1f in g_main_loop_run () from /usr/lib/libglib-2.0.so.0 
#31 0x4043b39f in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0 
#32 0x0806834a in main (argc=1093664800, argv=0xbffffbc4) 
    at nautilus-main.c:265 
#33 0x420155c4 in __libc_start_main () from /lib/i686/libc.so.6 


Comment 33 Havoc Pennington 2002-08-27 13:45:37 UTC
Are we confident that, with current libc, MALLOC_CHECK_=2 would still make the
allocator abort() in the second free if this were a double free? Since this is
hanging in malloc(), if we were confident of that we could presumably rule out
a double free.

Comment 34 Tim Waugh 2002-08-27 13:55:28 UTC
I'm running this with MALLOC_CHECK_ unset now, in favour of mtrace.  But we lost 
the mtrace log because rpmq calls mtrace() itself (and I happened to run rpm 
-q). 
 
When we had the mtrace log, there was no report of a double-free according to 
mtrace(1).

Comment 35 Tim Waugh 2002-08-27 14:13:29 UTC
Created attachment 73325 [details]
typescript of gdb session; nautilus has symbols; MALLOC_TRACE set (and mtrace() call added), MALLOC_CHECK_ unset

Comment 36 Alexander Larsson 2002-08-27 14:30:00 UTC
thread 1 is looping in arena_get2(). Looks like this loop:
  /* Check the global, circularly linked list for available arenas. */
 repeat:
  do {
    if(!mutex_trylock(&a->mutex)) {
      THREAD_STAT(++(a->stat_lock_loop));
      tsd_setspecific(arena_key, (Void_t *)a);
      return a;
    }
    a = a->next;
  } while(a != a_tsd);

  /* If not even the list_lock can be obtained, try again.  This can
     happen during `atfork', or for example on systems where thread
     creation makes it temporarily impossible to obtain _any_
     locks. */
  if(mutex_trylock(&list_lock)) {
    a = a_tsd;
    goto repeat;
  }

This seems to indicate that all arenas are locked and so is the list_lock. This
could be right, since one of the other threads is blocked in fork. In fact, it
is in ptmalloc_lock_all(), which is called at fork to grab all the thread
locks. Strange that it blocked, though.

This seems to be *another* deadlock.
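
For anyone following along, here is a minimal sketch of the scan that loop performs, using plain pthread mutexes. The names (struct arena, arena_ring_init, arena_scan_once) are made up for illustration and are not glibc's; the real arena_get2() also consults list_lock and thread-specific data. The point is only that a trylock pass over a ring where every mutex is already held finds nothing, which is the state the quoted loop keeps retrying in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a malloc arena: just a lock and a ring link. */
struct arena {
    pthread_mutex_t mutex;
    struct arena *next;
};

/* Link n arenas into a circular list and initialise their mutexes. */
static void arena_ring_init(struct arena *ring, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        pthread_mutex_init(&ring[i].mutex, NULL);
        ring[i].next = &ring[(i + 1) % n];
    }
}

/*
 * Make one pass over the ring, starting at `start`.  Returns the first
 * arena whose mutex could be taken (the caller then holds it), or NULL
 * if every arena was busy.  The loop quoted above retries at this point
 * instead of giving up, so it spins for as long as all locks stay held.
 */
static struct arena *arena_scan_once(struct arena *start)
{
    struct arena *a = start;

    do {
        if (pthread_mutex_trylock(&a->mutex) == 0)
            return a;
        a = a->next;
    } while (a != start);

    return NULL;
}
```

If another thread holds every arena mutex (as ptmalloc_lock_all() would around fork) and never releases them, arena_scan_once() returns NULL on every pass, which would match a thread spinning in arena_get2().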


Comment 37 Tim Waugh 2002-08-27 15:23:51 UTC
Haven't managed to reproduce the problem with MALLOC_CHECK_=2.  Still trying.

Comment 38 Tim Waugh 2002-08-27 17:42:07 UTC
I tried about 100 times.  Then I got my fiancée to try.  First time, she got a 
core file. :-) 
 
#0  0x42028501 in kill () from /lib/i686/libc.so.6 
#1  0x408edf3d in raise () from /lib/i686/libpthread.so.0 
#2  0x420298dc in abort () from /lib/i686/libc.so.6 
#3  0x42071368 in free_check () from /lib/i686/libc.so.6 
#4  0x420705c5 in free () from /lib/i686/libc.so.6 
#5  0x0806841e in nautilus_navigation_bar_unimplemented_get_location () 
#6  0x420155c4 in __libc_start_main () from /lib/i686/libc.so.6 
 
In other words, this is the second call of free() for the same object.

Comment 39 Tim Waugh 2002-08-27 18:20:23 UTC
This time with symbols:  
  
#0  0x42028501 in kill () from /lib/i686/libc.so.6  
#1  0x408edf3d in raise () from /lib/i686/libpthread.so.0  
#2  0x420298dc in abort () from /lib/i686/libc.so.6  
#3  0x42071368 in free_check () from /lib/i686/libc.so.6  
#4  0x420705c5 in free () from /lib/i686/libc.so.6  
#5  0x0806841e in nautilus_navigation_bar_unimplemented_get_location ()  
    at nautilus-navigation-bar.c:48  
#6  0x420155c4 in __libc_start_main () from /lib/i686/libc.so.6  
  
not that it particularly helps.  But it's here:  
  
EEL_IMPLEMENT_MUST_OVERRIDE_SIGNAL (nautilus_navigation_bar, get_location)  
  
Core file (from nautilus-2.0.5-3) at ~twaugh/nautilus-core.

Comment 40 Havoc Pennington 2002-08-27 18:25:00 UTC
This isn't looking promising:

#define EEL_IMPLEMENT_MUST_OVERRIDE_SIGNAL(prefix, signal)                    \
                                                                              \
static void                                                                   \
prefix##_unimplemented_##signal (void)                                        \
{                                                                             \
	g_warning ("failed to override signal " #prefix "->" #signal);        \
}

No call to free in there...

Comment 41 Tim Waugh 2002-08-27 18:27:27 UTC
Groan.  I'd built an unstripped nautilus binary and just pointed gdb at that.  
Guess that doesn't work. 
 
I'll build an unstripped package and install that.  Hopefully I'll be able to 
get a core from it again.

Comment 42 Tim Waugh 2002-08-27 18:31:46 UTC
... however, even when I pointed gdb at the actual executable that dumped 
core, it said that it was nautilus_navigation_bar_unimplemented_get_location 
at fault. 
 
Does g_warning do _no_ memory allocation?

Comment 43 Havoc Pennington 2002-08-27 18:45:06 UTC
g_warning does do memory allocation, however there should be a few stack frames 
in there, probably g_log, g_logv, g_free at least.


Comment 44 Tim Waugh 2002-08-28 10:53:29 UTC
I compiled nautilus with a call to mcheck_pedantic(abort) as the first thing 
in main().  No core dump after two runs.

Comment 45 Tim Waugh 2002-08-28 11:02:36 UTC
crunge.com: Do you know a way to reproduce this bug at will?

Comment 46 Alexander Larsson 2002-08-28 11:21:12 UTC
bug 70873 (smb hang) has the same backtrace (malloc_consolidate).


Comment 47 Tim Waugh 2002-08-28 11:58:00 UTC
Created attachment 73507 [details]
This gdb session is from a nautilus package compiled with a call to mcheck_pedantic() in main().  It's the nautilus process started on login.

Comment 48 Tim Waugh 2002-08-28 12:06:08 UTC
Created attachment 73508 [details]
And another, same conditions.

Comment 49 Owen Taylor 2002-08-30 22:50:30 UTC
I think I might have a handle on this ... it looks like two threads
in gnome-vfs are processing a single list without locking and both
trying to free the same element.

Need to investigate how it *should* work further.
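
To make the suspected pattern concrete, here is a rough sketch of how two threads can share one list safely: each element is detached under a mutex, so only one thread can ever own it and free it. This is an assumption-laden illustration, not the gnome-vfs code or the actual fix; struct job, struct job_list and the job_list_* functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical list element; not a gnome-vfs type. */
struct job {
    struct job *next;
};

struct job_list {
    pthread_mutex_t lock;
    struct job *head;
    int popped;             /* number of elements detached so far */
};

/* Build a list of n heap-allocated jobs. */
static void job_list_init(struct job_list *l, int n)
{
    assert(n >= 0);
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
    l->popped = 0;
    for (int i = 0; i < n; i++) {
        struct job *j = malloc(sizeof *j);
        assert(j != NULL);
        j->next = l->head;
        l->head = j;
    }
}

/* Detach the head under the lock, so only one thread can own it. */
static struct job *job_list_pop(struct job_list *l)
{
    pthread_mutex_lock(&l->lock);
    struct job *j = l->head;
    if (j != NULL) {
        l->head = j->next;
        l->popped++;
    }
    pthread_mutex_unlock(&l->lock);
    return j;
}

/* Thread body: drain the list, freeing each job this thread detached. */
static void *job_list_drain(void *arg)
{
    struct job_list *l = arg;
    struct job *j;

    while ((j = job_list_pop(l)) != NULL)
        free(j);
    return NULL;
}
```

Without the lock in job_list_pop(), both threads could read the same head pointer and both call free() on it, which is the kind of double free that MALLOC_CHECK_=2 aborts on.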


Comment 50 Owen Taylor 2002-08-31 00:00:32 UTC
gnome-vfs2-2.0.2-5 seems to fix the problem I was seeing, and looks
likely to fix the above, though there certainly could be something
else going on as well.

I'm going to mark it MODIFIED; Testing would be very much appreciated.

 (I've put the package at 

   http://people.redhat.com/otaylor/tmp/gnome-vfs2-test

  until rawhide propagates)



Comment 51 tom georgoulias 2002-09-01 18:39:46 UTC
I think the crashing/unresponsive behavior I'm seeing is related.  The easiest
way for me to recreate it is to have Mozilla running, then try to open a
nautilus window by double clicking on the home icon on my desktop.  The icon
will stay highlighted, nautilus never opens a file manager window, and a minute
or so later the icons disappear.  Sometimes they reappear much later, as if
nautilus respawned after a time out period, but the new icons don't respond to
double clicking.  This is the process that normally hangs around, although I
have seen the throbber running as well: 

[tomg@gemini tomg]$ ps -ef | grep nautilus 
tomg       875     1  0 13:14 ?        00:00:00 nautilus --sm-config-prefix /nau 

FWIW, when nautilus is on the fritz like this I cannot start gedit from the
applications menu either.  

I attached the running process to gdb and tried to get a backtrace, but I'm not
sure that the trace has anything that hasn't been seen before.  I've attached it
anyway, just in case. 

I installed the latest nautilus from rawhide and the gnome-vfs2 linked above,
but neither solved the problem. 

[tomg@gemini tomg]$ rpm -q gnome-vfs2 
gnome-vfs2-2.0.2-5 
[tomg@gemini tomg]$ rpm -q gnome-vfs2-devel 
gnome-vfs2-devel-2.0.2-5 
[tomg@gemini tomg]$ rpm -q nautilus 
nautilus-2.0.6-1 
[tomg@gemini tmp]$ rpm -q gnome-session
gnome-session-2.0.5-4

Comment 52 tom georgoulias 2002-09-01 18:40:34 UTC
Created attachment 74430 [details]
backtrace from tomg

Comment 53 Wolfram Gloger 2002-09-02 14:02:04 UTC
Hmm, if it really was two threads freeing the same chunk (of a list), I think
MALLOC_CHECK_=2 should have produced a core every time. However, note that
MALLOC_CHECK_ does affect the timing, because allocation is effectively
single-threaded when using it...

If you continue to reproduce the hang in malloc_consolidate, please let me know
and I'll try to build the nautilus beta (I got stuck at building glib2.0 with
debian patches already :-( ).


Comment 54 Jay Turner 2002-09-04 12:15:25 UTC
OK, looks like this is still ongoing, so I'll just leave it sitting in Modified
for a while.

Comment 55 Chris Runge 2002-11-09 23:35:46 UTC
I haven't seen this problem on Psyche gold.

Comment 56 Alexander Larsson 2002-11-11 11:10:45 UTC
The gnome-vfs change seems to have fixed it indeed.