Bug 165879 - grab_broken events are delivered to destroyed windows, causing crash
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: gtk2
Version: rawhide
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Robin Green
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-08-13 12:41 UTC by Robin Green
Modified: 2007-11-30 22:11 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-08-19 18:13:16 UTC
Type: ---
Embargoed:


Links
System          ID      Private  Priority  Status  Summary  Last Updated
GNOME Bugzilla  313953  0        None      None    None     Never

Description Robin Green 2005-08-13 12:41:06 UTC
Description of problem:
When you start eclipse for the first time in a new workspace, it displays the
Welcome screen. If you then click on the button which takes you to the
Workbench, in the top right corner, eclipse then crashes. This has been
occurring for quite some time now.

Version-Release number of selected component (if applicable):
eclipse-platform-3.1.0_fc-6

How reproducible:
Always

Steps to Reproduce:
1. mkdir empty-workspace
2. eclipse -data empty-workspace -vmargs -Dgnu.gcj.precompiled.db.path=/usr/lib/gcj-4.0.1/classmap.db
(The -vmargs option is necessary to work around a bug in recent libgcj packages.)
3. Wait for eclipse to start
4. Click on the arrow in the top right corner of the Welcome panel to enter the
workbench
  
Actual results:
Workbench starts to display, but then eclipse crashes.

Expected results:
No crash.

Additional info:

- When you try re-running eclipse in the same workspace, the welcome panel is
not shown again, so the problem does not re-occur for that workspace.

- Setting MOZ_DISABLE_PANGO=1 makes no difference.

- Here is the backtrace from gdb:

#0  0x0371d2cb in IA__gtk_widget_get_toplevel (widget=0xae05c00) at gtkwidget.c:6151
#1  0x03622b78 in gtk_main_get_window_group (widget=Variable "widget" is not
available.
) at gtkmain.c:1462
#2  0x03622caa in IA__gtk_main_do_event (event=0xa9e8718) at gtkmain.c:1286
#3  0x03bccb7e in Java_org_eclipse_swt_internal_gtk_OS__1gtk_1main_1do_1event ()
   from
/usr/share/eclipse/configuration/org.eclipse.osgi/bundles/75/1/.cp/libswt-pi-gtk-3138.so
#4  0x04c7d7ca in org.eclipse.swt.internal.gtk.OS._gtk_main_do_event(int) () at
OS.java:4771
#5  0x04c7d852 in org.eclipse.swt.internal.gtk.OS.gtk_main_do_event(int)
(event=171104160) at OS.java:4777
#6  0x04d17e17 in org.eclipse.swt.widgets.Display.eventProc(int, int)
(this=0x5b04cc0, event=178161432, data=0) at Display.java:1067
#7  0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#8  0x02a9ca90 in ffi_call (cif=0xbfa4f210, fn=0x4d17c84
<org.eclipse.swt.widgets.Display.eventProc(int, int)>, rvalue=0xbfa4f228,
    avalue=0xbfa4f150) at ../../../libffi/src/x86/ffi.c:221
#9  0x02736f71 in _Jv_CallAnyMethodA (obj=0x5b04cc0, return_type=0x3046b40,
meth=0x4f109fc, is_constructor=0 '\0',
    is_virtual_call=1 '\001', parameter_types=0x1067e90, args=0xbfa4f280,
result=0xbfa4f2c0, is_jni_call=1 '\001', iface=0x0)
    at ../../../libjava/java/lang/reflect/natMethod.cc:516
#10 0x026fb85e in _Jv_JNI_CallAnyMethodV<jint, normal> (env=0xa37bc78,
obj=0x5b04cc0, klass=Variable "klass" is not available.
) at ../../../libjava/jni.cc:813
#11 0x07f5ae6f in callback () from
/usr/share/eclipse/configuration/org.eclipse.osgi/bundles/75/1/.cp/libswt-gtk-3138.so
#12 0x07f48bae in fn2_2 () from
/usr/share/eclipse/configuration/org.eclipse.osgi/bundles/75/1/.cp/libswt-gtk-3138.so
#13 0x01e7595f in gdk_event_dispatch (source=0xa32d7a0, callback=0,
user_data=0x0) at gdkevents-x11.c:2291
#14 0x01f7a07e in IA__g_main_context_dispatch (context=0xa0f4770) at gmain.c:1934
#15 0x01f7d096 in g_main_context_iterate (context=0xa0f4770, block=0,
dispatch=1, self=0xa4be3e0) at gmain.c:2565
#16 0x01f7d578 in IA__g_main_context_iteration (context=0xa0f4770, may_block=0)
at gmain.c:2624
#17 0x03bc5752 in
Java_org_eclipse_swt_internal_gtk_OS__1g_1main_1context_1iteration ()
   from
/usr/share/eclipse/configuration/org.eclipse.osgi/bundles/75/1/.cp/libswt-pi-gtk-3138.so
#18 0x04c56b48 in org.eclipse.swt.internal.gtk.OS._g_main_context_iteration(int,
boolean) () at OS.java:1152
#19 0x04c56be8 in org.eclipse.swt.internal.gtk.OS.g_main_context_iteration(int,
boolean) (context=171104160, may_block=false)
    at OS.java:1158
#20 0x04d1b4ae in org.eclipse.swt.widgets.Display.readAndDispatch()
(this=0x5b04cc0) at Display.java:2570
#21 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#22 0x02a9cda2 in ffi_raw_call (cif=0xb26fd370, fn=0x4d1b464
<org.eclipse.swt.widgets.Display.readAndDispatch()>, rvalue=0xbfa4f7ec,
    fake_avalue=0xbfa4f580) at ../../../libffi/src/x86/ffi.c:537
#23 0x0270bbb4 in _Jv_InterpMethod::run (this=0x52c29c0, retp=0xbfa4f9f4,
args=0xbfa4fa14) at ../../../libjava/interpret.cc:1206
#24 0x0270f3e9 in _Jv_InterpMethod::run_normal (ret=0xa32d7a0, args=0xa32d7a0,
__this=0xa32d7a0) at ../../../libjava/interpret.cc:278
#25 0x02a9cc39 in ffi_closure_raw_SYSV (closure=Variable "closure" is not available.
) at ../../../libffi/src/x86/ffi.c:416
#26 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#27 0x02a9cda2 in ffi_raw_call (cif=0xb2728ef0, fn=0x5243b80, rvalue=0xbfa4fd30,
fake_avalue=0xbfa4fab0)
    at ../../../libffi/src/x86/ffi.c:537
#28 0x0270bbb4 in _Jv_InterpMethod::run (this=0x58663a8, retp=0xbfa4ff38,
args=0xbfa4ff58) at ../../../libjava/interpret.cc:1206
#29 0x0270f3e9 in _Jv_InterpMethod::run_normal (ret=0xa32d7a0, args=0xa32d7a0,
__this=0xa32d7a0) at ../../../libjava/interpret.cc:278
#30 0x02a9cc39 in ffi_closure_raw_SYSV (closure=Variable "closure" is not available.
) at ../../../libffi/src/x86/ffi.c:416
#31 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#32 0x02a9cda2 in ffi_raw_call (cif=0x5ff5280, fn=0x64fea8, rvalue=0xbfa5024c,
fake_avalue=0xbfa4ffe0)
    at ../../../libffi/src/x86/ffi.c:537
#33 0x0270bbb4 in _Jv_InterpMethod::run (this=0x5377240, retp=0xbfa50454,
args=0xbfa50474) at ../../../libjava/interpret.cc:1206
#34 0x0270f352 in _Jv_InterpMethod::run_class (ret=0xa32d7a0, args=0xa32d7a0,
__this=0x5377240) at ../../../libjava/interpret.cc:303
#35 0x02a9cc39 in ffi_closure_raw_SYSV (closure=Variable "closure" is not available.
) at ../../../libffi/src/x86/ffi.c:416
#36 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#37 0x02a9cda2 in ffi_raw_call (cif=0x5ea5fa0, fn=0x440b60, rvalue=0xbfa5076c,
fake_avalue=0xbfa50500)
    at ../../../libffi/src/x86/ffi.c:537
#38 0x0270bbb4 in _Jv_InterpMethod::run (this=0x1467ed8, retp=0xbfa50974,
args=0xbfa50994) at ../../../libjava/interpret.cc:1206
#39 0x0270f352 in _Jv_InterpMethod::run_class (ret=0xa32d7a0, args=0xa32d7a0,
__this=0x1467ed8) at ../../../libjava/interpret.cc:303
#40 0x02a9cc39 in ffi_closure_raw_SYSV (closure=Variable "closure" is not available.
) at ../../../libffi/src/x86/ffi.c:416
#41 0x03275508 in
org.eclipse.ui.internal.ide.IDEApplication.run(java.lang.Object) (this=0x37e4d8,
args=0x113bda8)
    at IDEApplication.java:103
#42 0x00d777a8 in
org.eclipse.core.internal.runtime.PlatformActivator$1.run(java.lang.Object)
(this=0x6d33e0, arg=0x113bda8)
    at PlatformActivator.java:226
#43 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#44 0x02a9cda2 in ffi_raw_call (cif=0x5325e50, fn=0xd7765e
<org.eclipse.core.internal.runtime.PlatformActivator$1.run(java.lang.Object)>,
    rvalue=0xbfa50d3c, fake_avalue=0xbfa50ac0) at ../../../libffi/src/x86/ffi.c:537
#45 0x0270bbb4 in _Jv_InterpMethod::run (this=0x52900, retp=0xbfa50f44,
args=0xbfa50f64) at ../../../libjava/interpret.cc:1206
#46 0x0270f352 in _Jv_InterpMethod::run_class (ret=0xa32d7a0, args=0xa32d7a0,
__this=0x52900) at ../../../libjava/interpret.cc:303
#47 0x02a9cc39 in ffi_closure_raw_SYSV (closure=Variable "closure" is not available.
) at ../../../libffi/src/x86/ffi.c:416
#48 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#49 0x02a9cda2 in ffi_raw_call (cif=0x5325f70, fn=0xd2070, rvalue=0xbfa51288,
fake_avalue=0xbfa51000) at ../../../libffi/src/x86/ffi.c:537
#50 0x0270bbb4 in _Jv_InterpMethod::run (this=0xaa700, retp=0xbfa51490,
args=0xbfa514b0) at ../../../libjava/interpret.cc:1206
#51 0x0270f352 in _Jv_InterpMethod::run_class (ret=0xa32d7a0, args=0xa32d7a0,
__this=0xaa700) at ../../../libjava/interpret.cc:303
#52 0x02a9cc39 in ffi_closure_raw_SYSV (closure=Variable "closure" is not available.
) at ../../../libffi/src/x86/ffi.c:416
#53 0x02a9cde3 in ffi_call_SYSV () at ../../../libffi/src/x86/sysv.S:60
#54 0x02a9ca90 in ffi_call (cif=0xbfa515c8, fn=0xd20e0, rvalue=0xbfa515e0,
avalue=0xbfa51520) at ../../../libffi/src/x86/ffi.c:221
#55 0x02736f71 in _Jv_CallAnyMethodA (obj=0xf7e40, return_type=0x2e0e940,
meth=0x3a603c, is_constructor=0 '\0', is_virtual_call=0 '\0',
    parameter_types=0x2cbd0, args=0xbfa51640, result=0xbfa516b0, is_jni_call=0
'\0', iface=0x0)
    at ../../../libjava/java/lang/reflect/natMethod.cc:516
#56 0x02737549 in _Jv_CallAnyMethodA (obj=0xf7e40, return_type=0x2e0e940,
meth=0x3a603c, is_constructor=0 '\0', parameter_types=0x2cbd0,
    args=0x2cbe0, iface=0x0) at ../../../libjava/java/lang/reflect/natMethod.cc:651
#57 0x027379d9 in java::lang::reflect::Method::invoke (this=0x53dc0,
obj=0xf7e40, args=0xa32d7a0)
    at ../../../libjava/java/lang/reflect/natMethod.cc:193
#58 0x003696db in
org.eclipse.core.launcher.Main.invokeFramework(java.lang.String[],
java.net.URL[]) (this=0x3a5f50,
    passThruArgs=0xa7f00, bootPath=0xa32d7a0) at
org/eclipse/core/launcher/Main.java:334
#59 0x0036914f in org.eclipse.core.launcher.Main.basicRun(java.lang.String[])
(this=0x3a5f50, args=0xbcf00)
    at org/eclipse/core/launcher/Main.java:279
#60 0x0036ca9b in org.eclipse.core.launcher.Main.run(java.lang.String[])
(this=0x3a5f50, args=0xa32d7a0)
    at org/eclipse/core/launcher/Main.java:973
#61 0x0036c9e8 in org.eclipse.core.launcher.Main.main(java.lang.String[])
(args=0xa32d7a0) at org/eclipse/core/launcher/Main.java:948
#62 0x02723e1d in gnu::java::lang::MainThread::call_main (this=0x77dc8) at
../../../libjava/gnu/java/lang/natMainThread.cc:47
#63 0x027b8932 in gnu.java.lang.MainThread.run() (this=0x77dc8) at
../../../libjava/gnu/java/lang/MainThread.java:105
#64 0x0273381d in _Jv_ThreadRun (thread=0x77dc8) at
../../../libjava/java/lang/natThread.cc:289
#65 0x026f7345 in _Jv_RunMain (vm_args=0xa32d7a0, klass=0x0, name=0xbfa53254
"/usr/share/eclipse/startup.jar", argc=23, argv=0xbfa51b60,
    is_jar=true) at ../../../libjava/prims.cc:1353
#66 0x00310a2d in main (argc=26, argv=0xbfa51b54) at ../../../libjava/gij.cc:336
#67 0x001a249f in __libc_start_main (main=0x8048414 <main>, argc=26,
ubp_av=0xbfa51b54, init=0x80484f8 <__libc_csu_init>,
    fini=0x8048554 <__libc_csu_fini>, rtld_fini=0x17dc9d <_dl_fini>,
stack_end=0xbfa51b4c) at ../sysdeps/generic/libc-start.c:231
#68 0x08048475 in _start ()

Comment 1 Robin Green 2005-08-14 01:45:06 UTC
It might be necessary to maximize the window before pressing the button (I
normally do that reflexively, so I didn't think to include it in the instructions).

After debugging with gdb and running eclipse under valgrind, I found this
culprit in the valgrind output:

==17409==
==17409== Invalid read of size 4
==17409==    at 0x3ACF1A12: gdk_window_get_user_data (gdkwindow.c:517)
==17409==    by 0x3AA7A90E: gtk_get_event_widget (gtkmain.c:2045)
==17409==    by 0x3AA7AC1E: gtk_main_do_event (gtkmain.c:1260)
==17409==    by 0x3A935B7D:
Java_org_eclipse_swt_internal_gtk_OS__1gtk_1main_1do_1event (in
/usr/lib/eclipse/libswt-pi-gtk-3138.so)
==17409==    by 0x3975F7C9:
org::eclipse::swt::internal::gtk::OS::_gtk_main_do_event(int) (OS.java:4771)
==17409==    by 0x3975F851:
org::eclipse::swt::internal::gtk::OS::gtk_main_do_event(int) (OS.java:4777)
==17409==    by 0x397F9E16: org::eclipse::swt::widgets::Display::eventProc(int,
int) (Display.java:1067)
==17409==    by 0x34ABE1EA: ffi_call_SYSV (sysv.S:60)
==17409==    by 0x34ABDF67: ffi_call (ffi.c:221)
==17409==    by 0x347583A0:
_Z18_Jv_CallAnyMethodAPN4java4lang6ObjectEPNS0_5ClassEP10_Jv_MethodbbP6JArrayIS4_EP6jvalueSB_bS4_
(natMethod.cc:516)
==17409==    by 0x3471CC7D: ??? (jni.cc:813)
==17409==    by 0x3AD9AE6E: callback (in /usr/lib/eclipse/libswt-gtk-3138.so)
==17409==  Address 0x36B08968 is 0 bytes inside a block of size 72 free'd
==17409==    at 0x3414E743: free (vg_replace_malloc.c:152)
==17409==    by 0x195EC23: g_free (in /usr/lib/libglib-2.0.so.0.701.5)
==17409==    by 0x1915BD4: g_type_free_instance (in
/usr/lib/libgobject-2.0.so.0.701.5)
==17409==    by 0x18FA575: g_object_unref (in /usr/lib/libgobject-2.0.so.0.701.5)
==17409==    by 0x18FB051: g_object_run_dispose (in
/usr/lib/libgobject-2.0.so.0.701.5)
==17409==    by 0x3AA9EEEE: gtk_object_destroy (gtkobject.c:363)
==17409==    by 0x3AB755F9: gtk_widget_destroy (gtkwidget.c:1995)
==17409==    by 0x3F5489AB: (within
/usr/lib/mozilla-1.7.11/components/libwidget_gtk2.so)
==17409==    by 0x3F548C62: (within
/usr/lib/mozilla-1.7.11/components/libwidget_gtk2.so)
==17409==    by 0x3F548D22: (within
/usr/lib/mozilla-1.7.11/components/libwidget_gtk2.so)
==17409==    by 0x3F558696: (within
/usr/lib/mozilla-1.7.11/components/libwidget_gtk2.so)
==17409==    by 0x3F54402D: (within
/usr/lib/mozilla-1.7.11/components/libwidget_gtk2.so)

In other words, Mozilla is disposing of a widget, but a stale reference is being
held to that widget in the user_data field of a GdkWindowObject. This stale
reference then points to a different object, so garbage is read and a crash ensues.

(Although Mozilla also uses GTK2, note that from the point of view of SWT, it's
effectively a foreign, embedded application. So that probably explains why the
user_data field is not being cleared.)
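The failure mode described above can be sketched in plain C. This is only an illustration under stated assumptions: DemoWidget, DemoWindow and the demo_* functions are hypothetical stand-ins, not the real GtkWidget, GdkWindowObject or any GDK/Mozilla API. It shows one way an owner could clear the back-pointers held by its windows before it is freed, so that a later lookup gets NULL rather than freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GtkWidget and GdkWindowObject; not the real types. */
typedef struct DemoWidget DemoWidget;

typedef struct {
    void *user_data;   /* back-pointer to the owning widget */
    int   destroyed;   /* set once the owner has gone away */
} DemoWindow;

struct DemoWidget {
    DemoWindow *windows[4];  /* windows whose user_data points back at this widget */
    size_t      n_windows;
};

/* Record the back-pointer and remember the window so it can be cleared later. */
static void demo_window_set_user_data(DemoWidget *widget, DemoWindow *window)
{
    assert(widget != NULL && window != NULL);
    assert(widget->n_windows < 4);
    window->user_data = widget;
    widget->windows[widget->n_windows++] = window;
}

/* Destroy the widget, clearing every back-pointer first so none is left dangling. */
static void demo_widget_destroy(DemoWidget *widget)
{
    for (size_t i = 0; i < widget->n_windows; i++) {
        widget->windows[i]->user_data = NULL;
        widget->windows[i]->destroyed = 1;
    }
    free(widget);
}

/* Event-side lookup: yields NULL for a destroyed window instead of a freed widget. */
static DemoWidget *demo_get_event_widget(const DemoWindow *window)
{
    return window->destroyed ? NULL : window->user_data;
}
```

If demo_widget_destroy skipped the loop, user_data would keep pointing at the freed block, which is the kind of read valgrind flags above.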

Unfortunately, mozilla 1.7.11 is not supported by eclipse yet - but it's what's
shipped in rawhide, so it's what rawhide eclipse uses. I will attempt to
reproduce this bug with a supported mozilla and report back.

Comment 2 Robin Green 2005-08-14 12:29:02 UTC
This bug _is_ reproducible with MOZILLA_FIVE_HOME set to Mozilla 1.7.3. But it
is _not_ reproducible on the Sun JDK 1.4.2_08 for me.

That's odd, because this bug appears to be related to native code. It could be a
timing-related issue.

Comment 3 Robin Green 2005-08-14 14:41:06 UTC
I forgot to mention. The valgrind output above is NOT with vanilla gtk. I had a
couple of debug statements inserted at the time, to debug this bug, as follows.
So user_data is being read slightly earlier than it would otherwise have been.
However, this doesn't cast doubt on my analysis.

#define RDG_SANITY_CHECK(user_data) \
  if ((user_data) != NULL && !GTK_IS_WIDGET((user_data))) g_assert_not_reached ()

...

void
gdk_window_set_user_data (GdkWindow *window,
			  gpointer   user_data)
{
  g_return_if_fail (window != NULL);
  
  RDG_SANITY_CHECK(((GdkWindowObject*)window)->user_data = user_data);
  
  g_log ("RDG", G_LOG_LEVEL_DEBUG, "%p.user_data=%p", window, user_data);
}

void
gdk_window_get_user_data (GdkWindow *window,
			  gpointer  *data)
{
  g_return_if_fail (window != NULL);
  
  RDG_SANITY_CHECK (*data = ((GdkWindowObject*)window)->user_data);
  
  g_log ("RDG", G_LOG_LEVEL_DEBUG, "%p.user_data=%p", window, *data);
}


Comment 4 Robin Green 2005-08-15 00:36:21 UTC
I found a similar - but not the same - bug here:

https://bugzilla.mozilla.org/show_bug.cgi?id=144212

It seems that Mozilla should be _clearing_ the GTK_NO_WINDOW flag of a widget
that it gives you. If you use GtkMozEmbed, then this is done in
gtk_moz_embed_init. But SWT does _not_ use GtkMozEmbed. It uses nsWebBrowser
instead (via a public interface), which uses nsWindow, which uses mozcontainer.c
as its toplevel container.

mContainer (a mozcontainer) is the only thing that gets passed to
gtk_widget_destroy in any of those three files, in the case of SWT. So I suspect
that that's the culprit, and that if I clear the GTK_NO_WINDOW flag in
moz_container_init, that will prevent this crash.

Testing a patch.

Comment 5 Robin Green 2005-08-15 23:54:23 UTC
When I just attempt to clear GTK_NO_WINDOW on the mContainer widget,
gdk_window_destroy does get called for that widget's window. However,
gdk_window_impl_x11_finalize does not get called for that window in time to
prevent the crash, which it should be AFAICS. Investigating.

Comment 6 Robin Green 2005-08-16 15:01:24 UTC
(In reply to comment #5)
> When I just attempt to clear GTK_NO_WINDOW on the mContainer widget,
> gdk_window_destroy does get called for that widget's window. However,
> gdk_window_impl_x11_finalize does not get called for that window in time to
> prevent the crash, which it should be AFAICS.

Actually, that should not matter, because gdk_event_translate checks for a
destroyed GDK window in an event and returns NULL if it finds one (unless the
event is a DestroyNotify event - which the event that causes the crash isn't).

There are exactly six more GDK windows which reference the mContainer widget in
their user_data, and it is the sixth one which is now causing the crash (and might
have been causing the crash all along, I'm not sure). I think it is crashing
because that sixth GDK window is not destroyed as it should be. The relevant
statement that sets up the reference is:

#1  0x04993ca7 in moz_drawingarea_create_windows (drawingarea=0x9cbced0,
parent=0xa7a0bd8, widget=0xa36b8a0) at mozdrawingarea.c:155
155         gdk_window_set_user_data(drawingarea->inner_window, widget);

Testing another patch.

Comment 7 Robin Green 2005-08-16 16:41:35 UTC
(In reply to comment #6)
> Actually, that should not matter, because gdk_event_translate checks for a
> destroyed GDK window in an event and returns NULL if it finds one

Except that this event is synthetic, so gdk_event_translate is _not_ called for
it. I think I've found the root cause. The new grab_broken events are not
created through the normal translation path, so the usual !WINDOW_DESTROYED
check in gdk_event_translate is bypassed. !WINDOW_DESTROYED is still checked,
but only when the event is created, not when it is later picked off the event
queue. That would explain why the crash didn't occur on a faster VM and
sometimes didn't occur on libgcj: it is a timing issue.
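The timing window can be illustrated with a small self-contained C sketch. All names here (DemoWindow, DemoEvent, DemoQueue, demo_queue_put, demo_queue_get_checked) are hypothetical and are not GDK code, and this is not necessarily how the problem was fixed upstream. The sketch checks the destroyed flag when an event is queued and again when it is taken off the queue. It assumes the window structure itself stays allocated while flagged as destroyed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real GdkWindow or GdkEvent types. */
typedef struct { int destroyed; } DemoWindow;
typedef struct { DemoWindow *window; } DemoEvent;

#define DEMO_QUEUE_MAX 8

typedef struct {
    DemoEvent items[DEMO_QUEUE_MAX];
    size_t    head, tail;
} DemoQueue;

/* Creation-time check only: refuses events for windows that are already
   destroyed, but cannot see a destroy that happens while the event waits. */
static int demo_queue_put(DemoQueue *q, DemoWindow *window)
{
    assert(q != NULL && window != NULL);
    if (window->destroyed || q->tail == DEMO_QUEUE_MAX)
        return 0;
    q->items[q->tail++].window = window;
    return 1;
}

/* Delivery-time check: skips events whose window was destroyed while queued.
   Returns the next live window, or NULL when the queue is drained. */
static DemoWindow *demo_queue_get_checked(DemoQueue *q)
{
    while (q->head < q->tail) {
        DemoWindow *window = q->items[q->head++].window;
        if (!window->destroyed)
            return window;
    }
    return NULL;
}
```

With only the check in demo_queue_put, a window destroyed between queueing and delivery would still have its event dispatched, which matches the timing-dependent behaviour described above.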

I will test with gtk2 2.8.0 as soon as I can, and if that upgrade doesn't fix
it, I will try patching it and then report it upstream.

Comment 8 Robin Green 2005-08-19 13:08:22 UTC
This exact same bug is now preventing me from starting firefox. (I use the  
SessionSaver extension, which triggered this bug after restarting firefox  
after a crash.) So it's definitely not eclipse-specific. Moving to gtk2  
component and assigning to myself to show that I'm working on it.  

Comment 9 Robin Green 2005-08-19 14:53:27 UTC
Filed upstream @ bugzilla.gnome.org, #313953   

Comment 10 Robin Green 2005-08-19 18:13:16 UTC
Fixed upstream.

