Bug 163692

Summary: Firefox crashes when clicking inside dialog box (and more)
Product: [Fedora] Fedora
Component: firefox
Version: rawhide
Hardware: i386
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: medium
Priority: medium
Reporter: Mickey Stein <yekkim>
Assignee: Christopher Aillon <caillon>
CC: wtogami
URL: any - not important
Doc Type: Bug Fix
Last Closed: 2005-08-14 01:49:37 UTC
Attachments:
Description | Flags
Results of a $pmap -x pid{firefox} | none
A strace log of firefox abnormally exiting after an add bookmark click. | none
Strace of the actual /usr/lib/firefox-1.1/firefox-bin rather than the shellscript | none
Patch for similar stack trace problem from gnome bugzilla. | none

Description Mickey Stein 2005-07-20 13:06:27 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b3) Gecko/20050718 Fedora/1.1-0.2.1.deerpark.alpha2 Firefox/1.0+

Description of problem:
This affects not just firefox but apparently all of the packages originating from mozilla.org (firefox-1.1-0.2.1.deerpark.alpha2, thunderbird-1.0.6-0.1.fc5, and mozilla-1.7.8-6).

Symptom: When inside a dialog box of any kind (filters, help/about:firefox, Preferences, etc) clicking on any button "can" cause an abnormal exit. This includes the top-right corner close-window "X". 

Tried: If I launch firefox/thunderbird from a console prompt, they're apparently more stable. If I strace them, they're even more stable, but when the problem does occur, the output shows nothing that isn't also in a normal run. The program just exits without even a message to strace or the console.

I think these 3 packages became unstable around 1 week ago (rawhide time), and it's clear that they are all related (all use gtk2, cairo, etc.). One of them (Mozilla) had not been updated for quite a while on my system, yet it still picked up the problem when I did last week's update of gtk2 and many other things. I've not come close to pinpointing which component this bug is really about yet.

Just wanted to get this entry started. I'm now fully up to date on rawhide and having the problems regularly. I can't duplicate these in something vaguely similar like Konqueror or in any other app. I use KDE and roll my own kernels each update. Currently on *git4 of 2.6.13-rc3.



Version-Release number of selected component (if applicable):
firefox-1.1-0.2.1.deerpark.alpha2

How reproducible:
Sometimes

Steps to Reproduce:
1. Enter firefox, thunderbird or mozilla
2. Open a dialog box (Help/about firefox will do)
3. Click any button or close it. Do this until a crash occurs. 
  

Actual Results:  Abnormal exit of the browser/email etc. No trace of what caused the problem.

Expected Results:  It should have performed the action requested by whatever button was clicked upon.

Additional info:

There's no error message I've yet found. I listed the versions of each symptomatic package above, but here's some others I think could be in the mix:

gtk2-2.7.3-1
GConf2-2.11.1-1.i386
cairo-0.5.2-1.i386
(pango for some reason)

Themes: Extensions: Profiles:

These tests were made with and without third-party themes/extensions, and with and without new profiles. I'll attach the output of pmap -x {firefox}.

Comment 1 Mickey Stein 2005-07-20 13:12:29 UTC
Created attachment 116974
Results of a $pmap -x pid{firefox}

I mentioned in the reporting note that I'd attach this.

Comment 2 Lars G 2005-07-20 19:33:54 UTC
for me, trying to save anything from the web crashes ff.

Comment 3 Paul F. Johnson 2005-07-21 14:14:26 UTC
clicking on a JS link or even a normal link can kill it for me

Comment 4 Mickey Stein 2005-07-22 20:57:43 UTC
Created attachment 117080
A strace log of firefox abnormally exiting after an add bookmark click.

I'd been trying to make the exits of firefox, tbird & mozilla be caught by
strace but it happens a little less often when you'd like it to.  I got one
just now that is fairly repeatable. The log is attached. It was firefox
(today's or yesterday's rawhide) and I was clicking the cancel button on 'add
bookmark'. It appears to have gone off course ~Line 131 in
/usr/lib/firefox-1.1/run-mozilla.sh and strace shows at that point a series of
"(Gecko:23723): Gtk-CRITICAL" errors. I'm trying to add some bash output lines
to that area and see if MOZ_PROGRAM (what it's looking at) is malformed, null
or some such bad thing.

Comment 5 Mickey Stein 2005-07-22 21:13:39 UTC
Created attachment 117084
Strace of the actual /usr/lib/firefox-1.1/firefox-bin rather than the shellscript

Sorry. By launching firefox ($strace firefox -o NNN), I was just tracing the
shellscript invocation and exit, so I marked the strace before this one
obsolete.

This one is the same abnormal exit type and is a trace of the actual
"/usr/lib/firefox-1.1/firefox-bin". I'm not sure it's more useful, but am going
through it now. All I can see for certain is that it segfaults.
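
The mix-up described here (tracing the launcher script rather than firefox-bin) can usually be avoided by letting strace follow child processes. What follows is only a minimal sketch, assuming the firefox wrapper starts firefox-bin as a child process. trace_cmd is a hypothetical helper that just prints the command line, and firefox.strace is an arbitrary log name; -f and -o are standard strace options.

```shell
#!/bin/sh
# trace_cmd (hypothetical helper): print an strace command line that follows
# child processes (-f) and writes the trace to a log file (-o).
# Usage: trace_cmd LOGFILE PROGRAM [PROGRAM-ARGS...]
trace_cmd() {
    trace_log=$1
    shift
    # strace options must come before the program name; anything after the
    # program name is passed to the program, not to strace.
    printf 'strace -f -o %s' "$trace_log"
    for trace_arg in "$@"; do
        printf ' %s' "$trace_arg"
    done
    printf '\n'
}

# Print the command for tracing the wrapper script and everything it starts.
trace_cmd firefox.strace firefox
```

With -f the log interleaves lines from every process, each prefixed by its pid, so the firefox-bin portion can be picked out by pid afterwards.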

Comment 6 Anders Kaseorg 2005-08-01 19:20:04 UTC
For reference, the easiest way to run firefox in a debugger is to close all
firefox windows, then run
  firefox -g  (to run in gdb)
  firefox -g -d another_debugger  (to run in another_debugger, such as strace)
It is often most useful to run in gdb, type "run" at the "(gdb)" prompt, wait
for a crash, then type "bt" to get a stack trace.

Comment 7 Anders Kaseorg 2005-08-01 19:30:11 UTC
...which results in the following, indicating a GTK problem:

Program received signal SIGSEGV, Segmentation fault.

#0  0x00630a3b in gtk_widget_get_toplevel () from /usr/lib/libgtk-x11-2.0.so.0
#1  0x00536478 in gtk_propagate_event () from /usr/lib/libgtk-x11-2.0.so.0
#2  0x005365aa in gtk_main_do_event () from /usr/lib/libgtk-x11-2.0.so.0
#3  0x0039e395 in gdk_screen_get_setting () from /usr/lib/libgdk-x11-2.0.so.0
#4  0x007c8b7e in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#5  0x007cbb86 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#6  0x007cbe73 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#7  0x0053712f in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#8  0x00e918d6 in ?? () from /usr/lib/firefox-1.1/components/libwidget_gtk2.so
#9  0x00ee1568 in NSGetModule ()
   from /usr/lib/firefox-1.1/components/libtoolkitcomps.so
#10 0x0804f3a0 in ?? ()
#11 0x0804af15 in ?? ()
#12 0x00b5643f in __libc_start_main () from /lib/libc.so.6
#13 0x0804ae71 in ?? ()
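
When comparing several backtraces like this one, it can help to reduce each to the function and library of frame #0. The sketch below is one possible way to do that, not anything used in this report: crash_site is a hypothetical helper, and it assumes gdb's usual "#N ADDRESS in FUNCTION () from LIBRARY" layout with the library on the same line as the frame.

```shell
#!/bin/sh
# crash_site (hypothetical helper): read gdb "bt" output on stdin and print
# the function and shared library of frame #0, the frame where the signal hit.
crash_site() {
    awk '$1 == "#0" {
        fn = ""; lib = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "in")   fn  = $(i + 1)   # word after "in" is the function
            if ($i == "from") lib = $(i + 1)   # word after "from" is the library
        }
        print fn, lib
        exit
    }'
}

# Feed it the first two frames quoted in Comment 7.
crash_site <<'EOF'
#0  0x00630a3b in gtk_widget_get_toplevel () from /usr/lib/libgtk-x11-2.0.so.0
#1  0x00536478 in gtk_propagate_event () from /usr/lib/libgtk-x11-2.0.so.0
EOF
```

Frame #0 only shows where the fault happened. It does not say which component handed GTK the bad pointer.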

Comment 8 Mickey Stein 2005-08-02 01:52:34 UTC
Nice. I'd guess that I started seeing this around 20Jul, which would point at
/d2/src/updates/yum/19Jul/gtk2-2.7.3-1.i386.rpm or at the prior version,
/d2/src/updates/yum/09Jul/gtk2-2.7.2-1.i386.rpm. I'll see how many thousands of
dependencies it'd take to roll that back in.

Meanwhile I just got back, and of course, my hours-long debug make of firefox
blew out on a pointer error :). Brings back memories of when I used to build this
constantly.

Ah well, will try again tomorrow and also look over the gtk possibilities. 

Comment 9 Mickey Stein 2005-08-02 04:54:47 UTC
My firefox build blowout was just due to some changes in the way
--enable-extension works now in the .mozconfig, so it's building ok after a
little cleanup of .mozconfig. 

I rolled gtk2 back to 09Jul/gtk2-2.7.2-1.i386.rpm (and the associated gtk2-devel
version as well). 

Using your trace, I noted the following:

$ rpm -ql gtk2-devel | grep libgtk

/usr/lib/libgtk-x11-2.0.so

---

So, I've kicked off a debug build of firefox with gtk2-devel of 09-Jul (which
is, I hope, prior to the beginning of these problems). If not, I've got one
around that's a little earlier and doesn't cause a lib* name change, which would
screw up a load of dependencies.

I'll try to check this out tomorrow morning again.

BTW: Was that a static or dynamic build? The thing that I wonder about is
whether /usr/lib/libgtk-x11-2.0.so.0 is dragged in at runtime or whether it's
linked into the image statically. My last build was dynamic and I built it with
the latest gtk (oops) and ran it against the old one, and it was just as
unstable, but I think either way, it's linked against the old gtk2-devel at
build time (via gtk2-config, etc).
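
On the static-or-dynamic question: gdb's "from /usr/lib/libgtk-x11-2.0.so.0" annotation in Comment 7 suggests the library was loaded as a shared object at run time. One way to double-check is ldd, which lists the shared libraries an object depends on. The sketch below assumes that libwidget_gtk2.so (the component named in that trace) is the piece that links GTK. links_gtk_dynamically is a hypothetical helper.

```shell
#!/bin/sh
# links_gtk_dynamically (hypothetical helper): read ldd output on stdin and
# succeed if libgtk-x11-2.0 is listed as a shared-library dependency.
links_gtk_dynamically() {
    grep -q 'libgtk-x11-2\.0\.so'
}

# Path taken from the backtrace in Comment 7; adjust for a local build.
component=/usr/lib/firefox-1.1/components/libwidget_gtk2.so
if [ -r "$component" ]; then
    if ldd "$component" | links_gtk_dynamically; then
        echo "libgtk is resolved at run time (dynamic)"
    else
        echo "libgtk is not listed by ldd"
    fi
else
    echo "skipping: $component not found"
fi
```

If the library is listed, whichever gtk2 package is installed at run time is the one that gets used, regardless of the gtk2-devel version present at build time.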

For tracking, it'd be useful to know when the first version of deerpark arrived
in rawhide. I only keep a few versions back, so I can't recall but
firefox-1.1-0.2.3.deerpark arrived on 21Jul. The first I noticed this was within
a day of the first version of deerpark on rawhide. 



Comment 10 Brian Gerst 2005-08-02 12:18:24 UTC
It crashes for me a lot when attempting to allow or deny cookies.

Comment 11 Mickey Stein 2005-08-02 12:36:03 UTC
Allowing or Denying cookies (or anything else) is the same basic action
(clicking buttons), so it seems expected. Do you keep a history of .rpms
replaced by yum? If so, can you recall when it started and what packages were
installed at that time? 
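
For anyone without a saved history of replaced .rpms, rpm itself can at least show when the currently installed versions arrived. This is only a sketch: recent_suspects is a hypothetical helper, and the package list (gtk2, cairo, pango, glib2) is an assumption based on the suspects named in this report. It does not recover versions that have already been replaced.

```shell
#!/bin/sh
# recent_suspects (hypothetical helper): read "rpm -qa --last" output on stdin
# and keep only the packages suspected in this report.
recent_suspects() {
    grep -E '^(gtk2|cairo|pango|glib2)(-devel)?-[0-9]'
}

# rpm -qa --last lists installed packages, newest install first.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa --last | recent_suspects || echo "no matching packages installed"
else
    echo "skipping: rpm not available"
fi
```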

-- I tried a ./firefox -g debug session with:

a) no pango or cairo
b) gtk2 + devel (09-Jul) linked into the build.
c) unfortunately I only could yank the current cvs for the rest.
d) results follow, identical (basically) to Anders' #7 post.

---- debug session result ---

(Gecko:3634): Gtk-CRITICAL **: gtk_widget_event: assertion `GTK_IS_WIDGET (widget)' failed
--DOMWINDOW == 20
--WEBSHELL == 20
--WEBSHELL == 19
++WEBSHELL == 20
++DOMWINDOW == 21
++WEBSHELL == 21
++DOMWINDOW == 22
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: nsTimeout::Release() proceeding without context., file nsGlobalWindow.cpp, line 5480
--WEBSHELL == 20

(Gecko:3634): Gtk-CRITICAL **: gtk_widget_get_toplevel: assertion `GTK_IS_WIDGET (widget)' failed

(Gecko:3634): Gtk-CRITICAL **: gtk_widget_event: assertion `GTK_IS_WIDGET (widget)' failed
--DOMWINDOW == 21
--DOMWINDOW == 20
--WEBSHELL == 19
++WEBSHELL == 20
++DOMWINDOW == 21
++WEBSHELL == 21
++DOMWINDOW == 22
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: Deleting out of flow without tearing down placeholder relationship, file nsFrame.cpp, line 641
WARNING: nsTimeout::Release() proceeding without context., file nsGlobalWindow.cpp, line 5480
--WEBSHELL == 20

Program ./firefox-bin (pid = 3634) received signal 11.
Stack:
UNKNOWN [./firefox-bin +0x0001C180]
__kernel_sigreturn+0x00000000 [ +0x00000420]
gtk_widget_get_toplevel+0x00000038 [/usr/lib/libgtk-x11-2.0.so.0 +0x00222F4E]
UNKNOWN [/usr/lib/libgtk-x11-2.0.so.0 +0x0012CA18]
gtk_main_do_event+0x000000DB [/usr/lib/libgtk-x11-2.0.so.0 +0x0012CB4A]
UNKNOWN [/usr/lib/libgdk-x11-2.0.so.0 +0x00042195]
g_main_context_dispatch+0x000001DC [/usr/lib/libglib-2.0.so.0 +0x00024B7E]
UNKNOWN [/usr/lib/libglib-2.0.so.0 +0x00027B86]
g_main_loop_run+0x000001A1 [/usr/lib/libglib-2.0.so.0 +0x00027E73]
gtk_main+0x000000B4 [/usr/lib/libgtk-x11-2.0.so.0 +0x0012D6CF]
UNKNOWN [/d2/src/firebird/mozilla/dist/bin/components/libwidget_gtk2.so +0x00019A72]
UNKNOWN [/d2/src/firebird/mozilla/dist/bin/components/libtoolkitcomps.so +0x0000B16D]
UNKNOWN [./firefox-bin +0x00009096]
UNKNOWN [./firefox-bin +0x00003314]
__libc_start_main+0x000000C5 [/lib/libc.so.6 +0x0001543D]
Sleeping for 5 minutes.
Type 'gdb ./firefox-bin 3634' to attach your debugger to this thread.

----- end of debug ----

note: I followed the instructions in the last couple of lines, but ff was quite
dead by this time. I'll try one more revert to the earliest version of gtk I can
fit in without mangling the system.


Comment 12 Mickey Stein 2005-08-02 13:35:50 UTC
Last take on this for awhile due to work: 

Mozilla/ff/tb ./configure using toolkit gtk2 probably does a 'pkg-config
gtk+-x11-2.0' to figure out which libs & flags to use in the build. All the
gtk2's from rawhide from 09Jul to now return the following:

$ pkg-config gtk+-x11-2.0 --libs
-L/usr/X11R6/lib -lgtk-x11-2.0 -lgdk-x11-2.0 -latk-1.0 -lgdk_pixbuf-2.0 -lm
-lpangoxft-1.0 -lpangocairo-1.0 -lpangox-1.0 -lpangoft2-1.0 -lfreetype -lz
-lcairo -lpango-1.0 -lgobject-2.0 -lgmodule-2.0 -ldl -lglib-2.0 -lfontconfig
-lpixman -lXrender -lX11 -lXext -lpng12
[root@Kathaldo mozilla]# pkg-config gtk+-x11-2.0 --cflags
-DXTHREADS -D_REENTRANT -DXUSE_MTSAFE_API -I/usr/include/gtk-2.0
-I/usr/lib/gtk-2.0/include -I/usr/X11R6/include -I/usr/include/atk-1.0
-I/usr/include/freetype2 -I/usr/include/cairo -I/usr/include/pango-1.0
-I/usr/include/freetype2/config -I/usr/include/libpng12 -I/usr/include/glib-2.0
-I/usr/lib/glib-2.0/include

This kind of ties cairo/pango in regardless of what ./configure flags I send to
firefox, other than telling it not to use the gtk2 toolkit (what else is there?
gtk1, I suppose). I'm guessing that a gtk2 that predates this problem might not
return cairo, since cairo is relatively new to the rawhide tree (is this true?).
I don't recall ever seeing cairo until pango yanked it in during an update one
day; before that I didn't have it on my system. This was in that nebulous
timeframe of 1-2 months back.
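
The "does this gtk2 pull in cairo" check can be scripted so it is easy to repeat against older gtk2-devel packages. A minimal sketch, assuming the linker flags arrive on stdin in the form pkg-config prints them. pulls_in_cairo is a hypothetical helper, and the sample flags are abridged from the output quoted above.

```shell
#!/bin/sh
# pulls_in_cairo (hypothetical helper): read linker flags on stdin and succeed
# if cairo itself (-lcairo) is among them.
pulls_in_cairo() {
    # Put each flag on its own line, then require an exact match so that
    # -lpangocairo-1.0 alone does not count.
    tr -s ' \t' '\n\n' | grep -qx -e '-lcairo'
}

# Flags abridged from the pkg-config output quoted above.
if echo '-lgtk-x11-2.0 -lpangocairo-1.0 -lcairo -lpango-1.0' | pulls_in_cairo; then
    echo "gtk2 links against cairo"
else
    echo "no direct cairo dependency"
fi
```

Against a live system the same helper could be fed from "pkg-config gtk+-x11-2.0 --libs".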

Having it segfault in gtk2 doesn't mean that the problem is there. It could be
in any number of places. It could be an xorg update that the mozilla.org apps
aren't at peace with. Could be glib, etc. 

I think, for me anyway, and unless someone here figures out that it's a 'rawhide'
issue for sure, that bubbling this over to a) gnu.org b) freedesktop.org and
checking for bugs filed there is worthwhile. It seems kind of premature that
I've just assumed this is a rawhide-only problem. I haven't even searched the
forums over at http://forums.mozillazine.org/, which would be a natural place for
people/builders to run into it.


Comment 13 Brian Gerst 2005-08-03 12:25:43 UTC
Another crash backtrace: 
 
Program received signal SIGSEGV, Segmentation fault.  
[Switching to Thread 46912501359472 (LWP 1107)]  
0x0000003e416041ee in IA__gtk_widget_get_toplevel (widget=0x1bf1050) at  
gtkwidget.c:6151  
6151      g_return_val_if_fail (GTK_IS_WIDGET (widget), NULL);  
(gdb) bt  
#0  0x0000003e416041ee in IA__gtk_widget_get_toplevel (widget=0x1bf1050) at  
gtkwidget.c:6151  
#1  0x0000003e415242f2 in gtk_main_get_window_group (widget=Variable "widget"  
is not available.  
) at gtkmain.c:1460  
#2  0x0000003e415243f8 in IA__gtk_main_do_event (event=0xa897e0) at  
gtkmain.c:1284  
#3  0x0000003e41b45bb8 in gdk_event_dispatch (source=Variable "source" is not  
available.  
) at gdkevents-x11.c:2295  
#4  0x0000003e3d825ffe in IA__g_main_context_dispatch (context=0x5529c0) at  
gmain.c:1934  
#5  0x0000003e3d828c94 in g_main_context_iterate (context=0x5529c0, block=1,  
dispatch=1, self=Variable "self" is not available.  
) at gmain.c:2565  
#6  0x0000003e3d829180 in IA__g_main_loop_run (loop=0x81fad0) at gmain.c:2769  
#7  0x0000003e41524e4d in IA__gtk_main () at gtkmain.c:972  
#8  0x00002aaaaee98512 in ?? ()  
from /usr/lib64/firefox-1.1/components/libwidget_gtk2.so  
#9  0x00002aaaaf0de6c0 in NSGetModule ()  
from /usr/lib64/firefox-1.1/components/libtoolkitcomps.so  
#10 0x000000000040c501 in ?? ()  
#11 0x0000003a3951cc2f in __libc_start_main () from /lib64/libc.so.6  
#12 0x0000000000407cf9 in ?? ()  
#13 0x00007fffffb0c8c8 in ?? ()  
#14 0x0000000000000000 in ?? ()  
  

Comment 14 Christopher Aillon 2005-08-03 14:13:05 UTC
This is a cairo bug.  Fixed in today's rawhide.

*** This bug has been marked as a duplicate of 164664 ***

Comment 15 Mickey Stein 2005-08-03 15:27:18 UTC
Created attachment 117409
Patch for similar stack trace problem from gnome bugzilla.

No, this has nothing to do with the cairo bug. That was only for cairo
assertion crashes as in the bug you mentioned. I can crash firefox clicking on
a button in a dialog box with the new cairo no problem, so this is good for the
first issue in 164664, but this isn't a dup of that. 

I searched upstream in bugzilla.gnome and found a very similar stack trace here
at:

http://bugzilla.gnome.org/show_bug.cgi?id=309505 . 

The patch for this is in another gnome bug and is attached here. 

I built firefox after applying this patch to
	       mozilla/widget/src/gtk2/mozdrawingarea.c

and haven't yet been able to crash it again. It'd be nice if some of the others
able to build their own firefox/tb/mozilla could see what their results are
using the patch.

Comment 16 Brian Gerst 2005-08-05 11:40:24 UTC
I can confirm that the patch in comment #15 does fix the crash when setting cookies.

Comment 17 Mickey Stein 2005-08-05 12:34:29 UTC
Ok, well it sounds good for most people. Maybe Christopher is right and that's
the answer. I may just be one of the few who get a stack trace that ends in
gtkwidget, as the ones above do, on certain button clicks, but that's now solved
with the little patch in #15. I'll close this out again and move it over to
gnome & mozilla.org, where it's still alive and well.

It doesn't really seem like a fedora issue anymore now that the cairo issue is
ok. If the problem resurfaces it won't matter, since it will be gone as soon as
the patch bubbles through mozilla (it's awaiting super-approval) and is used in
future versions of the mozilla.org trio.

Thanks,

Mick