Description of problem:
metacity crashes and spawns bugbuddy, which cannot be seen because metacity has crashed. Windows cannot gain focus (I'm using focus-follows-mouse) and all mouse clicks are ignored, but if an xterm had focus at the time of the crash, I can spawn gdb and get a backtrace:

#0 0x00110416 in __kernel_vsyscall ()
#1 0x003cf233 in __waitpid_nocancel () from /lib/libc.so.6
#2 0x0059cd67 in g_spawn_sync () from /lib/libglib-2.0.so.0
#3 0x0059d0ac in g_spawn_command_line_sync () from /lib/libglib-2.0.so.0
#4 0x00121253 in ?? () from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
#5 0x0012131e in ?? () from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
#6 0x00121a17 in google_breakpad::ExceptionHandler::InternalWriteMinidump () from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
#7 0x00121e23 in google_breakpad::ExceptionHandler::HandleException () from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
#8 <signal handler called>
#9 0x08060f13 in gdk_rectangle_intersect ()
#10 0x080aaaef in gdk_rectangle_intersect ()
#11 0x030d6def in ?? () from /usr/lib/libgdk-x11-2.0.so.0
#12 0xbf9c1dc8 in ?? ()
#13 0x08e72a80 in ?? ()
#14 0x08e88cf8 in ?? ()
#15 0x08e56920 in ?? ()
#16 0x00000001 in ?? ()
#17 0x03121b80 in g_option_context_set_help_enabled () from /usr/lib/libgdk-x11-2.0.so.0
#18 0xbf9c1be8 in ?? ()
#19 0x00000000 in ?? ()

Window decorations don't get redrawn, but applications still seem able to draw into their windows (xterm, system monitor applet). When this originally started happening, I was able to ssh into the machine and kill bugbuddy and metacity; metacity would be restarted and then would immediately crash again.
Version-Release number of selected component (if applicable):
metacity --version: metacity 2.22.0
metacity-2.22.0-3.fc9.i386

gdk_rectangle_intersect appears to be defined in libgdk, owned by gtk+:
gtk+-1.2.10-61.fc9.i386
gtk+-devel-1.2.10-61.fc9.i386

gtk2 owns /usr/lib/libgdk-x11-2.0.so.0:
gtk2-2.12.11-1.fc9.i386
gtk2-devel-2.12.11-1.fc9.i386

g_option_context_set_help_enabled appears to be defined in libglib, owned by glib2:
glib2-2.16.5-1.fc9.i386
glib2-devel-2.16.5-1.fc9.i386

How reproducible:
Unknown pattern, but it happens repeatedly, on both my Fedora 9 desktop (Pentium4) and a MacBookPro with F9 installed on it. I'm usually running a bunch of xterms (up to 6), firefox 3.0.1, and pidgin (2.4.3-1.fc9, with the guifications plugin enabled) on both machines, plus opera and xclock on just the Pentium4.

Steps to Reproduce:
1. normal work
2. metacity crashes
3. control-alt-backspace to recover, re-login

Actual results:

Expected results:

Additional info:
Both machines I've experienced this on are dual head using Xinerama.

nvidia drivers:
xorg-x11-drv-nvidia-173.14.12-1.lvn9.i386
xorg-x11-drv-nvidia-libs-173.14.12-1.lvn9.i386

uname -r: 2.6.25.14-108.fc9.i686
Created attachment 317166 [details] script output of gdb session against crashed metacity
Forget the above traceback. I got the debuginfo packages installed, and the attached gdb session is much more useful. It seems meta_display_screen_for_root is returning NULL, and there is no NULL check before the result is dereferenced in the arguments to meta_workspace_focus_default_window. Not sure what is causing this, or which window is supposed to be the default one. This doesn't seem to happen when I'm switching workspaces. I'm also running imwheel (which I mention because I believe it grabs something in the root window but doesn't have any windows itself).
I'm seeing exactly the same problem - however I'm not running imwheel. I was able to attach to metacity remotely and, using gdb, obtained:

Program received signal SIGSEGV, Segmentation fault.
0x000000000041c730 in event_callback (event=0x7fff4a530540, data=0x15951d0) at core/display.c:1988
1988 meta_workspace_focus_default_window (new_screen->active_workspace,
(gdb) bt
#0 0x000000000041c730 in event_callback (event=0x7fff4a530540, data=0x15951d0) at core/display.c:1988
#1 0x0000000000463106 in filter_func (xevent=0x1596bd0, event=<value optimized out>, data=0x839bedb) at ui/ui.c:83
#2 0x000000372ec5418b in gdk_event_apply_filters (xevent=Could not find the frame base for "gdk_event_apply_filters". ) at gdkevents-x11.c:345
#3 0x000000372ec54f0f in gdk_event_translate (display=Could not find the frame base for "gdk_event_translate". ) at gdkevents-x11.c:896
#4 0x000000372ec57a16 in _gdk_events_queue (display=Could not find the frame base for "_gdk_events_queue". ) at gdkevents-x11.c:2285
#5 0x000000372ec57bec in gdk_event_dispatch (source=Could not find the frame base for "gdk_event_dispatch". ) at gdkevents-x11.c:2345
#6 0x000000372c8374db in IA__g_main_context_dispatch (context=<value optimized out>) at gmain.c:2012
#7 0x000000372c83acbd in g_main_context_iterate (context=<value optimized out>, block=<value optimized out>, dispatch=<value optimized out>, self=<value optimized out>) at gmain.c:2645
#8 0x000000372c83b1ed in IA__g_main_loop_run (loop=<value optimized out>) at gmain.c:2853
#9 0x000000000042a9d3 in main (argc=1, argv=0x7fff4a530d58) at core/main.c:476

My setup is a triple-head with 2 NVidia GS7300 cards. This happens at random, usually after 2 or 3 days of uptime for my X server.
Relevant RPMs are:
metacity-2.22.0-3.fc9.x86_64
xorg-x11-drv-nvidia-173.14.12-1.lvn9.x86_64
kmod-nvidia-173.14.12-3.lvn9.x86_64
xorg-x11-drv-nvidia-libs-173.14.12-1.lvn9.x86_64
kmod-nvidia-2.6.25.14-108.fc9.x86_64-173.14.12-3.lvn9.x86_64

I also note that this looks quite like bug 461885.
With the latest updates, I'm lucky if metacity survives more than an hour or two before this gets triggered, making it almost unusable.

metacity-2.22.0-5.fc9.x86_64
xorg-x11-drv-nvidia-173.14.12-1.lvn9.x86_64
kmod-nvidia-173.14.12-5.lvn9.x86_64
xorg-x11-drv-nvidia-libs-173.14.12-1.lvn9.x86_64
kmod-nvidia-173.14.12-5.lvn9.x86_64
kernel-2.6.26.5-45.fc9.x86_64
I mentioned I'm running imwheel, but this also happens on another install in which I'm not running imwheel.
Created attachment 320258 [details] a patch that avoids this specific null pointer dereference Possible patch, that just puts the use of the variable holding NULL inside a check for NULL. I notice that there are other places in the code that also use the result of meta_display_screen_for_root without checking for NULL. meta_display_screen_for_root can explicitly return NULL, so a possible fix might be to change it so it always returns a valid screen.
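For anyone reading along without the attachment, here is a minimal sketch of the kind of guard described above. It is not the attached patch and not metacity's real code: the Screen and Workspace types and the screen_for_root and focus_default_on_root functions are hypothetical stand-ins, and the lookup is assumed to be a simple search by root window that returns NULL when nothing matches.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for metacity's workspace and screen types. */
typedef struct { int id; } Workspace;
typedef struct { unsigned long xroot; Workspace *active_workspace; } Screen;

/* Assumed lookup: search the known screens for a matching root window.
 * Returns NULL when no screen owns that root. */
static Screen *screen_for_root(Screen *screens, size_t n, unsigned long xroot)
{
    for (size_t i = 0; i < n; i++) {
        if (screens[i].xroot == xroot)
            return &screens[i];
    }
    return NULL;
}

/* Guarded caller: only touch new_screen->active_workspace when the lookup
 * succeeded. Returns 1 if focus was handed over, 0 if the event was skipped. */
static int focus_default_on_root(Screen *screens, size_t n, unsigned long xroot)
{
    Screen *new_screen = screen_for_root(screens, n, xroot);

    if (new_screen == NULL) {
        /* Without this check the next line would dereference NULL. */
        fprintf(stderr, "no screen for root 0x%lx, ignoring event\n", xroot);
        return 0;
    }

    printf("focusing default window on workspace %d\n",
           new_screen->active_workspace->id);
    return 1;
}
```

Calling focus_default_on_root with a root window that matches no entry returns 0 instead of crashing. The alternative mentioned above, making the lookup always return a valid screen, would move the decision into screen_for_root itself, but it then has to pick some fallback screen, and I don't know which one would be right.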
I've also applied the attached simple patch and generated a set of new i386 RPMs. These are just 2.22.0-5.fc9 with this patch applied. They can be downloaded from http://thwartedefforts.org/software/metacity-2.22.0-6aab/

sha1sum:
8866c54e8c5aaa00d8caafb60fe5f76da0b60bd7 metacity-2.22.0-6aab.i386.rpm
7a38c00bbf47c04d80d99ab9041b11d15a6684d2 metacity-2.22.0-6aab.src.rpm
e32820220461fe4b8724427dbfddd4563ea9f09f metacity-debuginfo-2.22.0-6aab.i386.rpm
06e0ad2e78349aca70afffef7f5a45d0e6ab4224 metacity-devel-2.22.0-6aab.i386.rpm
93ca3844aeea22c69ed689c57f0ecc18753339a7 metacity-meta_display_screen_for_root-nullderef.patch

There are a few other places in the code where the result of meta_display_screen_for_root is used which should be looked at for this same problem by someone who knows metacity internals better than I do (which is just about everyone). It's worth pointing out that it doesn't seem to happen when I'm switching workspaces, but may be triggered when moving the mouse between different (Xinerama) screens. I have not noticed a definite pattern, other than that it happens when I am, ahem, working on something I have not saved yet (ain't that always the way?). It doesn't seem to happen when the machine is idle.
Looks like a variant of this patch was applied to metacity 2.24.0.
Looks like the problem around line 1988 of display.c was fixed upstream in revision 3664 http://svn.gnome.org/viewvc/metacity/branches/gnome-2-24/src/core/display.c?view=log&pathrev=3807
I've applied the patch and recompiled metacity. It survived a couple of days this time before it went weird. The problem reappeared, but with no bug-buddy this time, so I could still switch between virtual displays using the keyboard shortcut and cycle between windows using alt-tab. However, I was unable to focus on anything using the mouse. Note that I too am using focus-follows-mouse.
I've now upgraded to Fedora 10. However, the behaviour I reported in comment #10 applies to this release as well. It took about 4 hours before X began to ignore the mouse. TBH, this is making Gnome unusable for anything serious - I suspect I'll be trying KDE.

Relevant RPMs:
metacity-2.24.0-2.fc10.x86_64
xorg-x11-server-Xorg-1.5.3-6.fc10.x86_64
xorg-x11-drv-nvidia-177.82-1.fc10.x86_64
akmod-nvidia-177.82-1.fc10.4.x86_64
kernel-2.6.27.9-159.fc10.x86_64
Chris, your problem in comment #10 has a workaround (which I've applied maybe 3 times today already). I can't remember where I found it. If you use the keyboard to move a window (any window) between Xinerama displays, it fixes itself and mouse buttons are seen again. Alt-tab to a window, Alt-Space to open the window menu, select move, and use the cursor keys to drag the window entirely onto a different screen. I believe this problem is unrelated to the original metacity crashing bug report, since metacity isn't crashing -- but I guess it may be what happens when metacity doesn't crash, due to the patch/new upstream, and still gets into an error state.
Thanks :-) Just happened to me again and the workaround got the mouse working again. Woo-hoo I'm saved from ctrl-alt-backspace, which has been a too-regular-occurrence over the last 2 or 3 months!
I wonder if this is related to bug 473825 for which a fix has just been released? I shall be updating and finding out if it is...
Looks like my problem at least is now fixed - not reproduced this for the last week or so since updating for bug 473825
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.