55119 – Panel crashes on login

Summary: Panel crashes on login

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	gnome-core
Sub Component:
Version:	7.2
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Havoc Pennington
QA Contact:	Aaron Brown
Docs Contact:
URL:
Whiteboard:
Duplicates (6):	52496 55276 56023 57003 58297 65309
Depends On:
Blocks:

Reported:	2001-10-25 20:05 UTC by Kjartan Maraas
Modified:	2007-04-18 16:37 UTC
CC List:	7 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-12-14 02:46:21 UTC
Embargoed:

Attachments
output saved from bug-buddy (3.03 KB, text/plain) 2001-10-25 20:05 UTC, Kjartan Maraas	no flags
gdb backtrace, while Panel crash dialog still open (1.66 KB, text/plain) 2002-03-23 02:41 UTC, Steve Bonneville	no flags

Description Kjartan Maraas 2001-10-25 20:05:11 UTC

Description of Problem:

Got a new laptop and installed 7.2 on it. On first login the panel crashed.
This could be some race condition that's dependent on CPU speed, the amount of
work being done on first login, etc.

Version-Release number of selected component (if applicable):

gnome-core-1.4.0.4-38

How Reproducible:

Steps to Reproduce:
1. log in on first boot after install

Actual Results:

Crash in the panel (attached output from bug-buddy)

Expected Results:

No crash

Additional Information:

I'm attaching the info saved from bug-buddy

Comment 1 Kjartan Maraas 2001-10-25 20:05:50 UTC

Created attachment 35121
output saved from bug-buddy

Comment 2 Havoc Pennington 2001-10-25 21:06:50 UTC

Yuck, looks like some sort of memory corruption caused by gwmh getting into some
specific state - most likely totally unreproducible...

Comment 3 Havoc Pennington 2001-10-30 03:42:08 UTC

see also http://bugzilla.gnome.org/show_bug.cgi?id=63352 and the Red Hat bug 
backlinked from there

Comment 4 Andy Wang 2001-10-31 03:05:40 UTC

In case it helps, I can reproduce this reliably on my Dell Inspiron 5000 laptop
(P3-650).  It happens much more reliably when the power is connected (i.e.
running at 650 MHz).  With the power disconnected (SpeedStep kicks in and it runs at
500 MHz) it is far less likely to do this.

Comment 5 Havoc Pennington 2001-10-31 03:31:25 UTC

Oh, it is just dawning on me that this is all happening on laptops. Doh. I know
this bug. ;-) 

On some laptops with a bad BIOS, apps that touch /proc/apm will segfault,
at least a lot of the time. IIRC. Something along those lines. I know it was
some bug with buggy BIOS and apps segfaulting trying to use apm.

If you do "cat /proc/apm" does cat segfault? Can I get laptop make/model for
people seeing the bug?

Comment 6 Andy Wang 2001-10-31 05:57:42 UTC

cat /proc/apm doesn't segfault.  I did it about 100 times, wrote a script that
did it 1000 times.
On a whim, I disabled the battery monitor (I didn't use it when I was running
RH 7.1) and the panel still crashed.  So that wasn't it.  Without the battery
monitor, the panel shouldn't be touching /proc/apm, correct?
I haven't generated a bug-buddy report since I haven't seen the need to e-mail
it, but if it would be helpful I can.
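
The repeated-read test described in this comment could be scripted roughly as follows. This is only a sketch, not the script that was actually run (it was never posted): the helper name read_repeatedly, its arguments, and the failure counting are assumptions made for illustration.

```shell
#!/bin/sh
# Read a file repeatedly and count how many reads fail.
# read_repeatedly is a hypothetical helper; the original script was not posted.
read_repeatedly() {
    file=$1            # file to read, e.g. /proc/apm
    count=${2:-1000}   # number of attempts (1000 matches the comment above)
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # A crash in cat (e.g. a segfault) shows up as a non-zero exit status.
        if ! cat "$file" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures of $count reads failed"
}
```

It would be invoked as "read_repeatedly /proc/apm 1000". A clean run only shows that a trivial reader survives; it says nothing about a more complex program such as the panel.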

Comment 7 Havoc Pennington 2001-10-31 14:47:47 UTC

The thing is, the panel touches /proc/apm itself to decide whether to run the
battery monitor by default. I don't think failure of cat to segfault necessarily
invalidates the theory, since cat is such a simple app. There may be some more
complicated trigger than just touching the file. I'll go ask a kernel person today.

Comment 8 John Iselin 2001-11-09 16:12:02 UTC

I've a Dell Inspiron 8000 on which I've run RH7.0 for almost a year without a
hitch.  I upgraded the BIOS (maybe a bad move) to the A17 release from Dell.  I
then immediately installed RH7.2.  Possibly due to the same problem reported
here, I've re-installed the OS approximately 8 times in the last week.

In addition to the panel crashing, which it does after some installs but not
others, I've had the following problems some of the time:
* gnome_terminal won't start (segmentation fault)
* under gnome: the shared libraries have been corrupted (can't ls -l them:
  read/write error)
* under gnome: the panel.d directory has been corrupted (can't ls -l it:
  read/write error)
* under KDE: the icon .png files have been corrupted (can't ls -l them:
  read/write error)
* under KDE: can't read the icon .png files (file ownership, group, file size,
  and date tag are all wrong)

I've made sure to repartition the drive with each install and check for bad
blocks.  I've tried using an ext2 filesystem instead of ext3.  I've tried
running KDE instead of gnome.  None of these permutations helped.  I just
installed RH7.1 to see if that makes a difference.  So far so good, but it's been
running < 1 hour.

I haven't narrowed it down to whether the problem is in starting X, rebooting
the machine, the BIOS, or the laptop management apps, but it seems
pretty clear that my problem, and maybe the originator's problem, is with the file
system.

Comment 9 Need Real Name 2001-12-06 07:48:41 UTC

I have had a similar experience on both a laptop and a desktop computer, so I do
not think it is related to apm either. The panel crashes randomly when opening a
new session, especially when the computer was freshly booted.

When the panel crashes, I can restart it by hand, by just typing 'panel&' in
some terminal window.

It then gives a warning which is maybe useful:

Gtk-WARNING **: gtk_signal_disconnect_by_data(): could not find handler
containing data (0x818A2A8)

Doesn't it mean that the panel is trying to access some memory area pointed to
by an uninitialised pointer? Sometimes the area is in a protected page and
kaboom; sometimes not, in which case the pointee does not contain meaningful
information and the panel issues the above warning.

Just a guess.

Comment 10 Earle R. Nietzel 2002-01-22 20:38:12 UTC

I have the exact same problem.

I was able to get around it by deleting all the .gnome* configuration and then
letting it be recreated on the next login.

After things were recreated I never had this problem again.

Sorry I didn't try to figure out exactly which configuration file(s) were causing
the problems.
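
A less destructive variant of this workaround would be to move the configuration aside rather than delete it, so the offending file can still be tracked down later. The following is a sketch under that assumption; the helper name backup_gnome_config and the backup directory name are hypothetical and not part of the workaround as reported.

```shell
#!/bin/sh
# Move the per-user GNOME configuration aside instead of deleting it,
# so the files can be inspected or restored later.
# backup_gnome_config and the backup directory name are hypothetical.
backup_gnome_config() {
    home=${1:-$HOME}
    backup="$home/gnome-config-backup.$$"
    mkdir -p "$backup" || return 1
    for entry in "$home"/.gnome*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$entry" ] || continue
        mv "$entry" "$backup"/ || return 1
    done
    # Print where the old configuration went.
    echo "$backup"
}
```

It would be run as "backup_gnome_config" from a console login while no GNOME session is active, so that the configuration is recreated on the next login. Restoring entries from the backup one at a time could help identify which file triggers the crash.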

Comment 11 jeff 2002-02-08 23:20:56 UTC

Ok, bugs 55119, 55276 and 58297 here look like the same problem (which I have
also). If you check at gnome.org for
http://bugzilla.gnome.org/show_bug.cgi?id=59500 or
http://bugzilla.gnome.org/show_bug.cgi?id=69333 you can see this problem is
reported by a LOT of people and looks specific to RH7.2.

Comment 12 Kjartan Maraas 2002-02-09 10:12:28 UTC

Look at bugzilla.gnome.org for about 500 reports of the same crash :(

Red Hat really needs to update their gnome-core package to the latest release;
if not, we'll end up having the same problem with the next release...

There have been no crash reports for 1.4.0.6, and all people asked to try this
release have said that it fixed it for them. One small lead is that it seems to
happen only with accounts created during install. If they remove the account and
create a new one all is fine. (This has to be equivalent to 'rm ~/.gnome' maybe?)

Kjartan

Comment 13 Havoc Pennington 2002-02-09 14:46:15 UTC

I spent a couple hours going through all the dups on gnome.org last night.
There are several distinct crashes in there. The vast majority of people didn't
include a backtrace in the report, so I don't really know which one is "the"
crash. :-/

Anyhow, I'll either get the new gnome-core or carefully sort through the 
patches since our current gnome-core and apply some of them.

Comment 14 Havoc Pennington 2002-02-25 22:56:56 UTC

*** Bug 52496 has been marked as a duplicate of this bug. ***

Comment 15 Havoc Pennington 2002-02-25 23:09:28 UTC

*** Bug 57003 has been marked as a duplicate of this bug. ***

Comment 16 Havoc Pennington 2002-02-26 00:49:41 UTC

*** Bug 55276 has been marked as a duplicate of this bug. ***

Comment 17 Havoc Pennington 2002-02-26 15:09:59 UTC

*** Bug 58297 has been marked as a duplicate of this bug. ***

Comment 18 Havoc Pennington 2002-02-26 20:34:20 UTC

*** Bug 56023 has been marked as a duplicate of this bug. ***

Comment 19 Need Real Name 2002-03-11 21:25:52 UTC

This bug has been seen on RH7.0, RH7.1, and RH7.2
on Dell Inspiron 5000e and 7500 (i.e. ATI Rage Mobility P and N chipsets).
Also on various Acer servers using ATI Mach64 2/4 Meg RAM on-board chipsets.
All systems fixed by doing CTRL-ALT-BACKSPACE before login.
Fault was cleared on RH7.0 by updating to latest Ximian Gnome 1.4 (oops!)
Inspiron 7500 has seemingly braindamaged APM hardware - so forget it. 5000e
seems ok.
Neither Dell system probes well via Xconfigurator.

Comment 20 Havoc Pennington 2002-03-18 18:02:44 UTC

No closer to reproducing this or figuring out the problem :-/

Comment 21 Havoc Pennington 2002-03-20 15:55:01 UTC

I got another lead from someone - can people post their "xdpyinfo" output? 
Or just note whether you are using an 8-bit or other less-common bit depth?
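
For anyone who only wants to report the bit depth, the relevant line can be pulled out of the xdpyinfo output. A small sketch follows; the helper name root_depth is an assumption, and it relies on xdpyinfo printing a line of the form "depth of root window:    24 planes".

```shell
#!/bin/sh
# Extract the root window depth from xdpyinfo output read on stdin.
# root_depth is a hypothetical helper name.
root_depth() {
    # xdpyinfo prints a line such as "  depth of root window:    24 planes".
    awk -F: '/depth of root window/ { split($2, f, " "); print f[1]; exit }'
}
```

Usage would be "xdpyinfo | root_depth" from inside the X session. The full xdpyinfo output is still more useful, since it also lists the available visuals.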

Comment 22 Havoc Pennington 2002-03-20 16:12:25 UTC

8-bit does work here (I would have been surprised if it didn't I guess), but I'm
still wondering if it has something to do with the specific X visual.

Comment 23 Andy Wang 2002-03-22 20:22:09 UTC

I'm running 24-bit.  I'll try 8 and 16 bit tonight when I get home to see if
there is any change with the problem.

Comment 24 Steve Bonneville 2002-03-23 02:41:36 UTC

Created attachment 49804
gdb backtrace, while Panel crash dialog still open

Comment 25 Havoc Pennington 2002-03-23 02:59:44 UTC

Excellent! If you can reproduce it, is there any chance you could do "export
MALLOC_CHECK_=2" in /etc/profile or somewhere?

That backtrace is inside g_malloc() which means memory got corrupted somehow;
MALLOC_CHECK_=2 may convince it to crash closer to the root cause of the problem.
Note trailing underscore in the env variable name.
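
If changing /etc/profile is too broad, the variable can also be set for a single command. This is a minimal sketch; the wrapper name run_with_malloc_check is hypothetical, and it assumes the panel can be restarted by hand from a terminal as described in comment 9.

```shell
#!/bin/sh
# Run a single command with glibc's malloc consistency checking enabled,
# without changing the environment of the rest of the session.
# run_with_malloc_check is a hypothetical helper name.
run_with_malloc_check() {
    # MALLOC_CHECK_=2 asks glibc to abort as soon as heap corruption is detected.
    # Note the trailing underscore in the variable name.
    env MALLOC_CHECK_=2 "$@"
}
```

For example, "run_with_malloc_check panel" would start the panel with checking enabled. Since the crash appears mainly at login, the session-wide setting in /etc/profile is more likely to catch it.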

Comment 26 Havoc Pennington 2002-03-23 14:59:26 UTC

Another dup just hit gnome.org:
http://bugzilla.gnome.org/show_bug.cgi?id=76037
Again it happens only on an Inspiron for them.

I went through a hundred or so of the gnome.org dups a couple of weeks ago; the
create_menu_at() backtrace in that bug and the
gdk_window_foreign_new() backtrace on this bug are the most common traces.
However, neither backtrace is actually telling us enough to fix the problem
without debug symbols/MALLOC_CHECK_/etc.

Comment 27 Havoc Pennington 2002-07-07 02:46:30 UTC

*** Bug 65309 has been marked as a duplicate of this bug. ***

Comment 28 Kjartan Maraas 2002-07-23 23:41:22 UTC

Just for completeness. Here's a backtrace with MALLOC_CHECK_=2. This is with
16-bit colors and a Rage Mobility P/M AGP 2x card.

This is also running the very latest gnome-core from CVS.

#0  0x404c8e82 in gdk_window_add_filter () from /usr/lib/libgdk-1.2.so.0
#1  0x0805baec in task_new (window=0x8131b10) at gwmh.c:1726
#2  0x0805be74 in client_list_sync (xwindow_ids=0x815d690, n_ids=1)
    at gwmh.c:1860
#3  0x0805ac7d in gwmh_desk_update (imask=GWMH_DESK_INFO_CLIENT_LIST)
    at gwmh.c:1100
#4  0x0805b53e in gwmh_idle_handler (data=0x0) at gwmh.c:1490
#5  0x404e7ddc in g_idle_dispatch (source_data=0x805b500, 
    dispatch_time=0xbffff670, user_data=0x0) at gmain.c:1367
#6  0x404e6e41 in g_main_dispatch (dispatch_time=0xbffff670) at gmain.c:656
#7  0x404e7445 in g_main_iterate (block=1, dispatch=1) at gmain.c:877
#8  0x404e75d4 in g_main_run (loop=0x81bce00) at gmain.c:935
#9  0x403e888f in gtk_main () from /usr/lib/libgtk-1.2.so.0
#10 0x0805ea46 in main (argc=1, argv=0xbffff774) at main.c:657
#11 0x40514647 in __libc_start_main (main=0x805e70c <main>, argc=1, 
    ubp_av=0xbffff774, init=0x8056c70 <_init>, fini=0x80a78d0 <_fini>, 
    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff76c)
    at ../sysdeps/generic/libc-start.c:129
(gdb) 


HTH

Comment 29 Kjartan Maraas 2002-08-24 21:41:23 UTC

I think this should be closed out here. It's not Red Hat specific. The only clue
I've found is that commenting out client_list_sync() in gwmh.c and just making
it return TRUE makes the crash go away.

Comment 30 Havoc Pennington 2002-12-14 02:46:21 UTC

As we never figured this out and we are unlikely to do a 7.x update 
at this point anyway, closing bug.
