Bug 47799 - XFree 4.0.3 hung switching wmaker workspaces; gdb attached; had to poweroff to get X back
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: WindowMaker
Version: 7.3
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Phil Copeland
QA Contact: David Lawrence
Depends On:
Reported: 2001-07-07 18:45 UTC by Telsa Gwynne
Modified: 2007-04-18 16:34 UTC (History)

Clone Of:
Last Closed: 2002-02-14 10:46:36 UTC

Attachments
gdb of X crash. (13.35 KB, text/plain)
2001-07-07 18:47 UTC, Telsa Gwynne
lspci -vv (5.81 KB, text/plain)
2001-07-07 18:48 UTC, Telsa Gwynne
/etc/X11/XF86Config-4 (2.44 KB, text/plain)
2001-07-07 18:50 UTC, Telsa Gwynne

Description Telsa Gwynne 2001-07-07 18:45:06 UTC
Description of Problem:

As per summary. As near as I can recall, I was switching workspaces in
windowmaker. It was part-way through drawing the display. The Gnome
panel and its contents were drawn, except that the cpuload monitor
'stopped'; the gnome-terminal contents (prompt and so on) were drawn,
as was the toolbar within gnome-terminal. The g-t titlebar was empty of
title. The two 'buttons' on the titlebar were there, but the contents
(shaded window on left, cross on right) weren't.

ssh'd in, ran gdb on the X process, got a list of all things running.

Comment 1 Telsa Gwynne 2001-07-07 18:47:30 UTC
Created attachment 22947
gdb of X crash.

Comment 2 Telsa Gwynne 2001-07-07 18:48:50 UTC
Created attachment 22948
lspci -vv

Comment 3 Telsa Gwynne 2001-07-07 18:50:31 UTC
Created attachment 22949
/etc/X11/XF86Config-4

Comment 4 Telsa Gwynne 2001-07-07 21:54:24 UTC
Apparently I need to add this.

Rebooted the machine with reboot -f. Logged back in as same (non-root)
user who started X. "startx" resulted in the screen changing as if it
were about to start drawing the Gnome splash screen and then nothing.
Couldn't control-alt-F1. So ssh in again, and X is doing this:
 1329 root      17   0 10532 6968   308 R    99.6  5.5  13:42 X

Rebooted and actually powered down this time, which I hadn't done last time,
tried again, and X finally started via startx.

I'm told you only need the attachments; if there is more machine info you
need, let me know.
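
For anyone triaging a similar hang over ssh, the single top line quoted
above can be split into fields to confirm that X is spinning rather than
blocked. This is only a rough sketch in Python. It assumes the column
order of the procps top of that era (PID USER PRI NI SIZE RSS SHARE STAT
%CPU %MEM TIME COMMAND); the names TopRow, parse_top_row and
looks_runaway, and the 90% threshold, are made up for illustration.

```python
from typing import NamedTuple


class TopRow(NamedTuple):
    """One process line from top; the column order is an assumption."""
    pid: int
    user: str
    pri: int
    ni: int
    size_kb: int
    rss_kb: int
    share_kb: int
    stat: str
    cpu_pct: float
    mem_pct: float
    time: str
    command: str


def parse_top_row(line: str) -> TopRow:
    """Split a whitespace-separated top line into typed fields."""
    parts = line.strip().split(None, 11)
    if len(parts) != 12:
        raise ValueError(f"expected 12 columns, got {len(parts)}")
    pid, user, pri, ni, size, rss, share, stat, cpu, mem, time, cmd = parts
    return TopRow(int(pid), user, int(pri), int(ni), int(size), int(rss),
                  int(share), stat, float(cpu), float(mem), time, cmd)


def looks_runaway(row: TopRow, cpu_threshold: float = 90.0) -> bool:
    """Hypothetical heuristic: runnable state and CPU use above threshold."""
    return row.stat.startswith("R") and row.cpu_pct >= cpu_threshold


if __name__ == "__main__":
    # The line reported in this bug.
    row = parse_top_row(
        " 1329 root      17   0 10532 6968   308 R    99.6  5.5  13:42 X")
    print(row.pid, row.command, row.cpu_pct, looks_runaway(row))
```

A process in state R at close to 100% CPU, as here, points at a busy loop
in the server or driver rather than a process waiting on the kernel.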

Comment 5 Glen Foster 2001-07-13 22:16:31 UTC
This defect considered SHOULD-FIX for Fairfax gold-release.

Comment 6 Mike A. Harris 2001-07-15 15:39:03 UTC
Please upgrade your X to the one in rawhide (4.1.0).  If this bug does not
occur, then I consider it fixed WRT Fairfax because 4.0.3 is not shipping in
Fairfax.  Please do this ASAP so this bug can get resolved ASAP.


Comment 7 Alan Cox 2001-07-15 16:39:07 UTC
Umm, does this mean you are shipping no supported 3D acceleration? The last set
of DRI modules for 4.1 I looked at are both full of hard to read macro abuse and
appear to contain security holes fixed recently in 4.0.x.

Also it's Red Hat policy not to add APIs that Linus hasn't accepted or approved
of. The new DRI does exactly that.

Comment 8 Mike A. Harris 2001-07-16 06:33:25 UTC
That depends on the outcome of things.  XFree86 is a priority 2 package.
It was pulled out before beta1 because it wasn't reliably stable useable
packaging at the time and did not need to actually be in the distro before
beta2.  The existing packages are fairly stable and high enough quality
to be useable and stabilize in time for Gold IMHO.  So there was no reason
IMHO not to use 4.1.0.  Now that it _is_ in the distro, and its package
version freeze is past due, and GNOME + KDE are built against it as is
roughly half of the distro, and QA processes applied against that, AFAIC
4.1.0 is locked-in.  The DRM code for 4.1.0 works on all cards I've tested
so far in my limited testing using my own built modules.  So I see no
reason that the new DRM should not go into our final release, whether it
be in the kernel RPM, XFree86 RPM or a separate RPM.

Whoever makes the decision to not ship working DRM certainly won't be me.
I'm not pleased that DRM is not backward compatible with 4.0.3, but I am
not in a position to change that as I'm neither a kernel coder, nor a DRM
hacker.  So as it stands, _someone_ other than me needs to decide the
final resolution on the DRM issues.

I've emailed Preston about it, detailing the pros and cons of various
solutions.  I'm ready to make whatever changes we see fit in the timeframe
available that I am realistically capable of making as a non DRM hacker.

If we don't ship non-Linus approved APIs then we either perpetually ship
4.0.3, we don't ship DRM, or we fix the code ourselves, the last of which
nobody here, to my understanding, has the required DRM experience/knowledge for.
Also, the latter would mean we are shipping something potentially not
compatible with the rest of the community and are doing "that Red Hat only
thing".   I might also add that CIPE, and numerous other parts of our kernel
are not Linus approved either, yet we do ship them, so that IMHO is not
an issue.

I do not know what the proper/best solution is.  I am definitely
interested in suggestions though.

Comment 9 Alan Cox 2001-07-16 11:31:50 UTC
CIPE is a bit different since it merely adds a device; it doesn't break
compatibility.  "I downloaded Linus' kernel and now Red Hat X servers don't work"

From casual inspection the 4.0.x DRM holes are in 4.1 as well, so either way they
need fixing. Hopefully someone can work out enough of the 4.1 modules to
backport the fixes.

Also btw IMHO if we do go with the 4.1 DRM we should split the DRM kernel
modules package from the base kernel. We need to do an errata kernel for 7.1
again at some point and building both DRM 4.0 and DRM 4.1 as two kernel-drm-
packages will keep that a lot saner 8)

Comment 10 Mike A. Harris 2001-07-25 03:49:32 UTC
Please test the latest XFree86 4.1.0 packages to see if that fixes
the problem for you.

Comment 11 Telsa Gwynne 2001-09-08 21:54:50 UTC
I hate to do this to you so late. I have seen no recurrence of this problem
on the box I originally reported it on. 

After rearranging machines, I started using a new box with RC2 on it. This
bug or something very like it seems to be present still. I did not get the
redrawn gnome-terminal with no contents this time. Everything stopped, and
I couldn't ssh in to kill X because the box was down. No response to ping.
The crash has only happened once. Related might be another issue when 
changing workspaces: 

Over the last couple of days I have been using it (after updating from an
earlier beta with up2date), I have noticed a couple of times on flipping
workspaces with alt-1, alt-2 etc (wmaker's default workspace keybindings),
that it ... well. It doesn't hang. If I flip through fast, it just stops
responding to the flipping and stays on the workspace it's got to. I can
get around this by selecting a workspace on the gnome deskguide applet. 
That takes me to the workspace and things start to work again then.

I can't reproduce it to order. The only thing I see is very subjective. 
I have a gnome cpu_load monitor applet running, and on this newer faster
machine with a new monitor, it spikes when changing workspaces far more
than on the old one. This seems odd to me, but I don't know how relevant
it is. 

What information do you need? I'm not even sure whether this is X (which
crashes) or wmaker (which seems to be the cause). It could even be the
deskguide from Gnome, which had quite a penchant for "hello, window, 
tell me about yourself -- window? hello? oh dear [bang]" at one stage,
although I've never seen it crash X before. This machine is an Intel
810 with a Philips Brilliance (hah) 107 monitor. 
(plus a ton of other gnome stuff)

Do you need lspci -vv again? What else?

Comment 12 Jay Turner 2002-02-13 20:21:36 UTC
Are any of these issues resolved with XFree86-4.2?

Comment 13 Telsa Gwynne 2002-02-14 10:42:54 UTC
I think I know what's causing the second issue (sometimes
*refusing* to flip workspaces): it's wmaker, and I've filed
a separate bug on that. 

I do not yet have X 4.2. I had forgotten all about this bug
because it now doesn't happen for me here. You can close it
and I will reopen it if it recurs when I get 4.2? Or leave it
and I'll update the bug when I get the new packages?

Comment 14 Mike A. Harris 2002-02-14 22:27:27 UTC
Ok, thanks for the update Telsa.  I'll close it for now as RAWHIDE,
but if you have any problems still, reopen it and we'll have another
look-see into it.  There are a few i810 fixes in our packages also
above the stock 4.2.0 driver, which may solve other lockup issues as
