Bug 87544 - Logout with 3D enabled crashes system
Summary: Logout with 3D enabled crashes system
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: XFree86
Version: 9
Hardware: i386
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Mike A. Harris
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-03-28 16:07 UTC by Michael Roth
Modified: 2007-03-27 04:02 UTC

Fixed In Version: 4.3.0-5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-04-25 08:04:03 UTC
Embargoed:


Attachments
X server log (31.38 KB, text/plain)
2003-03-29 14:44 UTC, Michael Roth
My X server config (3.13 KB, text/plain)
2003-03-29 14:45 UTC, Michael Roth

Description Michael Roth 2003-03-28 16:07:18 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

Description of problem:
I'm using Red Hat 8.1 beta (Phoebe).  When 3D hardware acceleration is enabled, 
logging out in ANY window manager produces a blank black screen, and forces me 
to press the power button on my computer.

When I disable 3D hardware acceleration, logout works perfectly.

Version-Release number of selected component (if applicable):
Phoebe

How reproducible:
Always

Steps to Reproduce:
1. Enable 3D hardware acceleration on Phoebe
2. Log out of your user name
3. A blank screen will result, forcing you to turn off and restart the system
    

Additional info:

Comment 1 Mike Chambers 2003-03-29 00:18:31 UTC
Please update, or do a fresh install of Red Hat Linux 9 once it's released to
you (on 3/31 if you have a paid RHN subscription, or on 4/7 if not) and see if
the problem still exists.

Comment 2 Mike A. Harris 2003-03-29 09:35:51 UTC
Enable 3D acceleration, and completely disable your kernel sound
card drivers from loading at boot time, then reboot.  Does the problem
go away when sound card support is not configured?

Also, in either scenario, you haven't mentioned _what_ hardware you are using,
or what driver.  Please attach your X server log and config file to all
XFree86 bug reports as individual separate uncompressed file attachments.

Note: It is not Red Hat Linux 8.1 beta.  There is no 8.1 and there never
was a plan for there to be an 8.1.  It is "Red Hat Linux 9", and will
be available on the dates Mike Chambers mentioned above.


Comment 3 Michael Roth 2003-03-29 14:44:15 UTC
Created attachment 90784 [details]
X server log

Here's my X server log.

Comment 4 Michael Roth 2003-03-29 14:45:25 UTC
Created attachment 90786 [details]
My X server config

Comment 5 Michael Roth 2003-03-29 14:48:40 UTC
I currently have 3D hardware acceleration enabled, and I disabled the sound server from 
loading (using the Red Hat control center).  Logging in and out works now.  I have attached my 
X server log and my X server config. 
 
I'm using an ATI Radeon VE (32MB) video card.  I apologize for the lack of info - this is the 
first bug I've submitted to Red Hat. 

Comment 6 Mike A. Harris 2003-03-30 19:08:02 UTC
If you disable sound, and you can now log in and out, then this definitely
is not a video problem.  It is a sound card driver problem, or broken sound
hardware.

What sound hardware do you have?  Please provide the output of: "lspci"
and "lspci -vv"

Comment 7 Mike A. Harris 2003-04-06 05:44:10 UTC
Other users have also reported X hanging on logout, a problem that goes
away when DRI is disabled, and which we have traced to their sound card
conflicting with the video hardware.

I do not recall whether the problem was a kernel sound card driver bug, a
BIOS bug, or something else, nor whether there is a workaround for it.

I've cc'd Alan to see if he's got comments to add, or may have questions to 
ask which might help.

Comment 8 Alan Cox 2003-04-06 14:55:57 UTC
I am not aware of any such tracing, nor of any community belief that this
exists. At least on the beta the X server on my radeon crashes the box if it
resets when the last client quits instead of a session being killed off. Friends
see the same with some non radeon servers too.



Comment 9 Alan Cox 2003-04-06 15:04:27 UTC
Ok I think I see what is going on. The X DRI layer is enabling the Vblank IRQ on
the video card. When the X server exits cleanly (as opposed to aborts) it does
not seem to turn it back off.
If the IRQ is shared with the sound (legal) or anything else -> dead box.

You need to fix the DRM kernel module cleanup paths. It looks like it's a generic
DRM bug although it probably only bites i8xx and radeon. 

I can't prove this but I suspect this also explains the crash with radeon when
an OpenGL app is killed occasionally and on some specific hw.
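
A minimal sketch of the teardown ordering this comment points at. It is a toy model, not the actual DRM module code: `struct card` and all three function names are hypothetical, and the real cleanup paths touch hardware registers that are not modelled here. It only shows why removing the handler while the card's vblank IRQ is still enabled leaves a hazardous state on a shared line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-card driver state; not the real DRM structs. */
struct card {
    bool vblank_irq_enabled;  /* card may still raise the vblank interrupt */
    bool handler_installed;   /* a kernel handler is registered for the card */
};

/* Teardown as described above: the handler goes away, the IRQ source stays on. */
static void teardown_leaves_irq_on(struct card *c)
{
    c->handler_installed = false;
}

/* Safer order: silence the card first, then remove the handler. */
static void teardown_masks_irq_first(struct card *c)
{
    c->vblank_irq_enabled = false;
    c->handler_installed = false;
}

/*
 * True when the card can still assert a line that nobody will acknowledge
 * on its behalf. Assumption of this sketch: only a line shared with another
 * device's handler is treated as hazardous.
 */
static bool can_wedge_shared_line(const struct card *c, bool line_is_shared)
{
    assert(c != NULL);
    return line_is_shared && c->vblank_irq_enabled && !c->handler_installed;
}
```

Calling `teardown_leaves_irq_on()` and then `can_wedge_shared_line(&card, true)` reports the hazard; after `teardown_masks_irq_first()` it does not.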


Comment 10 Mike A. Harris 2003-04-06 17:12:51 UTC
What I was referring to is the problem that Ali-Reza Anghaie reported
on our internal private beta tester mailing list which was tracked down
to being related to his BIOS messing up the sound card on APM resume.  I
recalled the sound card being involved, but not the specific details
of that situation.  I just hunted the emails down though to recall the
specifics.  It seems now that that issue is different from this one as
it was related to APM suspend/resume.

I think an email from Linus a few days ago to dri-devel might explain
this.  I'll attach it below.

Comment 11 Mike A. Harris 2003-04-06 17:14:38 UTC
Actually, I'll attach various relevant parts of the whole thread for context:

Date: Fri, 4 Apr 2003 22:12:39 +0300 (EEST)
From: Panagiotis Papadakos <papadako.gr>
To: dri-devel.net
Content-Type: TEXT/PLAIN; charset=US-ASCII
List-Id: <dri-devel.lists.sourceforge.net>
Subject: MGA and lockups during shutdown or switchmode
 
For some months now I am experiencing lockups when I switched to the VTs,
or changed the video modes or if I tried to shutdown the Xserver.
 
So I applied the following patch, after looking the related radeon patch
and now I can switch to the VTs or change the videomode without lockups.
But when I press Ctrl+Alt+Delete, sometimes my machine will lockup before
kdm starts a new Xserver or it will lockup right away after my monitor
has received the signal from the new Xserver.
 
If I kill the kdm process and then restart it everything will be ok. (At
least when I tried it)
 
So can anyone please help?
 
This is the patch:
 
--- mga_dri.c   2003-04-04 22:02:21.000000000 +0300
+++ mga_dri.c_new       2003-04-04 16:26:31.000000000 +0300
@@ -1359,6 +1359,7 @@
    if (pMga->irq) {
       drmCtlUninstHandler(pMga->drmFD);
       pMga->irq = 0;
+      pMga->reg_ien = 0;
    }
 
    /* Cleanup DMA */

Comment 12 Mike A. Harris 2003-04-06 17:15:21 UTC
Date: Sat, 5 Apr 2003 13:17:08 -0800 (PST)
From: Linus Torvalds <torvalds>
To: Michel Dänzer <michel>
Cc: Eric Anholt <eta>, Panagiotis Papadakos <papadako.gr>,
DRI <dri-devel.net>
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
List-Id: <dri-devel.lists.sourceforge.net>
Subject: Re: MGA and lockups during shutdown or switchmode
 
 
On 4 Apr 2003, Michel Dänzer wrote:
> > But why does this cause a hang?
>
> I'm not sure, maybe some kernels and/or machines don't like the
> interrupt being enabled without the handler being installed. I couldn't
> reproduce the problem on my Macs.
 
Let's walk through it. First, the setup conditions:
 
 - let's say you have your graphics card on irq X.
 
 - it so happens that the network controller is _also_ on irq X, and the
   network driver has registered an interrupt handler for that irq.
 
Ok. Now look at what happens in two cases:
 
Case 1: enable interrupts before installing the handler
 
 - no display irq handler, but interrupt X comes in.
 
 - the irq dispatch sends the interrupt to the network driver, which has
   told it that it is interested in irq X.
 
 - the network driver looks at its hardware state, and most likely returns
   without doing anything, since the network card told it that there was
   nothing to do.
 
 - the interrupt dispatch returns.
 
 - MACHINE GOES BOOM AND HANGS! PCI interrupts are level-triggered, so the
   hardware will re-assert the interrupt that is still pending, and we
   will forever re-do this endless loop of calling the network driver,
   which will not do anything. Nothing will ever acknowledge the
   interrupt, and nothing will ever make it go away. And since the CPU is
   busy taking interrupts, the code sequence that _caused_ the lock-up in
   the first place will also never make any progress, and will never get
   to the point where it registers the video card interrupt handler.
 
Now, look at case 2:
 
Case 2: the driver installs the handler before it enables video card
interrupts:
 
 - irq X comes in, the irq dispatch logic dispatches it to both the
   network driver and the DRI driver. The network driver still has
   nothing to do (but the dispatcher can't know that - irq X is irq X, and
   there's no way to know them apart from a higher level), but the DRI
   driver will do the right thing AND THEN IT WILL ACK IT SO THAT IT SHUTS
   UP!
 
 - the irq dispatch returns, and things continue happily onward.

See? IT IS A *MAJOR* BUG TO ENABLE INTERRUPTS BEFORE YOU HAVE INSTALLED
THE HANDLER FOR IT.
 
(And yes, it sometimes just happens to work by pure luck. Maybe you don't
have any pending events, so enabling interrupts won't _do_ anything.
This is possibly why things hang on the _restart_ - because the first
invocation of X may have left stuff pending).
 
It also just "happens" to work if you don't have shared interrupts
(because then the irq dispatch logic will just mask off the irq totally
since there are no interrupt handlers at all for it). Again, that's just
pure luck, and nothing else. It's still a totally and utterly broken
driver, now it's just that it happens to work perfectly fine on some
hardware.
 
                Linus
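
The two cases in the email above can be sketched as a small user-space simulation of a shared, level-triggered IRQ line. Everything here is an assumption made for illustration: `struct device`, `run_irq_line()` and the two `case_*` helpers are hypothetical names, real kernel dispatch is far more involved, and the pass limit exists only so the stuck case terminates instead of hanging like the real machine does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device on a shared, level-triggered IRQ line. */
struct device {
    bool irq_enabled;   /* device is allowed to assert the line */
    bool pending;       /* device has an unacknowledged event */
    bool has_handler;   /* a driver handler is registered for it */
};

/* The line stays asserted while any enabled device has a pending event. */
static bool line_asserted(const struct device *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].irq_enabled && devs[i].pending)
            return true;
    return false;
}

/* Run every registered handler once; a handler acks only its own device. */
static void dispatch(struct device *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].has_handler && devs[i].pending)
            devs[i].pending = false;
}

/*
 * Dispatch until the line goes quiet. Returns the number of passes,
 * or -1 if the line is still asserted after max_passes (a stuck IRQ).
 */
static int run_irq_line(struct device *devs, size_t n, int max_passes)
{
    assert(max_passes > 0);
    int passes = 0;
    while (line_asserted(devs, n)) {
        if (passes == max_passes)
            return -1;
        dispatch(devs, n);
        passes++;
    }
    return passes;
}

/* Case 1: video IRQ enabled with an event pending, but no handler yet. */
static int case_enable_before_handler(void)
{
    struct device devs[2] = {
        { .irq_enabled = true, .pending = false, .has_handler = true  }, /* network */
        { .irq_enabled = true, .pending = true,  .has_handler = false }, /* video */
    };
    return run_irq_line(devs, 2, 1000);
}

/* Case 2: handler installed first, so the pending event gets acknowledged. */
static int case_handler_before_enable(void)
{
    struct device devs[2] = {
        { .irq_enabled = true, .pending = false, .has_handler = true }, /* network */
        { .irq_enabled = true, .pending = true,  .has_handler = true }, /* video */
    };
    return run_irq_line(devs, 2, 1000);
}
```

In this model `case_enable_before_handler()` exhausts the pass limit because the network handler never clears the video card's pending event, while `case_handler_before_enable()` settles after a single dispatch pass.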

Comment 13 Alan Cox 2003-04-06 17:17:43 UTC
Yes, this looks like the relevant stuff. I'm not sure only mga needs fixing tho 8(


Comment 14 Mike A. Harris 2003-04-13 01:26:15 UTC
*** Bug 88748 has been marked as a duplicate of this bug. ***

Comment 15 Mike A. Harris 2003-04-13 01:31:04 UTC
Ignore that dupe, made a typo.

Comment 16 Mike A. Harris 2003-04-25 08:04:03 UTC
XFree86-4.3.0-5 has a workaround for this until the kernel gets fixed
properly.

