Bug 54625 - ATI Rage M4 + Dell Inspiron 8000 -> White Screen of Death
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: XFree86
Version: 7.3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Mike A. Harris
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-10-14 03:51 UTC by Stephen John Smoogen
Modified: 2007-04-18 16:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2002-12-30 19:39:22 UTC
Embargoed:


Attachments
Working startx.out (11.78 KB, text/plain), 2001-10-24 19:17 UTC, Stephen John Smoogen
Current startx.out (11.52 KB, text/plain), 2002-02-27 15:19 UTC, Stephen John Smoogen

Description Stephen John Smoogen 2001-10-14 03:51:11 UTC
I want to apologize profusely to the QA guys at RH. This should have
been an early test machine for 7.2, but because it is my wife's mobile
work box, I didn't want to endanger it.

I have a Dell Inspiron 8000 with the ATI Rage Mobility M4 chipset. With
7.1, I had an occasional problem where, when the X session ended, the
machine would go into a 'white screen' mode where everything was whited
out and the keyboard was dead. The machine could be ssh'd into, but killing
the X session wouldn't bring it back, and you could not always reboot it
cleanly; it would hang somewhere in the last processes. One could also
reproduce the problem by just pressing CTRL-ALT-F1: the 'video switch' would
bring up the white screen.

With 7.2, this seems to be much, much worse. I am able to end X safely
maybe 1 out of 5 times without a white screen. I have been able to work
around this twice by pressing the <Function> and <Font> keys at the same
time, but the rest of the time the machine is still unusable.

I edited the /etc/X11/XF86Config-4 file and turned off every option in
the file I could (DRI, double-buffer, etc.). This did not have any
noticeable effect. I also tried changing to 8-bit and 24-bit mode. The
24-bit mode had the effect of the X server going into a vertically split
mode at 50% across the screen, but otherwise did not affect the
percentage of white-screen crashes (~80%). Using different screen modes
from 1280x1024 down to 640x480 also did not make it less likely to
occur.

At the moment, I am down to trying to get the frame buffer going, since I
don't have this problem during an install or upgrade. I would also like
any advice from Dell, XFree86, or kernel people on what I can try.

I can tinker with the box a bit more at the moment, since until
it's fixed I am to live in the dog house.

Further info:
BIOS APM set to using 8 megs of ram.
Dell Inspiron has 128 Megs Ram

Comment 1 Mike A. Harris 2001-10-21 21:50:31 UTC
>BIOS APM set to using 8 megs of ram.

You mean AGP aperture?  If so, the AGP aperture should be set to 1/2
of the system's RAM.  So in your case, 64 MB.  Not likely the cause
of the problem, but...

I assume you're using runlevel 3, but if not, can you try it?  When
this problem occurs, can you blindly type "startx" again, and see
if X starts up ok?  If it does, this is most likely related to the
VTSWITCH bugs people have been experiencing.

Comment 2 Stephen John Smoogen 2001-10-23 02:37:45 UTC
8 MB was the amount of memory reported by the BIOS. It isn't settable, so I am
guessing this is the amount that the card has hard-wired to it.

The keyboard is rarely operational when the problem occurs. I am going to make
sure that hot-keys are enabled in the kernel so that I can try to get it out of
RAW mode if that is what is happening.

I did notice one thing: it does seem to be in some weird video state. At first
you think you can see the X and some other things, but after a while the video
goes to either complete white or gray and white lines.

Hmmm, actually this looks to be kernel level. The box falls off the net when this
occurs; my ssh session is locked up and gone. The box seems to make one last
sync of discs and that is it. [ALT-SYSRQ keys do not respond, etc.]

I am going to set up runlevel 4 to have no daemons (apm et al.) and then boot
into it. I will try startx there. If the problem doesn't occur, I will go into
runlevel 3, which will have apmd et al. on, and then see if the problem occurs.
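
The runlevel plan above could be sketched roughly as follows. This is only a sketch: `services_on` is a hypothetical helper name, and it assumes the usual `chkconfig --list` line format of a service name followed by `N:on`/`N:off` fields.

```shell
#!/bin/sh
# List the services that are enabled in one runlevel.
# Reads `chkconfig --list` style lines on stdin, e.g.
#   apmd  0:off 1:off 2:on 3:on 4:on 5:on 6:off
# services_on is a hypothetical helper, not an existing tool.
services_on() {
    awk -v want="$1:on" '{
        # Field 1 is the service name; the rest are "level:state" pairs.
        for (i = 2; i <= NF; i++)
            if ($i == want) print $1
    }'
}

# Possible usage on the laptop (as root):
#   chkconfig --list | services_on 4    # what would still start in runlevel 4
#   chkconfig --level 4 apmd off        # repeat for each service to disable
#   telinit 4                           # switch to the stripped-down runlevel
```

Comparing the output for runlevels 3 and 4 shows which daemons differ between the two test boots.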

It being late, I will try it in the morning so I don't screw something up :)

Comment 3 Stephen John Smoogen 2001-10-23 13:15:08 UTC
Ok, the problem really seems to be some interaction between the kernel and X. The
box is wedging real hard and I can't see what is causing it. I am going to see
if I can bring it in so someone at RH can put a serial console on it. I will
also upgrade the kernel to 2.4.9-X to see if that helps any (if I can get a
download to work :))

Stephen

Comment 4 Stephen John Smoogen 2001-10-24 18:36:08 UTC
Ok, I have updated to kernel-2.4.9-7 and XFree86-4.1.0-4. The machine continues
to hard-lock at certain changes between X->non-X, though with less frequency. I
am going to downgrade the kernel from the i686 version to the i386 version and see
if that helps any. After that, it's serial console and looking for any tips that
might help.


Comment 5 Stephen John Smoogen 2001-10-24 19:17:40 UTC
Created attachment 34901 [details]
Working startx.out

Comment 6 Stephen John Smoogen 2001-10-24 19:18:53 UTC
Ok changing to i386 kernel seems to have helped (restarting and stopping X 5
times without a crash is a big bonus.) I am looking at differing messages in
/var/log/messages from the agpgart module.. and am wondering if this might point
to the issue.

Here are i686 messages...
messages:Oct 22 22:42:18 localhost kernel: Linux agpgart interface v0.99 (c) Jeff Hartmann
messages:Oct 22 22:42:18 localhost kernel: agpgart: Maximum main memory to use for agp memory: 94M
messages:Oct 22 22:42:18 localhost kernel: agpgart: agpgart: Detected an Intel i815, but could not find the secondary device. Assuming a non-integrated video card.
messages:Oct 22 22:42:18 localhost kernel: agpgart: Detected Intel i815 chipset
messages:Oct 22 22:42:18 localhost kernel: agpgart: AGP aperture is 64M @ xe4000000
messages:Oct 22 22:42:18 localhost kernel: agpgart: AGP mode is 1x


Here are the i386 2.4.9-7 messages...

messages:Oct 24 14:55:07 localhost kernel: Linux agpgart interface v0.99 (c) Jeff Hartmann
messages:Oct 24 14:55:07 localhost kernel: agpgart: Maximum main memory to use for agp memory: 94M
messages:Oct 24 14:55:07 localhost kernel: agpgart: agpgart: Detected an Intel i815, but could not find the secondary device. Assuming a non-integrated video card.
messages:Oct 24 14:55:07 localhost kernel: agpgart: Detected Intel i815 chipset
messages:Oct 24 14:55:07 localhost kernel: agpgart: AGP aperture is 64M @ 0xe4000000

The startx.out is included as an attachment.
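
For anyone repeating the comparison, the two sets of agpgart messages can be diffed once the timestamps are stripped. The following is a minimal sketch, assuming each boot's syslog has been saved to its own file; `agp_lines` and the file names in the usage comments are hypothetical.

```shell
#!/bin/sh
# Print the agpgart kernel messages from a syslog file with the leading
# "Mon DD HH:MM:SS host kernel: " prefix removed, so that two boots can
# be compared without timestamp noise.
# agp_lines is a hypothetical helper name, not an existing tool.
agp_lines() {
    grep 'kernel:.*agpgart' "$1" | sed 's/^.*kernel: //'
}

# Possible usage (file names are placeholders for logs saved per boot):
#   agp_lines messages.i686 > /tmp/agp.i686
#   agp_lines messages.i386 > /tmp/agp.i386
#   diff /tmp/agp.i686 /tmp/agp.i386
```

In the excerpts pasted above, the i686 boot shows an `AGP mode is 1x` line that the i386 excerpt lacks; that may simply reflect where the paste was cut, so a diff of the full logs would be more reliable.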

Comment 7 Mike A. Harris 2001-11-07 17:22:55 UTC
That just means you have the i815 chipset but not its integrated video.
If you aren't already, try upgrading to our latest kernel.  Some
kernel-level console switching bugs were fixed recently.  If that doesn't
do it, we'll have to explore something else.

Comment 8 Mike A. Harris 2002-02-09 14:33:40 UTC
Any word on if the new XFree86 + kernel update fixes this?

Comment 9 Stephen John Smoogen 2002-02-11 03:11:53 UTC
I have upgraded to the latest errata i386 (versus i686) builds of kernel and
glibc. I have also upgraded to the latest XFree86. I have run for 8 hours and not
had a lockup leaving X. I will pound on it for a couple more days and either give
more info or close the ticket. Or, if adventurous, try moving it to the i686
versions of kernel and glibc.

One problem I have found after the updates is that even though I said in
Xconfigurator that my Inspiron's LCD was Generic 1280x1024, X starts up and says
it doesn't know what that is and goes to 1024x768.

Comment 10 Stephen John Smoogen 2002-02-27 15:19:56 UTC
Created attachment 46780 [details]
Current startx.out

Comment 11 Stephen John Smoogen 2002-02-27 15:33:59 UTC
I have had lockups again with the system in the last 2 days. At first I thought
it was sound related, but turning off all sound etc. didn't help any. I can get
the system to lock up by doing the following:

Cause the system to fsck.
Log in as a user.
Run startx.
Switch to a text console with ALT-Fx, or log out of KDE.
Instant lockup.

So is the problem something to do with the kernel buffer cache?

Thanks for bearing with me.

Comment 12 Stephen John Smoogen 2002-03-11 18:50:04 UTC
Ok, I found a way to not cause the white screen of death after an fsck on the
Inspiron. 

startx
opened up an xterm
su - root
/etc/init.d/apmd stop
end X session

The system recovered normally 2 out of 2 times :). Hope this helps figure out
where it might be crashing.
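
The workaround steps above could be wrapped in a small helper run as root from an xterm before ending the session. This is a sketch only: `stop_apmd` and the `APMD_INIT` override are hypothetical names; the init script path is the one given in the steps.

```shell
#!/bin/sh
# Sketch of the workaround above: stop apmd before the X session ends.
# APMD_INIT defaults to the init script named in the steps; it is an
# overridable variable only so the function can be tried elsewhere.
APMD_INIT=${APMD_INIT:-/etc/init.d/apmd}

# stop_apmd is a hypothetical helper name.
stop_apmd() {
    if [ -x "$APMD_INIT" ]; then
        "$APMD_INIT" stop
    else
        echo "apmd init script not found at $APMD_INIT" >&2
        return 1
    fi
}

# Possible usage, as root, inside the running X session:
#   stop_apmd && echo "apmd stopped; now end the X session"
```

Whether stopping apmd avoids the lockup in general is not established here; the report above only covers two successful attempts.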


Comment 13 Stephen John Smoogen 2002-03-25 17:50:35 UTC
Wanted to update the bug. The problem still occurs if the 'APM' on the system has
been 'activated' (battery low, case closed, system manually put to sleep). If
Dell has anything that I can try in the BIOS (update it?) I will gladly do so.

Comment 14 Mike A. Harris 2002-07-26 16:18:08 UTC
Please look at bug #65136 and see if it sounds similar.  The bugfix for
that bug will be available soon.  This bug should be closed as a duplicate
of that bug if they are the same.

Comment 15 Stephen John Smoogen 2002-08-27 12:51:30 UTC
Ack. Well, I thought it was closable. This bug should be considered bug 72672
now, I think.

Comment 16 Stephen John Smoogen 2002-12-30 19:39:22 UTC
I have only had an occasional lockup after going to the 2.4.18-18.x kernel and
8.0. I am going to install 8.0.93 and see if the problem shows up there. If it
does, then I will open a new bug report.

