Bug 107176 - Complete Lockup on IBM Thinkpad T22 during X activity
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: XFree86
Version: 9
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: X/OpenGL Maintenance List
QA Contact: David Lawrence

Reported: 2003-10-15 13:43 EDT by Kwan Lowe
Modified: 2007-04-18 12:58 EDT
Doc Type: Bug Fix
Last Closed: 2004-10-01 02:37:08 EDT

Attachments: None

Description Kwan Lowe 2003-10-15 13:43:19 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
During X activity (scrolling browser windows, moving windows in KDE, switching
between virtual desktops) the system will freeze completely (no
keyboard/mouse/video, Magic Sysreq doesn't do a thing, can't ping). 

Hardware components:
Video 
01:00.0 VGA compatible controller: S3 Inc. 86C270-294 Savage/IX-MV (rev 13) (prog-if 00 [VGA])
    Subsystem: IBM ThinkPad T20
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64 (1000ns min, 63750ns max), cache line size 08
    Interrupt: pin A routed to IRQ 11
    Region 0: Memory at f0000000 (32-bit, non-prefetchable) [size=128M]
    Expansion ROM at <unassigned> [disabled] [size=64K]
    Capabilities: [dc] Power Management version 1
        Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [80] AGP version 1.0
        Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
        Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>


The XF86Config file contains:

Section "Module"
    Load  "dbe"
    Load  "extmod"
    Load  "fbdevhw"
    Load  "glx"
    Load  "record"
    Load  "freetype"
    Load  "type1"
    Load  "dri"
EndSection

Section "Device"
    Identifier  "Videocard0"
    Driver      "savage"
    VendorName  "Videocard vendor"
    BoardName   "S3 Savage/IX"
    VideoRam    8192
EndSection

(please email if you need the full file).

I am reasonably certain that it is not a hardware problem since I can run
non-interactive processes that consume memory and CPU for hours without problem
on both RH9 and a dual-boot Win2KPRO.  These apps include POVRAY, DVDRIP and
various short but intensive C math applications. The machine can be otherwise
idle but doing something innocuous such as scrolling down a web page in either
Mozilla or Konqueror will cause the lockup.

I've disabled the SpeedStep and power saving features in the BIOS. This hang
occurs with both the stock updated kernel (2.4.20-20.9) and a custom kernel whose
only change is that CONFIG_APM_ALLOW_INTS is set (it is unset in the default
kernel).

Version-Release number of selected component (if applicable):
XFree86-4.3.2 kernel-2.4.20-20.9 

How reproducible:
Always

Steps to Reproduce:
1. Boot in runlevel 5 and login
2. Open X application such as Konqueror and load long page
3. Quickly scroll through the document, switch desktops, etc.
    

Actual Results:  Machine completely froze - unable to ping, Magic SysReq
ignored, serial console dead, CTL-ALT-DEL does nothing, no screen updates, no HD
activity. 

Expected Results:  Machine should continue as normal.

Additional info:

Possibly related bugs: 28526 42592 

Hardware: 1024x768 display, 16-bit color.
Comment 1 Kwan Lowe 2003-10-15 16:04:01 EDT
It appears to be something with the S3 driver itself. Running "x11perf -all"
using the S3 driver locked the machine within a few minutes on each of the first
three times that I tried. I've now been running x11perf under the VESA driver
for a couple of hours and it's still going.
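
The comparison run can be reproduced by swapping only the Driver line of the
Device section quoted in the description. A minimal sketch, assuming nothing
else in the file was changed for the VESA test (the exact file used for that run
was not posted):

```
Section "Device"
    Identifier  "Videocard0"
    # Generic VESA driver in place of "savage", used only to test stability
    Driver      "vesa"
    VendorName  "Videocard vendor"
    BoardName   "S3 Savage/IX"
    VideoRam    8192
EndSection
```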
Comment 2 Mike A. Harris 2003-10-15 17:10:07 EDT
Try using Option "noaccel" in the driver section.  Report back if that
works around the issue temporarily.  If so, comment out the noaccel option
and try the various XaaNo... options listed on the XF86Config manpage to
try to find the troublesome XAA primitive that is presumably causing the
problem.

If you can isolate it, we might be able to fix it or work around it in
the driver.

TIA.
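
For reference, the suggested workaround would look roughly like this when added
to the Device section quoted in the description. This is a sketch: only the
Option line is new, and everything else is copied from the reporter's
XF86Config excerpt.

```
Section "Device"
    Identifier  "Videocard0"
    Driver      "savage"
    VendorName  "Videocard vendor"
    BoardName   "S3 Savage/IX"
    VideoRam    8192
    # Temporary workaround: turn off all 2D acceleration in the driver
    Option      "noaccel"
EndSection
```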
Comment 3 Kwan Lowe 2003-10-15 19:32:07 EDT
Yup, noaccel allowed the x11perf tests to complete. Trying the XaaNo.. options
now. This may take a while :D. 
Comment 4 Kwan Lowe 2003-10-16 15:09:59 EDT
Some more info:
Adding all the XaaNo options allows the tests to run to completion reliably (3
complete tests of 1 repetition each).

Commenting out 'Option "XaaNoSolidBresenhamLine"' in my XF86Config definitely
causes x11perf to fail. Unfortunately, it appears that there's at least one more
Xaa accel that's problematic.

I tried enabling half of the options and commenting out the other half, then
running the test. It failed, so I swapped the two halves. It still failed.
Thinking it would save time, I uncommented all the XaaNo options and then tried
commenting them out individually. I could reliably complete the tests with the
following options commented out (i.e., the accel was enabled):

         # Option       "XaaNoSolidFillRect"
         # Option       "XaaNoSolidFillTrap"
         # Option       "XaaNoSolidHorVertLine"
         # Option       "XaaNoSolidTwoPointLine"

If I then comment out 'Option "XaaNoSolidBresenhamLine"' and disable all the
other accelerations, the test fails consistently (complete lockup). The problem
is that if I enable all the rest of the accelerations but disable
SolidBresenhamLine, the test still fails.  This will probably take another day or
so to determine which other accel is causing the hang.  On top of this, x11perf
does not seem to fail in the same place each time.

So far I've not been doing a complete reboot between successful tests, only
dropping to init 3 then back to init 5.

I hope that sounds clearer to you :D
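
A single trial in this kind of elimination could be set up as in the sketch
below, assuming the options live in the Device section from the description.
Only option names mentioned in this comment are shown; the remaining XaaNo...
options from the XF86Config manpage would be listed the same way. Which option
is under test here is chosen purely for illustration.

```
Section "Device"
    Identifier  "Videocard0"
    Driver      "savage"
    VendorName  "Videocard vendor"
    BoardName   "S3 Savage/IX"
    VideoRam    8192
    # Under test: commented out, so this primitive stays accelerated
    # Option    "XaaNoSolidFillTrap"
    # Still disabled for this run
    Option      "XaaNoSolidBresenhamLine"
    Option      "XaaNoSolidFillRect"
    Option      "XaaNoSolidHorVertLine"
    Option      "XaaNoSolidTwoPointLine"
    # ... remaining XaaNo... options from the manpage go here
EndSection
```

If x11perf completes with this file, the commented-out primitive is presumably
safe to leave accelerated. If the machine locks up, its XaaNo option goes back
into the required list.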
Comment 5 Mike A. Harris 2003-10-20 05:11:25 EDT
Ok, once you've narrowed down the specific options needed for stability,
I'll investigate having it default to being disabled for that specific
chip.

TIA
Comment 6 Kwan Lowe 2003-10-26 16:40:23 EST
OK, I believe that I've narrowed it down to this list of required XaaNo options:

	 Option	"XaaNoImageWriteRect"
	 Option	"XaaNoMono8x8PatternFillRect"
	 Option	"XaaNoScanlineCPUToScreenColorExpandFill"
	 Option	"XaaNoScreenToScreenCopy"
	 Option	"XaaNoSolidBresenhamLine"
	 Option	"XaaNoSolidFillRect"

Without these, "x11perf -all -repeat 1" will always fail. With these options it
seems completely stable. I've been running this configuration, launching several
x11perf sessions, dropping back and forth between the console and X, scrolling
windows, and generally just using X for two days now without any problems.
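
Put together with the Device section from the description, the configuration
reported as stable would look roughly like this. It is an assembled sketch,
since the full file was not posted. It assumes the options were placed in the
Device section and that the other entries were left unchanged.

```
Section "Device"
    Identifier  "Videocard0"
    Driver      "savage"
    VendorName  "Videocard vendor"
    BoardName   "S3 Savage/IX"
    VideoRam    8192
    # XAA primitives that had to be disabled to avoid the lockup
    Option      "XaaNoImageWriteRect"
    Option      "XaaNoMono8x8PatternFillRect"
    Option      "XaaNoScanlineCPUToScreenColorExpandFill"
    Option      "XaaNoScreenToScreenCopy"
    Option      "XaaNoSolidBresenhamLine"
    Option      "XaaNoSolidFillRect"
EndSection
```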
Comment 7 Mike A. Harris 2004-10-01 02:37:08 EDT
Since this bugzilla report was filed, there have been several major
updates to the X Window System, which may resolve this issue.  Users
who have experienced this problem are encouraged to upgrade to the
latest version of Fedora Core, which can be obtained from:

If this issue turns out to still be reproducible in the latest
version of Fedora Core, please file a bug report in the X.Org
bugzilla located at http://bugs.freedesktop.org in the "xorg"
component.

Once you've filed your bug report to X.Org, if you paste the new
bug URL here, Red Hat will continue to track the issue in the
centralized X.Org bug tracker, and will review any bug fixes that
become available for consideration in future updates.
