Bug 119705

Summary: (I845G DRI) i830_wait_ring crash caused by screensaver on i845
Product: [Fedora] Fedora
Reporter: Dave Goldblatt <daveg>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED RAWHIDE
Severity: high
Priority: medium
Version: rawhide
CC: chak, fedora, howard-redhat, jdennis, jsd, nmarsh1, paul.0000.black, tao
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-05-05 10:11:05 UTC
Bug Blocks: 114963    

Description Dave Goldblatt 2004-04-01 17:14:57 UTC
With kernels older than kernel-2.6.4-1.300, the GL screensavers would trigger an assert
(bug 117713).  With kernel 2.6.4-1.300, the X server crashes and gets stuck in a restart loop.

The only way to recover X is to reboot (!).

Version-Release number of selected component (if applicable):
xorg-x11-0.0.6.6

How reproducible:
Always

Steps to Reproduce:
Run xscreensaver-demo on an i830-based platform (i845 in this case) and choose one of 
the GL screensavers.  X crashes and goes into a restart/fail loop. 

Additional info:

From XFree86.0.log.old:

Error in I830WaitLpRing(), now is 9184, start is 7183
pgetbl_ctl: 0x17fe0001 pgetbl_err: 0x49
ipeir: 0 iphdr: 431a0759
LP ring tail: 1a0 head: 0 len: 1f001 start 0
eir: 0 esr: 11 emr: ff7b
instdone: ffc1 instpm: 0
memmode: 0 instps: 40
hwstam: ffff ier: 0 imr: ffff iir: 0
space: 130648 wanted 131064

FatalError re-entered, aborting
lockup
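
The "space" and "wanted" figures in this dump are consistent with ordinary ring-buffer free-space arithmetic. Below is a minimal Python sketch of that arithmetic. It assumes the ring is 0x20000 bytes (128 KiB), that 8 bytes are always held in reserve, and that "head" and "tail" are the byte offsets printed above. The function name and constants are illustrative and are not copied from the driver source.

```python
# Assumed ring size: 128 KiB. This is an assumption, chosen because
# "wanted 131064" equals 0x20000 - 8.
RING_SIZE = 0x20000
RESERVED = 8  # assumed gap kept between tail and head


def ring_space(head: int, tail: int, size: int = RING_SIZE) -> int:
    """Free bytes in a circular buffer where the consumer owns `head`
    and the producer owns `tail` (hypothetical helper)."""
    space = head - (tail + RESERVED)
    if space < 0:
        space += size  # wrap around the end of the ring
    return space


# Values from the register dump: "LP ring tail: 1a0 head: 0"
print(ring_space(head=0x0, tail=0x1A0))  # 130648, the logged "space"
print(RING_SIZE - RESERVED)              # 131064, the logged "wanted"
```

Under these assumptions, "wanted" can only be satisfied once head catches up with tail. With head at 0 and tail at 0x1a0, that would be consistent with the hardware no longer consuming commands from the ring.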

From messages:
[...]
Apr  1 12:05:39 dgoldblatt-2k kernel: mtrr: base(0xe8020000) is not aligned on a size(0x300000) boundary
Apr  1 12:05:46 dgoldblatt-2k kernel: [drm:i830_wait_ring] *ERROR* space: 130648 wanted 131064
Apr  1 12:05:46 dgoldblatt-2k kernel: [drm:i830_wait_ring] *ERROR* lockup
Apr  1 12:05:48 dgoldblatt-2k gdm[3475]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Apr  1 12:05:49 dgoldblatt-2k kernel: mtrr: base(0xe8020000) is not aligned on a size(0x300000) boundary
Apr  1 12:05:56 dgoldblatt-2k kernel: [drm:i830_wait_ring] *ERROR* space: 130648 wanted 131064
Apr  1 12:05:56 dgoldblatt-2k kernel: [drm:i830_wait_ring] *ERROR* lockup
Apr  1 12:05:58 dgoldblatt-2k gdm[3480]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Apr  1 12:06:01 dgoldblatt-2k kernel: mtrr: base(0xe8020000) is not aligned on a size(0x300000) boundary
Apr  1 12:06:08 dgoldblatt-2k kernel: [drm:i830_wait_ring] *ERROR* space: 130648 wanted 131064
Apr  1 12:06:08 dgoldblatt-2k kernel: [drm:i830_wait_ring] *ERROR* lockup
Apr  1 12:06:10 dgoldblatt-2k gdm[3485]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
[etc]
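
For reference, the repeated mtrr warning can be checked by hand. A variable-range MTRR needs a power-of-two size and a base that is a multiple of that size. The Python sketch below applies that rule to the logged values. It is a simplified stand-in for the kernel's own validation, and the function name is hypothetical.

```python
def mtrr_range_ok(base: int, size: int) -> bool:
    """Simplified check (hypothetical helper, not the kernel's code):
    size must be a power of two and base must be a multiple of size."""
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and base % size == 0


# Values from the log: base(0xe8020000), size(0x300000)
print(mtrr_range_ok(0xE8020000, 0x300000))  # False: 3 MiB is not a power of two
print(mtrr_range_ok(0xE8000000, 0x400000))  # True: 4 MiB region on a 4 MiB boundary
```

The second call is only an illustration of a range that would pass the rule. It is not a value taken from this system.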

Comment 1 Manuel "Chilli" Chakravarty 2004-04-07 11:54:56 UTC
I am seeing exactly the same problem with an i855GME using kernel
2.6.4-1.305 with xorg-x11-0.6.6-0.2004_03_30.1.  BTW, an easier method
to trigger the bug is to start "glxgears".

Moreover, the same behaviour has been reported as a follow-up to the
fix for bug 117713.

Comment 2 Manuel "Chilli" Chakravarty 2004-04-08 16:00:44 UTC
The bug seems to be resolved in xorg-x11-0.6.6-0.2004_03_30.5 with
kernel-2.6.5-1.308.  I tested "glxgears" and the "GL ForestFire"
screen saver with the new package and both worked fine.  Inspection of
the log of the X server showed that GLX (with DRM) was enabled - in
addition to the frame rate being appropriate.

Comment 3 Dave Goldblatt 2004-04-12 15:16:39 UTC
Confirmed working with xorg-x11-0.0.6-0.2004_03_30_5 and kernel 2.6.5-1.309 from 
rawhide.


Comment 4 Paul Black 2004-04-13 10:20:46 UTC
The "current" rawhide SMP kernel (kernel-smp-2.6.5-1.315) doesn't work
for me (with xorg-x11-0.0.6-0.2004_03_30_5). With
kernel-smp-2.6.4-1.302, when I run jigglypuff (!)  I get the kernel
drm messages as above. With the latest kernel, I get a complete
lock-up with no useful output. The machine responds to ping but I
can't log in from a remote machine.


Comment 5 Dave Goldblatt 2004-04-14 20:09:02 UTC
Confirmed failures again with 315, 319, and 322 (after running the hypertorus screensaver).

Comment 6 Paul Black 2004-04-15 07:29:06 UTC
I suspect that this is now a kernel issue, as it causes ssh connections
into my machine to hang as well. It's not a complete kernel crash, as
ping still works. The same symptoms are present on a non-SMP kernel.



Comment 7 Nick Marsh 2004-04-22 01:04:47 UTC
Recreated on FC2T2 xorg-x11-0.0.6.6-0.0.2004_03_11.9 and
kernel-2.6.3-2.1.253.2.1

Fatal server error:
Caught signal 11.  Server aborting

# lspci
00:00.0 Host bridge: Intel Corp. 82850 850 (Tehama) Chipset Host Bridge (MCH) (rev 02)
00:01.0 PCI bridge: Intel Corp. 82850 850 (Tehama) Chipset AGP Bridge (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 02)
00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 02)
00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev a1)
02:09.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
02:09.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
02:0a.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 08)
02:0a.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 08)
02:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)

Comment 8 Arjan van de Ven 2004-05-03 14:43:02 UTC
Please try kernel 349 from
http://people.redhat.com/arjanv/2.6/
which has a bugfix in this area.

Comment 9 Dave Goldblatt 2004-05-03 15:46:22 UTC
Kernel 349 failed identically.


Comment 10 Arjan van de Ven 2004-05-03 15:49:22 UTC
Does "identical" mean "with the waitring message", or something else? Please specify.

Comment 11 Dave Goldblatt 2004-05-03 16:11:03 UTC
Sorry, incomplete post. It is identical to comment 4: no kernel messages, no X log
message, just a total lockup.  309 is the last kernel that does not exhibit this behavior.

Comment 12 Arjan van de Ven 2004-05-04 17:14:08 UTC
http://people.redhat.com/arjanv/2.6/

The 350 kernel works on my i865 machine (which uses the same drivers); 349
and earlier hung hard.


Comment 13 Dave Goldblatt 2004-05-04 18:47:08 UTC
The 350 kernel works for me.

Comment 14 Manuel "Chilli" Chakravarty 2004-05-05 00:59:14 UTC
On my i855GME machine, kernel 350 also fixes the problem.  I tested
both glxgears and the GLForestFire screen saver.  Both work perfectly
with hardware acceleration.

I am glad this got fixed before the release of FC2.  Thanks!

Comment 15 David Finch 2004-06-08 04:17:10 UTC
I've got an i845gv, running Fedora Core 2 (Kernel 2.6.5-1.358), and it
still crashes for me. I test by previewing every GL screensaver a
couple of times. After several tries it crashes; how many tries it takes
depends on which screensavers are run.

Xorg.0.log.old:
Error in I830WaitLpRing(), now is 5777, start is 3776
pgetbl_ctl: 0x1ffe0001 pgetbl_err: 0x49
ipeir: 0 iphdr: 7f200297
LP ring tail: 30 head: 0 len: 1f001 start 0
eir: 0 esr: 10 emr: ff7b
instdone: 6ac1 instpm: 0
memmode: 0 instps: 53
hwstam: ffff ier: 0 imr: ffff iir: 0
space: 131016 wanted 131064

Fatal server error:
lockup

messages:
Jun  7 20:46:10 localhost kernel: [drm:i830_wait_ring] *ERROR* space: 128484 wanted 131064
Jun  7 20:46:10 localhost kernel: [drm:i830_wait_ring] *ERROR* lockup
Jun  7 20:46:15 localhost kernel: [drm:i830_wait_ring] *ERROR* space: 128468 wanted 131064
Jun  7 20:46:15 localhost kernel: [drm:i830_wait_ring] *ERROR* lockup
...



Comment 16 Howard Holm 2004-07-18 21:30:36 UTC
This bug shouldn't be closed.  This is the same bug as 127222, which is
reproducible in kernel 2.6.6-1.435.2.3.  At least it's reproducible
for me running boxed.