440384 – PATCH: Any opengl use hardfreezes machine i386, intel 82865G

Summary: PATCH: Any opengl use hardfreezes machine i386, intel 82865G

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	low
Severity:	low
Target Milestone:	---
Assignee:	Dave Airlie
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:	NeedsRetesting
Depends On:
Blocks:	F9KernelBlocker

Reported:	2008-04-03 09:11 UTC by Hans de Goede
Modified:	2013-01-10 04:38 UTC
CC List:	4 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-04-24 01:28:52 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
xorg.conf used when generating xorg.0.log.with_config (572 bytes, text/plain) 2008-04-03 09:11 UTC, Hans de Goede
xorg.log generated with the previously attached config (28.76 KB, text/plain) 2008-04-03 09:11 UTC, Hans de Goede
xorg.log generated without any config (31.84 KB, text/plain) 2008-04-03 09:12 UTC, Hans de Goede
PATCH: Any opengl use hardfreezes machine i386, intel 82865G (2.37 KB, patch) 2008-04-18 13:19 UTC, Hans de Goede

Description Hans de Goede 2008-04-03 09:11:04 UTC

Description of problem:
Any OpenGL use (glxgears, for example) hard-freezes the machine. A few xorg
versions back I got a kernel panic (2 keyboard LEDs blinking). With F-8 things
work perfectly.

00:02.0 VGA compatible controller: Intel Corporation 82865G Integrated Graphics
Controller (rev 02)

Used versions:
mesa-libGL-7.1-0.21.fc9
xorg-x11-drv-i810-2.2.1-19.fc9
xorg-x11-server-Xorg-1.4.99.901-16.20080401.fc9
libdrm-2.4.0-0.10.fc9.i386
kernel-2.6.25-0.177.rc7.git6.fc9.i686

Comment 1 Hans de Goede 2008-04-03 09:11:04 UTC

Created attachment 300201 [details]
xorg.conf used when generating xorg.0.log.with_config

Comment 2 Hans de Goede 2008-04-03 09:11:40 UTC

Created attachment 300202 [details]
xorg.log generated with the previously attached config

Comment 3 Hans de Goede 2008-04-03 09:12:34 UTC

Created attachment 300203 [details]
xorg.log generated without any config

Note that the freezing also happens without any xorg.conf file present.

Comment 4 Hans de Goede 2008-04-03 09:13:53 UTC

This bug might be related to, or a duplicate of:
bug 427643, also an 82865, but the mobile variant
bug 438054 and bug 440070, both 82845s

Comment 5 Joachim Frieben 2008-04-05 16:25:40 UTC

The same happens with current "rawhide" and an ATI X800. The bug might thus be
due to the recent DRI2 move and not depend on the chipset. Either the
desktop freezes completely, or it freezes except for the mouse pointer.

Comment 6 Jesse Keating 2008-04-06 13:49:08 UTC

We just fixed some crashing stuff, please retry with
xorg-x11-server-1.4.99.901-17.20080401.fc9 or later.

Comment 7 Joachim Frieben 2008-04-07 12:20:02 UTC

Right, xorg-x11-server-1.4.99.901-17.20080401.fc9 restores
direct rendering capability again with my ATI Radeon X800.

Comment 8 Hans de Goede 2008-04-07 20:31:30 UTC

Unfortunately, even with xorg-x11-server-1.4.99.901-17.20080401.fc9, any OpenGL
usage still hard-freezes my machine. Since I saw signs of a kernel panic a few
(rawhide) versions back, I tried booting the latest 2.6.24 kernel from F-8
updates-testing, and with that kernel OpenGL works fine!

So this might be an xorg bug, which only gets triggered when certain features
get enabled which cannot be enabled with the 2.6.24 kernel. Note I get a "ttm
not available, using classic mode" message with both the F-8 and the rawhide
kernel, so this is not caused by ttm, as that is not used with the (freezing)
rawhide kernel either.

Comment 9 Hans de Goede 2008-04-09 14:39:29 UTC

Some more input:
-amazingly enough compiz does work, but running glxgears, both with and without
 compiz hardfreezes the machine
-this still happens with:
 xorg-x11-server-*-1.4.99.901-18.20080401.fc9
 xorg-x11-drv-i810-2.2.1-20.fc9
 kernel-2.6.25-0.216.rc8.git7.fc9.i686
 mesa-libGL-7.1-0.21.fc9
 libdrm-2.4.0-0.10.fc9.i386

Comment 10 Hans de Goede 2008-04-18 10:24:53 UTC

I've hooked up a serial terminal and caught the kernel panic that happens when
the system freezes. The cause is a NULL pointer dereference in IRQ context, so
this clearly seems to be a kernel bug; changing component and assignee.

Here is a copy of the caught panic:
BUG: unable to handle kernel NULL pointer dereference at 008
IP: [<f89a5fec>] :i915:i915_vblank_tasklet+0x13e/0x65e
Oops: 0000 [#1] SMP
Modules linked in: fscher hwmon fuse ipv6 nf_conntrack_ipv4 xt_state nf_conntra]
Pid: 0, comm: swapper Not tainted (2.6.25-0.234.rc9.git1.fc9.i686 #1)
EIP: 0060:[<f89a5fec>] EFLAGS: 00010087 CPU: 0
EIP is at i915_vblank_tasklet+0x13e/0x65e [i915]
EAX: 00000000 EBX: c0798f98 ECX: f785f800 EDX: 03cc0000
ESI: f785f9e4 EDI: ef2ad460 EBP: c0798fac ESP: c0798ee8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c0798000 task=c07123a0 task.ti=c0747000)
Stack: c17fc4d0 5357a358 00000007 00000000 c0798f98 00000000 c17f8a80 c0798f28
       c041e466 c17fc480 f785f800 f6eca100 03cc0000 f785fea4 f6eca104 c17fc480
       c0798f38 c041e4c1 f6eca000 f6e0b218 f6eca104 00000000 c17fc480 ef1b2e80
Call Trace:
 [<c041e466>] ? enqueue_entity+0x1a4/0x1c4
 [<c041e4c1>] ? enqueue_task_fair+0x3b/0x3f
 [<c041fe93>] ? try_to_wake_up+0x193/0x19d
 [<f89d40a8>] ? drm_lock_take+0x6c/0xbb [drm]
 [<f89d3826>] ? drm_locked_tasklet_func+0x5e/0x8c [drm]
 [<c042b214>] ? tasklet_hi_action+0x5e/0xb3
 [<c042b879>] ? __do_softirq+0x79/0xe7
 [<c0407ddb>] ? do_softirq+0x74/0xb5
 [<c045cca3>] ? handle_fasteoi_irq+0x0/0xaf
 [<c042b681>] ? irq_exit+0x38/0x6b
 [<c0407ec8>] ? do_IRQ+0xac/0xc4
 [<c04065e7>] ? common_interrupt+0x23/0x28
 [<c043007b>] ? switch_uid+0x15/0x6a
 [<c0418c8b>] ? native_safe_halt+0x5/0x7
 [<c0404bf0>] ? default_idle+0x46/0x7e
 [<c0404baa>] ? default_idle+0x0/0x7e
 [<c0404b8a>] ? cpu_idle+0xa1/0xc1
 [<c061c1c5>] ? rest_init+0x49/0x4b
 =======================
Code: 00 25 00 00 00 01 83 f8 01 19 c0 f7 d0 83 e0 04 8b 44 05 d4 2b 47 10 3d 0
EIP: [<f89a5fec>] i915_vblank_tasklet+0x13e/0x65e [i915] SS:ESP 0068:c0798ee8
Kernel panic - not syncing: Fatal exception in interrupt
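
As general background for reading a trace like this: a fault address that is
small but nonzero usually means code read a member through a NULL structure
pointer, so the faulting address equals that member's offset. Below is a
minimal user-space sketch of that arithmetic. The struct demo_object and its
members are made up for illustration; they do not reflect the real DRM
structure layout, and the sketch does not claim which member was read in the
panic above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration; not the real DRM structure. */
struct demo_object {
    int index;
    int type;
    void *payload;
};

/* Address the CPU would access when reading base->payload. */
static uintptr_t demo_payload_address(uintptr_t base)
{
    return base + offsetof(struct demo_object, payload);
}

/* Address that would fault if the base pointer were NULL. */
static uintptr_t demo_null_fault_address(void)
{
    uintptr_t addr = demo_payload_address(0);

    /* With a NULL base, the address is exactly the member offset. */
    assert(addr == offsetof(struct demo_object, payload));
    /* payload is not the first member, so the address is nonzero. */
    assert(addr != 0);
    return addr;
}
```

Calling demo_null_fault_address() returns the offset of payload, which is the
address a NULL-based read of that member would touch.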

Comment 11 Hans de Goede 2008-04-18 12:07:01 UTC

Some more info:

This turned out to be happening on the machines at my work because of an
/etc/drirc file, a leftover from long ago, which told mesa to always swap
buffers on vsync.

Running driconf and choosing "Always synchronize with vertical refresh" should
let you reproduce this on almost any intel machine.

This is partially good news, as it means that with the default settings this
won't happen unless the application asks for synchronization with the vertical
refresh.

Comment 12 Hans de Goede 2008-04-18 13:01:28 UTC

Note: I've found and fixed the problem now, patch + explanation next.

Comment 13 Hans de Goede 2008-04-18 13:19:48 UTC

Created attachment 302882 [details]
PATCH: Any opengl use hardfreezes machine i386, intel 82865G

The problem is that i915_vblank_tasklet() dereferences vbl_swap->minor->master
to get a pointer to its private driver data. However, vbl_swap points to an
entry of the vbl_swaps list, and the only place where entries are added to this
list is i915_vblank_swap(), which does not fill in the minor member of the
vbl_swap struct.

This is not hard to fix, as the private driver data is already available in
i915_vblank_tasklet(). The attached patch removes the minor field from the
vbl_swap struct and stops setting master_priv to
vbl_swap->minor->master->driver_priv, which isn't needed anyway, as it is
already initialized and pointing to the master driver private data.

The patch also moves the initialization of sarea_priv and pitchropcpp higher, as
gcc is rightfully complaining that they can be used uninitialized here.
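
The explanation above can be sketched in user space roughly as follows. This
is only an illustration of the bug pattern and the shape of the fix, not the
actual kernel code or the attached patch. All demo_* names and the struct
layouts are hypothetical stand-ins, and the NULL guard in the old lookup path
exists only so the sketch can run without crashing (the kernel code had no
such guard).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the DRM structures. */
struct demo_master { void *driver_priv; };
struct demo_minor  { struct demo_master *master; };

struct demo_vbl_swap {
    struct demo_minor *minor;   /* the member the patch removes */
    unsigned int sequence;
};

/* Queueing side: fills in the fields it knows about, never sets minor. */
static struct demo_vbl_swap *demo_queue_swap(unsigned int sequence)
{
    struct demo_vbl_swap *swap = calloc(1, sizeof(*swap));

    if (swap != NULL)
        swap->sequence = sequence;
    return swap;
}

/*
 * Old lookup path: swap->minor->master->driver_priv.
 * Without the guard, a NULL minor is dereferenced here; the guard
 * returns NULL instead so the demo can report the problem.
 */
static void *demo_priv_via_swap(const struct demo_vbl_swap *swap)
{
    if (swap->minor == NULL || swap->minor->master == NULL)
        return NULL;
    return swap->minor->master->driver_priv;
}

/* Fixed path: use the private data the handler already holds. */
static void *demo_priv_fixed(void *handler_priv)
{
    return handler_priv;
}

/* Returns 0 when the demo behaves as described, -1 on allocation failure. */
static int demo_run(void)
{
    int device_state = 42;              /* stands in for driver private data */
    void *handler_priv = &device_state;
    struct demo_vbl_swap *swap = demo_queue_swap(1);

    if (swap == NULL)
        return -1;

    /* The queued entry never had minor set, so the old path finds nothing. */
    assert(demo_priv_via_swap(swap) == NULL);
    /* The fixed path does not touch the swap entry at all. */
    assert(demo_priv_fixed(handler_priv) == &device_state);

    free(swap);
    return 0;
}
```

Calling demo_run() from a small main() exercises both paths. The design point
is that the fixed lookup does not depend on any field of the queued entry, so
the unset member can simply be removed from the struct.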

Comment 14 Hans de Goede 2008-04-18 13:28:52 UTC

Patch also sent upstream:
https://bugs.freedesktop.org/show_bug.cgi?id=15580

Comment 15 Hans de Goede 2008-04-18 13:33:19 UTC

airlied, putting you in the CC as this is a _nasty_ drm bug. A patch fixing it
is attached; it would be nice if this could be fixed before F-9 final.

Comment 16 Dave Airlie 2008-04-23 06:58:57 UTC

I've put what I think is the correct fix for this into kernel-2.6.25-8.fc9, I'd
appreciate testing it from koji if you get a chance.

Comment 17 Hans de Goede 2008-04-23 14:54:16 UTC

(In reply to comment #16)
> I've put what I think is the correct fix for this into kernel-2.6.25-8.fc9, I'd
> appreciate testing it from koji if you get a chance.

I can confirm that 2.6.25-8.fc9 no longer kernel panics when doing vsynced
opengl on my intel 82865G machine.
