Bug 510948

Summary: Xorg server freezes when running some Mesa apps
Product: [Fedora] Fedora
Reporter: Patrick O'Callaghan <poc>
Component: xorg-x11-drv-intel
Assignee: Adam Jackson <ajax>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: medium
Version: 11
CC: 04mvs89, ajax, brian, dominik.stadler, mcepl, mishu, poc, Rick0157, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2010-06-28 13:35:46 UTC Type: ---
Attachments:
Description: Gdb trace of X freeze testcase (Flags: none)

Description Patrick O'Callaghan 2009-07-12 18:45:47 UTC
Description of problem: When running the foobillard billiards simulation game, the X server freezes. The mouse still moves but no buttons have any effect, nor do keyboard commands. The rest of the system is still up (can log in via ssh) but the server cannot be killed. System reboot is the only option.


Version-Release number of selected component (if applicable): xorg-x11-server-Xorg-1.6.1.901-2.fc11.x86_64 and xorg-x11-server-Xorg-1.6.2-2.fc11.x86_64, foobillard-3.0a-12.x86_64, xorg-x11-drv-intel-2.7.0-7.fc11.x86_64.


How reproducible: 100% repeatable


Steps to Reproduce:
1. Start foobillard
2. Play a shot
3. After balls come to rest, X freezes
  
Actual results:
Frozen X server

Expected results:
Working X server

Additional info:
This bug is new with Xorg-1.6.1.901-2.fc11.x86_64. In previous versions the game worked fine. I updated from stable to 1.6.2-2 from Koji but it made no difference. The attached gdb trace is from the Koji version. There is no xconf file.

Comment 1 Patrick O'Callaghan 2009-07-12 18:47:54 UTC
Created attachment 351401 [details]
Gdb trace of X freeze testcase

Gdb trace taken from an ssh session on another machine. Note that after the freeze I had to interrupt gdb to get the console and do the traceback.
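
For anyone trying to capture a similar trace, here is a minimal sketch in Python of attaching gdb to a hung X server from an ssh session and saving a backtrace of every thread. It is not the procedure used for the attachment above. The helper names and the output file name are hypothetical; `-p`, `-batch` and `-ex` are standard gdb options. It assumes gdb and the matching debuginfo packages are installed and that the script runs with enough privileges to attach to the server process.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a full
    backtrace of every thread, and then exits."""
    return [
        "gdb",
        "-p", str(pid),                     # attach to the running process
        "-batch",                           # exit once the commands have run
        "-ex", "thread apply all bt full",  # backtrace of every thread
    ]


def capture_backtrace(pid, out_path="xorg-backtrace.txt"):
    """Run gdb against `pid` and save its output.

    `out_path` is a hypothetical file name chosen for this sketch.
    """
    with open(out_path, "w") as out:
        subprocess.run(gdb_backtrace_command(pid), stdout=out,
                       stderr=subprocess.STDOUT, check=False)
    return out_path
```

Calling `capture_backtrace(pid)` with the PID of the Xorg process writes the trace to a file that can then be attached to the bug.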

Comment 2 Patrick O'Callaghan 2009-07-15 16:07:20 UTC
I experimented with xorg-x11-drv-intel-2.8.0-0.3.fc12.x86_64.rpm (from Rawhide) and it *seems* to fix the problem. Any chance of this being made available for F11 (I'm nervous about mixing Fedora rpms at this stage)?

Comment 3 Patrick O'Callaghan 2009-07-15 16:08:07 UTC
Appears very similar to https://bugzilla.redhat.com/show_bug.cgi?id=509598

Comment 4 Matěj Cepl 2009-07-23 16:33:07 UTC
*** Bug 509598 has been marked as a duplicate of this bug. ***

Comment 5 Patrick O'Callaghan 2009-07-23 19:20:36 UTC
A couple of additional comments:

1) A few seconds after the X server freezes I get a brief audible tone or beep, different from the standard system bell, which may indicate some buffer has filled up.

2) It's not even necessary to play a shot in foobillard. The freeze happens even when I just hit a keyboard command (ESC for the Help menu) with no mouse interaction.

If there are any suggestions for getting better tracebacks, I'd be happy to try. I really want this bug fixed.

Comment 6 Patrick O'Callaghan 2009-07-25 00:59:16 UTC
Could also be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=510169. The recommendation there is to downgrade mesa. I'll report back when I've tried it.

Comment 7 Patrick O'Callaghan 2009-07-25 02:40:04 UTC
Confirmed: downgrading to Mesa 7.5-0.14.fc11.x86_64 (from 7.6-0.1.fc11.x86_64) makes the bug go away.

Comment 8 Patrick O'Callaghan 2009-07-27 00:22:54 UTC
The problem can also be reproduced with /usr/lib64/mesa/winpos from mesa-demos-7.6-0.1.fc11.x86_64. Run the demo and the X server freezes immediately. It also gives the strange beep after a few seconds so looks like the same bug.

Comment 9 Patrick O'Callaghan 2009-08-04 18:48:01 UTC
I neglected to mention that turning mode-setting on or off makes no difference.

Comment 10 Patrick O'Callaghan 2009-08-21 20:47:38 UTC
The problem seems to be fixed with kernel-2.6.30.5-32.fc11.x86_64 (currently in updates-testing). Both foobillard and the winpos demo are working correctly. Note that I didn't need to change xorg-x11-drv-intel or Mesa to get this to work.

Comment 11 Patrick O'Callaghan 2009-10-02 18:54:32 UTC
It seems to have returned in kernel-2.6.30.5-43.fc11.x86_64 and kernel-2.6.30.8-64.fc11.x86_64, with two differences: 1) it's much less frequent: the earlier problem was reproducible every time I ran foobillard, whereas the new one happens maybe once in every 5 or so runs; and 2) the mouse is now also frozen, whereas earlier it could still be moved.

The winpos demo also seems not to cause the freeze (in a small number of tests) whereas earlier it was solid.
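
With an intermittent failure like this, a handful of clean runs says little about whether the bug is gone. The sketch below estimates how many consecutive freeze-free runs are needed before a still-present bug would be unlikely to have stayed hidden. It assumes each run freezes independently with a fixed probability, and it reads "once every 5 or so runs" as a 20% failure rate; both are assumptions, not measurements from this report. The function name is hypothetical.

```python
import math


def clean_runs_needed(failure_rate, confidence=0.95):
    """Return the number of consecutive freeze-free runs after which the
    chance of seeing such a streak with the bug still present is at most
    1 - confidence.

    Assumes each run fails independently with probability `failure_rate`.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(n clean runs | bug present) = (1 - failure_rate) ** n
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# "Once every 5 or so runs" taken as a 20% failure rate (an assumption).
print(clean_runs_needed(0.2))        # 14
print(clean_runs_needed(0.2, 0.99))  # 21
```

Under these assumptions, roughly 14 clean runs in a row are needed for 95% confidence, since 0.8^14 is about 0.044 while 0.8^13 is still about 0.055.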

Comment 12 Rick Retterer 2009-10-05 06:06:13 UTC
Patrick,
  I concur with you.  The hang problem has returned in the kernels 

kernel-2.6.30.5-43.fc11.i586
-and-
kernel-2.6.30.8-64.fc11.i586

For me, the symptoms are the same as before... Screen locks up completely.
No errors are logged into the messages file, or the Xorg.0.log or .xsession-errors files.

To reproduce the problem, run any gl* application or video playback software such as xine and as soon as the application comes up *BAM* hangs solid.
GL-Cairo-Dock will hang it solid also.

It also hangs on my screen saver GlMatrix, GLPlanet, etc...

I'm using the 82915G/GV/910GL Integrated Graphics Controller [8086:2582]

driver: i915
	latency: 0
resources:
	irq: 16
	memory: cfe00000-cfe7ffff
	ioport: 1800(size=8)
	memory: e0000000-efffffff(prefetchable)
	memory: cff00000-cff3ffff


1GB of RAM memory, 1GB of SWAP space and 650GB of disk space.

product: Intel(R) Pentium(R) 4 CPU 3.20GHz
vendor: Intel Corp.
bus info: cpu@0
version: 15.4.1
serial: 0000-0F41-0000-0000-0000-0000
slot: XU1 PROCESSOR
size: 3200MHz
capacity: 3200MHz
width: 32 bits
clock: 800MHz

I wish I could get a decent error message to go on, but recently I haven't.  I have also noticed that with the kernel argument "nomodeset" my system cannot start the Xorg X server: it loops over and over, starting the X server, crashing, and restarting.

Anyway, that's my problem with the Xorg intel video drivers.

Cheers,
Rick Retterer

Comment 13 Rick Retterer 2009-10-05 06:20:08 UTC
I did find some errors in my /var/log/messages file:

Sep  7 01:20:52 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 01:20:52 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 01:20:52 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Sep  7 01:20:52 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 01:20:52 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 01:20:52 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -12
Sep  7 01:21:02 localhost kernel: [drm:i915_gem_object_pin_and_relocate] *ERROR* No GTT space found for object 5
Sep  7 01:21:02 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -22
Sep  7 01:21:02 localhost kernel: [drm:i915_gem_object_pin_and_relocate] *ERROR* No GTT space found for object 5
Sep  7 01:21:02 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -22
Sep  7 01:21:02 localhost kernel: [drm:i915_gem_object_pin_and_relocate] *ERROR* No GTT space found for object 5
Sep  7 01:21:02 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -22
Sep  7 01:21:03 localhost kernel: [drm:i915_gem_object_pin_and_relocate] *ERROR* No GTT space found for object 5
Sep  7 01:21:03 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -22
<snip><snip>
cut here to save space...
<snip><snip>
Sep  7 02:11:13 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:11:13 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -12
Sep  7 02:11:13 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:11:13 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:12:56 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:12:56 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:12:56 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Sep  7 02:13:03 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:13:03 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:13:03 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Sep  7 02:13:10 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:13:10 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:13:10 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Sep  7 02:14:35 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:14:35 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:14:35 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Sep  7 02:17:35 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:17:35 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:17:35 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Sep  7 02:26:41 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
Sep  7 02:26:41 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Sep  7 02:26:41 localhost kernel: [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
System Reboot initiated->>>
Sep  7 03:14:04 localhost init: tty5 main process (2544) killed by TERM signal
Sep  7 03:14:04 localhost init: tty6 main process (2547) killed by TERM signal
Sep  7 03:14:04 localhost init: tty2 main process (2545) killed by TERM signal
Sep  7 03:14:04 localhost init: tty3 main process (2546) killed by TERM signal
Sep  7 03:14:04 localhost init: tty4 main process (2543) killed by TERM signal
Sep  7 03:14:04 localhost gdm-simple-greeter[29345]: WARNING: Cancel org.freedesktop.DBus.Error.NoReply raised: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.#012
Sep  7 03:14:04 localhost gnome-keyring-daemon[3257]: dbus failure unregistering from session: Connection is closed


These errors were repeated over and over and over again, until the system had to be rebooted to clear the problem.  During the time that these errors were being logged, I had to telnet into the system and shut it down and reboot it.

Rick
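
When a log repeats the same few errors hundreds of times, a tally is easier to read than the raw excerpt. The following is a minimal sketch, in Python, that counts `[drm:...] *ERROR*` lines per kernel function and per trailing numeric code. It assumes one complete syslog entry per line and assumes the trailing numbers are negated errno values, as is conventional for kernel return codes; under that reading -12 is ENOMEM and -22 is EINVAL. The function names are hypothetical.

```python
import errno
import re
from collections import Counter

# Matches entries such as:
#   ... kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
DRM_ERROR = re.compile(r"\[drm:(?P<func>\w+)\] \*ERROR\* (?P<msg>.*)")
# A negative number at the very end of the message, e.g. "-12".
TRAILING_CODE = re.compile(r"-(\d+)\s*$")


def summarize(lines):
    """Count DRM errors per kernel function and per trailing error code."""
    by_func, by_code = Counter(), Counter()
    for line in lines:
        match = DRM_ERROR.search(line)
        if not match:
            continue  # not a DRM error line
        by_func[match.group("func")] += 1
        code = TRAILING_CODE.search(match.group("msg"))
        if code:
            by_code[int(code.group(1))] += 1
    return by_func, by_code


def errno_name(code):
    """Map a positive errno value to its symbolic name, if known."""
    return errno.errorcode.get(code, "unknown")


sample = [
    "Sep  7 01:20:52 localhost kernel: [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty",
    "Sep  7 01:20:52 localhost kernel: [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12",
    "Sep  7 01:21:02 localhost kernel: [drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -22",
]
funcs, codes = summarize(sample)
for code, count in sorted(codes.items()):
    print(code, errno_name(code), count)
```

To run it over a whole file, pass an open file object to `summarize` instead of the sample list.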

Comment 14 Patrick O'Callaghan 2009-10-05 13:06:34 UTC
I don't have any problems with video playback using Xine, VLC or Dragon. I don't think these are OpenGL apps. Nor do I have any drm-related messages in the log. Possibly you have a different issue (or more than one issue) but several of those messages look like problems with library bindings rather than kernel bugs. I doubt the gnome-keyring-daemon stuff is relevant either.

However the fact that you are getting hangs with i586 means the bug is not just an x86_64 issue.

Comment 16 Matěj Cepl 2009-11-05 18:34:33 UTC
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages. For packages from the updates-testing repository you can use the command

yum upgrade --enablerepo='*-updates-testing'

Alternatively, you can also try to test whether this bug is reproducible with the upcoming Fedora 12 distribution by downloading LiveMedia of F12 Beta available at http://alt.fedoraproject.org/pub/alt/nightly-composes/ . By using that you get all the latest packages without need to install anything on your computer. For more information on using LiveMedia take a look at https://fedoraproject.org/wiki/FedoraLiveCD .

Please, if you experience this problem on an up-to-date system, let us know in a comment on this bug; also let us know if the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]

Comment 17 Patrick O'Callaghan 2009-11-06 02:52:53 UTC
I've tested most of the demos in /usr/lib64/mesa and the foobillard app without managing to reproduce the problem under kernel-2.6.30.9-90.fc11.x86_64.

For the moment I'm going to assume the problem is fixed.

Comment 18 Brian 2009-12-11 20:58:48 UTC
I'm seeing similar problems (corrupted 3D data, occasional crash) on a stock clean Fedora 12 install + updates + the latest version of Google Earth. The system uses an Asus P5K-VM G33-based mobo (Intel Xorg driver).

Comment 19 Brian 2010-04-27 03:05:37 UTC
Further to the last comment I made, I found that upgrading to Mesa 7.7-3 fixed my particular problems.

Comment 20 Bug Zapper 2010-04-27 15:38:20 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 21 Bug Zapper 2010-06-28 13:35:46 UTC
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.