Bug 733832

Summary: Screen locks, drm:i915_hangcheck_elapsed
Product: Red Hat Enterprise Linux 6
Reporter: Zenon Panoussis <redhatbugs>
Component: mesa
Assignee: Dave Airlie <airlied>
Status: CLOSED CURRENTRELEASE
QA Contact: Desktop QE <desktop-qa-list>
Severity: high
Priority: unspecified
Version: 6.1
CC: cn6uw7d02, gnafu_the_great, roland, tpelka
Target Milestone: rc
Keywords: OtherQA, Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-08-03 19:12:03 UTC
Bug Blocks: 840699, 842499
Attachments: Xorg.log (flags: none)

Description Zenon Panoussis 2011-08-27 10:24:13 UTC
After an update of, among others, the mesa packages, X started locking up. The lockups happened every time, within no more than 5 minutes of logging in, and affected both keyboard and mouse. dmesg reported:
 [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
It was still possible to ssh into the box, but nothing short of a reboot could unlock X. The mainboard is an Intel D946GZIS, kernel 2.6.32-71.24.1.el6.x86_64.

Downgrading to the following seems to have "solved" the problem:

 libdrm                 x86_64   2.4.20-2.el6
 libdrm-devel           x86_64   2.4.20-2.el6
 mesa-dri-drivers       x86_64   7.7-2.el6
 mesa-libGL             x86_64   7.7-2.el6
 mesa-libGL-devel       x86_64   7.7-2.el6
 xorg-x11-drv-intel     x86_64   2.11.0-7.el6
 xorg-x11-drv-nouveau   x86_64   1:0.0.16-8.20100423git13c1043.el6
 mesa-libGLU            x86_64   7.7-2.el6
 mesa-libGLU-devel      x86_64   7.7-2.el6

This looks identical to bug #695034 filed against Fedora 14.
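
For anyone who wants to repeat the downgrade, here is a minimal sketch that turns the package list above into a single `yum downgrade` command line. It only prints the command and does not run anything. The helper names (`to_yum_spec`, `build_downgrade_command`) are made up for illustration, and the `name-[epoch:]version-release.arch` form is assumed to be accepted by the yum on your system; please check the result before running it as root.

```python
# Package rows copied from the downgrade list in the description above.
DOWNGRADE_LIST = """\
libdrm                 x86_64   2.4.20-2.el6
libdrm-devel           x86_64   2.4.20-2.el6
mesa-dri-drivers       x86_64   7.7-2.el6
mesa-libGL             x86_64   7.7-2.el6
mesa-libGL-devel       x86_64   7.7-2.el6
xorg-x11-drv-intel     x86_64   2.11.0-7.el6
xorg-x11-drv-nouveau   x86_64   1:0.0.16-8.20100423git13c1043.el6
mesa-libGLU            x86_64   7.7-2.el6
mesa-libGLU-devel      x86_64   7.7-2.el6
"""


def to_yum_spec(row):
    """Convert 'name arch [epoch:]version-release' to 'name-[epoch:]version-release.arch'.

    Hypothetical helper; the output format is an assumption about what
    'yum downgrade' accepts as a package specification.
    """
    name, arch, evr = row.split()
    return f"{name}-{evr}.{arch}"


def build_downgrade_command(listing):
    """Build the argument list for a single 'yum downgrade' invocation."""
    specs = [to_yum_spec(line) for line in listing.splitlines() if line.strip()]
    return ["yum", "downgrade"] + specs


if __name__ == "__main__":
    # Printed for review only; nothing is executed here.
    print(" ".join(build_downgrade_command(DOWNGRADE_LIST)))
```

Doing all packages in one transaction is a deliberate choice in this sketch: the libdrm, mesa and driver packages presumably depend on each other's versions, so downgrading them one at a time may fail dependency checks.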

Comment 2 Gideon Mayhak 2011-08-27 20:44:16 UTC
I'm seeing this same issue with an 845G:

00:02.0 VGA compatible controller [0300]: Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device [8086:2562] (rev 01) (prog-if 00 [VGA controller])
	Subsystem: Dell Device [1028:0126]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at e8000000 (32-bit, prefetchable) [size=128M]
	Region 1: Memory at ff680000 (32-bit, non-prefetchable) [size=512K]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915

I've seen reports of setting pci=nocrs helping on other distros, so I'm going to try that, but it would be nice if this could work out of the box.

Comment 3 Gideon Mayhak 2011-08-27 20:51:29 UTC
It looks like the nocrs option was added in the .34 kernel, so that didn't work.

Comment 4 Dave Airlie 2011-08-29 15:49:48 UTC
Zenon,

Can you supply a dmesg and an Xorg.0.log file from that machine?

Gideon, the 845 is not the exact same issue. Lots of lockups look the same but rarely have anything to do with each other, so if the 845 problem is a regression since 6.0, please file a separate bug.

Thanks,
Dave.

Comment 6 Zenon Panoussis 2011-08-29 16:51:56 UTC
Created attachment 520431
Xorg.log

Xorg.log attached. It's a pity it has no timestamps, but the repetitive entries suggest it's the correct log for the multiple reboots that followed the freezes.

Comment 7 Dave Airlie 2011-08-30 09:53:19 UTC
Just some more info: it's not the F14 bug. That one is on a completely different Intel system.

Comment 8 Zenon Panoussis 2011-08-30 12:45:58 UTC
BTW, I didn't miss the dmesg part of "dmesg and Xorg.0.log"; it's only that I didn't keep a copy of dmesg. So now I need to upgrade to reproduce the problem, then downgrade again to get a working machine, and that's the kind of nuisance that has to fit in a window of relative boredom.

Comment 9 Roland Roberts 2011-09-07 15:39:22 UTC
Similar problem with a Thinkpad 420s, Core i5. I'm actually running Scientific Linux 6.1, and the issue is happening on an LTSP client. However, the hang DOES clear after several minutes; at least it always has so far.

My syslog/dmesg shows nothing before the hangcheck message, and nothing after it. That is, the previous entry is from hours earlier, and the subsequent entry is from actions I took after the hang cleared.
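
In case it helps others check the same thing, below is a rough sketch for pulling the hangcheck line out of a saved log together with the entries immediately before and after it. The function name `hangcheck_context` is made up for illustration, and the two "placeholder" lines in the sample are not real log output; only the hangcheck line itself is the message quoted in the description.

```python
import re

# Matches the error text quoted in the description of this bug.
HANGCHECK = re.compile(r"\[drm:i915_hangcheck_elapsed\] \*ERROR\*")


def hangcheck_context(lines, before=1, after=1):
    """Return a (previous, hit, following) tuple for each hangcheck line.

    'previous' and 'following' are lists holding up to 'before' and
    'after' neighbouring log lines. Hypothetical helper, not part of any tool.
    """
    results = []
    for i, line in enumerate(lines):
        if HANGCHECK.search(line):
            previous = lines[max(0, i - before):i]
            following = lines[i + 1:i + 1 + after]
            results.append((previous, line, following))
    return results


# Sample input: the middle line is the message from the description;
# the other two are placeholders standing in for unrelated entries.
SAMPLE = [
    "placeholder: some earlier log entry",
    "[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung",
    "placeholder: some later log entry",
]

if __name__ == "__main__":
    for previous, hit, following in hangcheck_context(SAMPLE):
        print("before:", previous)
        print("hit:   ", hit)
        print("after: ", following)
```

To use it on a real machine, replace `SAMPLE` with the lines of a saved dmesg or syslog file; comparing the timestamps of the neighbouring entries then shows whether anything was logged around the hang.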

Comment 11 japa-fi 2011-10-24 16:05:58 UTC
I sometimes have the same issue with my Thinkpad T400 after waking up from suspend.
X doesn't come up, and one of the consoles displays i915_hangcheck_elapsed messages. Killing the gnome-session does not help.

Kernel 2.6.40.6-0.fc15.x86_64

Comment 14 Zenon Panoussis 2012-07-17 09:20:13 UTC
Incidentally, my problem has gone away. No idea when. I had forgotten about this bug, my system got updated, and the problem didn't come back. I'm on SL6.2 now.

libdrm-2.4.25-2.el6.x86_64
libdrm-devel-2.4.25-2.el6.x86_64
mesa-dri-drivers-7.11-3.el6.x86_64
mesa-libGL-7.11-3.el6.x86_64
mesa-libGL-devel-7.11-3.el6.x86_64
xorg-x11-drv-intel-2.16.0-1.el6.x86_64
xorg-x11-drv-nouveau-0.0.16-13.20110719gitde9d1ba.el6.x86_64
mesa-libGLU-7.11-3.el6.x86_64
mesa-libGLU-devel-7.11-3.el6.x86_64
running kernel 2.6.32-220.4.1.el6.x86_64

Comment 16 Adam Jackson 2012-08-03 19:12:03 UTC
Closing per comment #14, thanks!