Bug 706293

Summary: Regression in i915 leads to extremely sluggish desktop behaviour
Product: [Fedora] Fedora
Reporter: Lars Seipel <ls>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: rawhide
CC: brianmury, gansalmon, guattari, itamar, jistone, jonathan, kernel-maint, madhu.chinakonda, mangobrain, ntl, timosha
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-11 20:29:13 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
dmesg output none

Description Lars Seipel 2011-05-19 23:04:53 UTC
Created attachment 499965
dmesg output

Description of problem:
I'm using Fedora on a Thinkpad X220 based on Intel's Sandy Bridge processor and GPU. It runs a kernel from Rawhide on top of the F15 userspace.

While I had minor issues with F15's 2.6.38 kernel, the system ran absolutely flawlessly with 2.6.39-0.rc7.git0.0 (checked out from git with fedpkg). Then I built and installed 2.6.39-0.rc7.git6.0, and after a reboot the system felt very sluggish. Hitting the "hot corner" in gnome-shell sometimes doesn't even produce a reaction. When this happens, dmesg fills with messages like these:
[   40.694093] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 2458, at 2458], missed IRQ?
[   45.012293] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 2813, at 2813], missed IRQ?
[  441.453869] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 36206, at 36206], missed IRQ?

There seems to be one of those entries for every time I try hitting the hot corner. 2.6.39-0.rc7.git6.1 (from rawhide repo) and the final 2.6.39 don't change this behaviour in any way. The 2.6.38 kernel from F15 exhibits the problem in a much less severe way, i.e. it doesn't occur right after reboot but more likely after waking up from suspend and is not really reproducible. 2.6.39-0.rc7.git0.0 doesn't show the issue at all in days of constant use.
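
For anyone triaging similar logs, here is a minimal sketch of pulling these hangcheck messages out of saved dmesg output so they can be counted or correlated with desktop actions. The pattern is derived only from the sample lines quoted above, and the name `parse_hangchecks` is hypothetical; other kernel versions may word the message differently.

```python
import re

# Pattern derived from the sample lines quoted in this report;
# other kernel versions may format the message differently.
HANGCHECK_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"\[drm:i915_hangcheck_ring_idle\] \*ERROR\* Hangcheck timer elapsed\.\.\. "
    r"(?P<ring>\w+) ring idle "
    r"\[waiting on (?P<waiting>\d+), at (?P<at>\d+)\]"
)


def parse_hangchecks(dmesg_text):
    """Return one dict per hangcheck message found in dmesg_text."""
    events = []
    for line in dmesg_text.splitlines():
        match = HANGCHECK_RE.search(line)
        if match:
            events.append({
                "time": float(match.group("time")),   # seconds since boot
                "ring": match.group("ring"),          # e.g. "blt"
                "waiting": int(match.group("waiting")),
                "at": int(match.group("at")),
            })
    return events


# Two of the lines from the description, used as sample input.
SAMPLE = """\
[   40.694093] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 2458, at 2458], missed IRQ?
[   45.012293] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 2813, at 2813], missed IRQ?
"""

if __name__ == "__main__":
    for event in parse_hangchecks(SAMPLE):
        print(event)
```

In each quoted message the "waiting on" and "at" values are equal.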

Just tell me if you need any additional information. 

Version-Release number of selected component (if applicable):

kernel-2.6.39-0.rc7.git0.0.mfq.fc16.x86_64 (selfbuilt)
fine

kernel-2.6.39-0.rc7.git6.1.fc16.x86_64 (from Fedora repos)
broken

kernel-2.6.39-0.mfq.fc16.x86_64 (selfbuilt)
broken

How reproducible:
100% with 2.6.39 kernels after 0.rc7.git6.0

Steps to Reproduce:
1. Boot a Rawhide kernel on a Thinkpad X220 (Core i5, Sandy Bridge) and log in to GNOME
2. Use the desktop (the hot corner seems pretty reliable for provoking this)
3. Observe a very sluggish user experience and watch dmesg fill up
  
Actual results:
User experience is sluggish. System log gets filled with the messages above.
 
Expected results:
User experience is fine, as with kernel-2.6.39-0.rc7.git0.0. System log stays clean.

Additional info:

Comment 1 Lars Seipel 2011-05-22 11:10:40 UTC
Re-enabling the use of semaphores by booting with i915.semaphores=1 fixes the problem for me. The regression seems to be caused by upstream commit 087fbc9962e10a65fb0b542ecfc116ebf6cf1735, which was supposed to fix things for other people.
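
When comparing reports from different machines, it helps to confirm which value of this option a kernel was actually booted with. A minimal sketch, assuming the option is passed on the kernel command line as described above: the helper name `semaphores_setting` and the sample command line are hypothetical, and the function only inspects a command-line string, not the driver's real state.

```python
def semaphores_setting(cmdline):
    """Return the value given for i915.semaphores on a kernel command line,
    or None if the option is absent.

    If the option is repeated, the last occurrence is returned (an
    assumption that the last one is the one that takes effect).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "i915.semaphores" and sep:
            value = val
    return value


if __name__ == "__main__":
    # Hypothetical command line for illustration only.
    example = "ro root=/dev/sda1 i915.semaphores=1 quiet"
    print(semaphores_setting(example))
```

On a running Linux system the string to pass in can be read from /proc/cmdline.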

Comment 2 Nathan Lynch 2011-05-31 14:02:38 UTC
Seeing this issue with a Core i7 Sandy Bridge system (F15 userspace) as well.

kernel-2.6.39-0.rc6.git0.0.fc16.x86_64: good
kernel-2.6.39-1.fc16.x86_64:            bad

Comment 3 Daniel Adad 2011-06-08 13:42:30 UTC
The current version of the kernel for F15 also shows this behavior with a Sandy Bridge Core i5-2500K.

2.6.38.7-30.fc15.x86_64

Comment 4 Nathan Lynch 2011-06-13 12:07:40 UTC
kernel-3.0-0.rc2.git0.2.fc16.x86_64:    bad

Comment 5 Josh Stone 2011-06-13 20:57:20 UTC
(In reply to comment #3)
> The current version of the kernel for F15 also shows this behavior with a Sandy
> Bridge Core i5-2500K.
> 
> 2.6.38.7-30.fc15.x86_64

I have this issue on the same kernel with i7-2600.  So far, i915.semaphores=1 has been behaving well.

Comment 6 Nathan Lynch 2011-06-14 23:26:17 UTC
Possible fix here:

https://lkml.org/lkml/2011/6/14/270

Seems to address the issue for me when applied to v3.0-rc3.

Comment 7 Nathan Lynch 2011-06-23 19:52:02 UTC
Appears to be fixed with 3.0-0.rc4.git0.2.fc16.x86_64, presumably by commit 498e720b96379d8ee9c294950a01534a73defcf3 "drm/i915: Fix gen6 (SNB) missed BLT ring interrupts."

Comment 8 Timon 2011-06-30 07:31:36 UTC
Yup. Seems to be fixed. Gnome3 became so fast and no more annoying errors in dmesg.
kernel-3.0-0.rc5.git0.1.fc15.x86_64

Comment 9 Philip Allison 2011-07-04 07:40:12 UTC
Dupe of 684097?  Comment 7 here references one of the same patches suggested for application in that bug's thread, which resolve this issue for me when applied to 2.6.38.8-32.fc15.x86_64.

Please back-port these!