Bug 699133 - ThinkPad x201 laggy after resume, i915 errors on hangcheck timer
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-04-23 12:25 UTC by Paul W. Frields
Modified: 2012-06-04 18:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-04 18:57:26 UTC
Type: ---



Links
Red Hat Bugzilla 706293 (CLOSED): Regression in i915 leads to extremely sluggish desktop behaviour
Private: 0, Priority: unspecified, Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 706293

Description Paul W. Frields 2011-04-23 12:25:44 UTC
Description of problem:
After resuming from suspend, interactivity on the ThinkPad x201 is severely impaired.  Keyboard input produces a response only in bursts several seconds apart.  The following messages repeat in the kernel log during this period:

Apr 23 08:17:32 jayne kernel: [127555.880022] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 3125, at 3125], missed IRQ?
Apr 23 08:17:34 jayne kernel: [127557.463152] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 3144, at 3144], missed IRQ?
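
For anyone collecting these messages from a log, here is a minimal Python sketch that pulls the fields out of one line. The regular expression, the function name `parse_hangcheck`, and the `reached` flag are illustrative assumptions and are not taken from the i915 driver. When the "at" value has already reached the "waiting on" value, the ring appears to have finished the work, which is consistent with the "missed IRQ?" hint in the message.

```python
import re

# Pattern for the hangcheck line quoted above. The group names are
# illustrative only; they do not come from the driver source.
HANGCHECK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:i915_hangcheck_ring_idle\].*?"
    r"(?P<ring>\w+) ring idle \[waiting on (?P<waiting>\d+), at (?P<at>\d+)\]"
)


def parse_hangcheck(line):
    """Return a dict describing one hangcheck message, or None if no match."""
    m = HANGCHECK_RE.search(line)
    if m is None:
        return None
    waiting = int(m.group("waiting"))
    at = int(m.group("at"))
    return {
        "timestamp": float(m.group("ts")),  # seconds since boot
        "ring": m.group("ring"),
        "waiting": waiting,
        "at": at,
        # Assumption: a plain comparison, ignoring sequence number wraparound.
        "reached": at >= waiting,
    }


if __name__ == "__main__":
    sample = (
        "Apr 23 08:17:32 jayne kernel: [127555.880022] "
        "[drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... "
        "render ring idle [waiting on 3125, at 3125], missed IRQ?"
    )
    print(parse_hangcheck(sample))
```

The same function should also accept raw dmesg lines, since it only looks for the bracketed timestamp and the text that follows it.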

Interestingly, if I run 'sudo powertop', the problem disappears once powertop has sampled the system activity and displayed its initial screen.  I can immediately quit powertop and everything is normal at the console once again.

Version-Release number of selected component (if applicable):
kernel-2.6.38.2-9.fc15.x86_64

How reproducible:
Every time.

Steps to Reproduce:
1. Start with ThinkPad x201, using F15 x86_64 along with current updates-testing
2. Suspend by closing lid
3. Resume by opening lid
4. Observe lag in interactivity and kernel messages above
  
Output of 'lspci -nn':
# --------------------------
00:00.0 Host bridge [0600]: Intel Corporation Core Processor DRAM Controller [8086:0044] (rev 02)
00:02.0 VGA compatible controller [0300]: Intel Corporation Core Processor Integrated Graphics Controller [8086:0046] (rev 02)
00:16.0 Communication controller [0780]: Intel Corporation 5 Series/3400 Series Chipset HECI Controller [8086:3b64] (rev 06)
00:16.3 Serial controller [0700]: Intel Corporation 5 Series/3400 Series Chipset KT Controller [8086:3b67] (rev 06)
00:19.0 Ethernet controller [0200]: Intel Corporation 82577LM Gigabit Network Connection [8086:10ea] (rev 06)
00:1a.0 USB Controller [0c03]: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller [8086:3b3c] (rev 06)
00:1b.0 Audio device [0403]: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio [8086:3b56] (rev 06)
00:1c.0 PCI bridge [0604]: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 [8086:3b42] (rev 06)
00:1c.3 PCI bridge [0604]: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 4 [8086:3b48] (rev 06)
00:1c.4 PCI bridge [0604]: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 [8086:3b4a] (rev 06)
00:1d.0 USB Controller [0c03]: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller [8086:3b34] (rev 06)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev a6)
00:1f.0 ISA bridge [0601]: Intel Corporation Mobile 5 Series Chipset LPC Interface Controller [8086:3b07] (rev 06)
00:1f.2 SATA controller [0106]: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller [8086:3b2f] (rev 06)
00:1f.3 SMBus [0c05]: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller [8086:3b30] (rev 06)
00:1f.6 Signal processing controller [1180]: Intel Corporation 5 Series/3400 Series Chipset Thermal Subsystem [8086:3b32] (rev 06)
02:00.0 Network controller [0280]: Intel Corporation Centrino Ultimate-N 6300 [8086:4238] (rev 35)
ff:00.0 Host bridge [0600]: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers [8086:2c62] (rev 02)
ff:00.1 Host bridge [0600]: Intel Corporation Core Processor QuickPath Architecture System Address Decoder [8086:2d01] (rev 02)
ff:02.0 Host bridge [0600]: Intel Corporation Core Processor QPI Link 0 [8086:2d10] (rev 02)
ff:02.1 Host bridge [0600]: Intel Corporation Core Processor QPI Physical 0 [8086:2d11] (rev 02)
ff:02.2 Host bridge [0600]: Intel Corporation Core Processor Reserved [8086:2d12] (rev 02)
ff:02.3 Host bridge [0600]: Intel Corporation Core Processor Reserved [8086:2d13] (rev 02)
# --------------------------

I am happy to provide additional information as needed.

Comment 1 Paul W. Frields 2011-04-23 16:44:29 UTC
Hm, seems to be fixed with kernel 2.6.38.3, at least in:
https://admin.fedoraproject.org/updates/kernel-2.6.38.3-18.fc15

Comment 2 Paul W. Frields 2011-04-25 15:27:06 UTC
Spoke too soon -- problem reasserted itself after a longer period of use.

Comment 3 Jérôme Oufella 2011-04-25 23:20:58 UTC
I confirm the same behaviour on a Lenovo T410s equipped with Intel Core i5 M520 integrated graphics (Ironlake mobile).

I've been using fc15 for weeks and didn't notice it until a few days ago, which I think corresponds to the upgrade to 2.6.38.3-18.fc15.i686.PAE on my side.

But as Paul mentions above, it only shows up after some time of use, so I might have missed it earlier.

Comment 4 simon 2011-05-17 17:41:53 UTC
I have the same issue on a Dell Latitude E6500 with kernel 2.6.38.6-26.rc1.fc15.x86_64. After resuming from suspend, the system was largely unresponsive and logged the same error message.

Regarding the comment about powertop: I recently enabled the runtime-pm options that powertop suggests for the various PCI bus devices, which I had not done on previous kernel versions.

Comment 5 Paul W. Frields 2011-05-18 10:52:33 UTC
I've tested this a little further and it doesn't appear that my power tuning settings make a difference. The problem asserts itself after an extended period of use.

This is still a problem with kernel-2.6.38.6-27.fc15.x86_64.

Comment 6 Tom Livingston 2011-06-09 02:58:01 UTC
I experience this behavior and corresponding log entries as well, though irregularly. Acer 4820TG i3 CPU ironlake GPU kernel 2.6.38.7-30.fc15.x86_64. I have been using powertop suggested power tunings similar to other reports.

Comment 7 ell1e 2011-06-15 04:09:51 UTC
Same problem, powertop fixes it magically.

I have *no* powertop tunables in place that cover graphics, just all usb devices power managed, VM writeback increased, NMI watchdog off, audio codec power management on (snd-hda-intel), wake on lan off (both wlan and lan) and cpu scaler set to ondemand. All other tunables are "Bad"/untouched.

Kernel is 2.6.38.7-30.fc15.i686
Graphics card is 945GME on an Asus Eee 100H, xorg-x11-drv-intel is 2.15.0-3.fc15

Particularly with GNOME 3's prominent suspend usage, this problem is annoying if you don't know about the workaround.

Some dmesg output:

[68631.792072] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9876, at 9876], missed IRQ?
[68633.496104] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9892, at 9894], missed IRQ?
[68635.020083] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9898, at 9898], missed IRQ?
[68636.536051] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9904, at 9904], missed IRQ?
[68638.064085] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9913, at 9913], missed IRQ?
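
To see how often the message repeats, the timestamps in the output above can be compared. This is a minimal Python sketch; the helper name `hangcheck_intervals` and the `SAMPLE` list are assumptions made for illustration, and the sample lines are copied from the dmesg output above.

```python
import re

# Matches the seconds-since-boot timestamp at the start of a dmesg line.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")

SAMPLE = [
    "[68631.792072] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9876, at 9876], missed IRQ?",
    "[68633.496104] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9892, at 9894], missed IRQ?",
    "[68635.020083] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9898, at 9898], missed IRQ?",
    "[68636.536051] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9904, at 9904], missed IRQ?",
    "[68638.064085] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... render ring idle [waiting on 9913, at 9913], missed IRQ?",
]


def hangcheck_intervals(lines):
    """Return the gaps in seconds between consecutive hangcheck messages."""
    stamps = []
    for line in lines:
        m = TS_RE.match(line)
        # Only count lines that carry the hangcheck message.
        if m and "i915_hangcheck_ring_idle" in line:
            stamps.append(float(m.group(1)))
    return [round(b - a, 6) for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(hangcheck_intervals(SAMPLE))
```

For these five lines the gaps come out at roughly 1.5 to 1.7 seconds.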

Comment 8 Martin Langhoff 2011-06-16 15:04:38 UTC
Hitting the same bug on a Lenovo x220 -- kernel 2.6.38.8-32.fc15.x86_64 

Linus hit it (http://lists.freedesktop.org/archives/dri-devel/2011-January/006987.html), did some bisection that seems to have clarified the situation somewhat (http://lists.freedesktop.org/archives/dri-devel/2011-January/007030.html), and that led to a fix that seems to be working (http://lists.freedesktop.org/archives/dri-devel/2011-January/007041.html).

Hopefully the fixup patch can be isolated and applied to 2.6.38 series on F15...

Comment 9 Martin Langhoff 2011-06-27 16:55:06 UTC
Looks like we have a dupe at #706293 - reported against 2.6.39 kernels - comments there hint that i915.semaphores=1 reduces how often the problem occurs.

From a comment: 
Appears to be fixed with 3.0-0.rc4.git0.2.fc16.x86_64, presumably by commit
498e720b96379d8ee9c294950a01534a73defcf3 "drm/i915: Fix gen6 (SNB) missed BLT
ring interrupts."
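
To check whether a running system was actually booted with the i915.semaphores=1 option mentioned above, the kernel command line can be inspected. This is a minimal Python sketch; the helper names `kernel_param` and `read_cmdline` are hypothetical, and letting the last occurrence win is a choice made for this sketch.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            # A bare flag without "=" is reported as an empty string.
            # If the option is repeated, this sketch keeps the last value.
            value = val if sep else ""
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    print(kernel_param(read_cmdline(), "i915.semaphores"))
```

A result of `None` means the option was not passed on the command line at all.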

Comment 10 nayfield 2011-07-15 14:57:42 UTC
Same issue with f15 on a Toshiba NB505 laptop.  (This is the top selling netbook on Amazon.com).

00:02.1 Display controller: Intel Corporation N10 Family Integrated Graphics Controller
	Subsystem: Toshiba America Info Systems Device fdc0
	Flags: bus master, fast devsel, latency 0
	Memory at f0280000 (32-bit, non-prefetchable) [size=512K]
	Capabilities: [d0] Power Management version 2

Comment 11 Peter Robinson 2011-07-19 22:09:35 UTC
I'm seeing this on my Dell e6410 with Core i5 M540 CPU with IronLake GPU.

Looks like 684097 might be the same bug.

Upstream bug that looks similar https://bugs.freedesktop.org/show_bug.cgi?id=38529

Comment 12 Josh Boyer 2012-06-04 18:57:26 UTC
This should have been fixed in the 3.x kernels.  If you are still seeing this issue with the 3.3 F16 or newer kernel, please reopen.
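
Before reopening, it may help to confirm that the running kernel is at least 3.3. This is a minimal Python sketch; the function names `kernel_version` and `is_at_least` are hypothetical, and the comparison looks only at the leading numeric part of the release string and ignores the Fedora build suffix.

```python
import platform
import re


def kernel_version(release):
    """Extract the leading numeric version of a release string as a tuple of ints."""
    m = re.match(r"(\d+(?:\.\d+)*)", release)
    if m is None:
        raise ValueError("no version number in %r" % release)
    return tuple(int(part) for part in m.group(1).split("."))


def is_at_least(release, minimum=(3, 3)):
    """Return True if the release is at or above the minimum version."""
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    running = platform.release()  # e.g. the string that `uname -r` prints
    print(running, is_at_least(running))
```

With the release strings quoted in this report, `2.6.38.2-9.fc15.x86_64` and `3.0-0.rc4.git0.2.fc16.x86_64` both fall below the 3.3 threshold.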

