Bug 645968 - 2.6.36 Breaks/choppy mouse on Suspend/Resume to RAM. Disabling IRQ #16
2.6.36 Breaks/choppy mouse on Suspend/Resume to RAM. Disabling IRQ #16
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
rawhide
x86_64 Linux
low Severity medium
: ---
: ---
Assigned To: Kernel Maintainer List
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2010-10-23 07:19 EDT by Andy Lawrence
Modified: 2011-04-22 18:12 EDT (History)
8 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-01-24 19:40:20 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
dmesg (73.62 KB, text/x-log)
2010-10-23 07:19 EDT, Andy Lawrence
no flags Details
messages (269.63 KB, text/x-log)
2010-10-23 07:20 EDT, Andy Lawrence
no flags Details
pm-suspend (5.23 KB, text/x-log)
2010-10-23 07:21 EDT, Andy Lawrence
no flags Details
lsusb output (34.62 KB, text/x-log)
2010-11-24 20:16 EST, Andy Lawrence
no flags Details
lsusb with bluetooth enabled this time (40.01 KB, text/x-log)
2010-11-24 20:28 EST, Andy Lawrence
no flags Details
powertop screenshot 1 (68.78 KB, image/png)
2010-11-29 16:31 EST, Andy Lawrence
no flags Details
powertop screenshot 2 (86.53 KB, image/png)
2010-11-29 16:32 EST, Andy Lawrence
no flags Details
perf output normal no chop (76.52 KB, text/x-log)
2010-12-02 17:23 EST, Andy Lawrence
no flags Details
perf output while choppy (20.81 KB, text/x-log)
2010-12-03 16:21 EST, Andy Lawrence
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
Linux Kernel 28892 None None None Never

  None (edit)
Description Andy Lawrence 2010-10-23 07:19:59 EDT
Created attachment 455240 [details]
dmesg

Description of problem:

Kernel 2.6.36-1.fc15.x86_64 makes for a choppy erratic mouse after awaking from a suspend to ram.  Mouse will work but the action if slow and choppy. This is the same for both the USB mouse and the internal trackpad.  Booting back to kernel 2.6.35.6-45 (or any other 2.6.35 kernel) and the problem goes away and all is normal.

One notable output while performing suspend: Disabling IRQ #19

This machine is a fully up-2-date F14 Acer 1810TZ Laptop.  Bouncing back and forth between the two kernels list above will either cause or fix the choppy mouse issue.

Additional info:

All necessary logs to follow.
Comment 1 Andy Lawrence 2010-10-23 07:20:36 EDT
Created attachment 455242 [details]
messages
Comment 2 Andy Lawrence 2010-10-23 07:21:08 EDT
Created attachment 455243 [details]
pm-suspend
Comment 3 Andy Lawrence 2010-11-20 19:22:35 EST
This is still happening with F14 Release all updates with a F15 kernel, 2.6.36-5.fc15.x86_64.

Some additional info:

Nov 20 09:40:26 ace NetworkManager[1803]: <info> Updating /etc/hosts with new system hostname
Nov 20 09:40:26 ace NetworkManager[1803]: <info> (wlan0): cleaning up...
Nov 20 09:40:26 ace NetworkManager[1803]: <info> (wlan0): taking down device.
Nov 20 09:40:26 ace nm-dispatcher.action: nm_dispatcher_action: Invalid connection: '(null)' / 'connection setting not found' invalid: 1
Nov 20 09:40:26 ace kernel: [133022.457626] irq 16: nobody cared (try booting with the "irqpoll" option)
Nov 20 09:40:26 ace kernel: [133022.457636] Pid: 865, comm: pm-suspend Tainted: G          I 2.6.36-5.fc15.x86_64 #1
Nov 20 09:40:26 ace kernel: [133022.457640] Call Trace:
Nov 20 09:40:26 ace kernel: [133022.457643]  <IRQ>  [<ffffffff810a8d13>] __report_bad_irq.clone.1+0x3d/0x8b
Nov 20 09:40:26 ace kernel: [133022.457662]  [<ffffffff810a8e7b>] note_interrupt+0x11a/0x17f
Nov 20 09:40:26 ace kernel: [133022.457668]  [<ffffffff810a995b>] handle_fasteoi_irq+0xa8/0xce
Nov 20 09:40:26 ace kernel: [133022.457676]  [<ffffffff8100c34e>] handle_irq+0x88/0x90
Nov 20 09:40:26 ace kernel: [133022.457683]  [<ffffffff8146ad74>] do_IRQ+0x5c/0xb4
Nov 20 09:40:26 ace kernel: [133022.457690]  [<ffffffff81464f53>] ret_from_intr+0x0/0x11
Nov 20 09:40:26 ace kernel: [133022.457693]  <EOI>  [<ffffffff810eb087>] ? do_wp_page+0x5b/0x593
Nov 20 09:40:26 ace kernel: [133022.457706]  [<ffffffff810eb065>] ? do_wp_page+0x39/0x593
Nov 20 09:40:26 ace kernel: [133022.457713]  [<ffffffff8103c4d7>] ? should_resched+0xe/0x2e
Nov 20 09:40:26 ace kernel: [133022.457719]  [<ffffffff8103c4bf>] ? need_resched+0x23/0x2d
Nov 20 09:40:26 ace kernel: [133022.457725]  [<ffffffff810e9e65>] ? pmd_offset+0x19/0x40
Nov 20 09:40:26 ace kernel: [133022.457730]  [<ffffffff810ece33>] handle_mm_fault+0x7fa/0x880
Nov 20 09:40:26 ace kernel: [133022.457737]  [<ffffffff810124fb>] ? might_fault+0x21/0x23
Nov 20 09:40:26 ace kernel: [133022.457744]  [<ffffffff81467e54>] do_page_fault+0x250/0x265
Nov 20 09:40:26 ace kernel: [133022.457750]  [<ffffffff81465215>] page_fault+0x25/0x30
Nov 20 09:40:26 ace kernel: [133022.457754] handlers:
Nov 20 09:40:26 ace kernel: [133022.457757] [<ffffffff813305ab>] (usb_hcd_irq+0x0/0x7c)
Nov 20 09:40:26 ace kernel: [133022.457765] [<ffffffff813305ab>] (usb_hcd_irq+0x0/0x7c)
Nov 20 09:40:26 ace kernel: [133022.457772] Disabling IRQ #16
Nov 20 09:40:27 ace kernel: [133023.072906] PM: Syncing filesystems ... done.
Nov 20 19:05:17 ace kernel: [133023.120583] Freezing user space processes ... (elapsed 0.01 seconds) done.
Nov 20 19:05:17 ace kernel: [133023.132181] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Nov 20 19:05:17 ace kernel: [133023.143190] Suspending console(s) (use no_console_suspend to debug)
Nov 20 19:05:17 ace kernel: [133023.143417] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Nov 20 19:05:17 ace kernel: [133023.154085] ehci_hcd 0000:00:1a.7: BAR 0: set to [mem 0xd4504c00-0xd4504fff] (PCI address [0xd4504c00-0xd4504fff]
Nov 20 19:05:17 ace kernel: [133023.154215] ehci_hcd 0000:00:1a.7: PCI INT D -> GSI 19 (level, low) -> IRQ 19
Nov 20 19:05:17 ace kernel: [133023.154330] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
Nov 20 19:05:17 ace kernel: [133023.154452] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
Nov 20 19:05:17 ace kernel: [133023.165079] ehci_hcd 0000:00:1d.7: BAR 0: set to [mem 0xd4504800-0xd4504bff] (PCI address [0xd4504800-0xd4504bff]
Nov 20 19:05:17 ace kernel: [133023.165209] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
Nov 20 19:05:17 ace kernel: [133023.170281] sd 0:0:0:0: [sda] Synchronizing SCSI cache
Nov 20 19:05:17 ace kernel: [133023.221153] sd 0:0:0:0: [sda] Stopping disk
Nov 20 19:05:17 ace kernel: [133023.222840] ehci_hcd 0000:00:1d.7: PCI INT A disabled
Nov 20 19:05:17 ace kernel: [133023.222895] uhci_hcd 0000:00:1d.1: PCI INT B disabled
Nov 20 19:05:17 ace kernel: [133023.222930] uhci_hcd 0000:00:1d.0: PCI INT A disabled
Nov 20 19:05:17 ace kernel: [133023.223065] ehci_hcd 0000:00:1a.7: PCI INT D disabled
Nov 20 19:05:17 ace kernel: [133023.223083] uhci_hcd 0000:00:1a.0: PCI INT A disabled
Nov 20 19:05:17 ace kernel: [133023.325025] HDA Intel 0000:00:1b.0: PCI INT A disabled
Nov 20 19:05:17 ace kernel: [133023.368080] uhci_hcd 0000:00:1d.2: PCI INT D disabled
Nov 20 19:05:17 ace kernel: [133023.712166] PM: suspend of devices complete after 546.551 msecs
Nov 20 19:05:17 ace kernel: [133023.734206] PM: late suspend of devices complete after 22.032 msecs
Nov 20 19:05:17 ace kernel: [133023.734634] ACPI: Preparing to enter system sleep state S3
Nov 20 19:05:17 ace kernel: [133023.734780] PM: Saving platform NVS memory
Nov 20 19:05:17 ace kernel: [133023.763572] Disabling non-boot CPUs ...
Nov 20 19:05:17 ace kernel: [133023.764963] CPU 1 is now offline
Nov 20 19:05:17 ace kernel: [133023.764966] SMP alternatives: switching to UP code
Nov 20 19:05:17 ace kernel: [133023.771864] PM: Restoring platform NVS memory
Nov 20 19:05:17 ace kernel: [133023.771864] Enabling non-boot CPUs ...
Nov 20 19:05:17 ace kernel: [133023.772113] SMP alternatives: switching to SMP code
Nov 20 19:05:17 ace kernel: [133023.778396] Booting Node 0 Processor 1 APIC 0x1
Nov 20 19:05:17 ace kernel: [133023.849339] NMI watchdog enabled, takes one hw-pmu counter.
Nov 20 19:05:17 ace kernel: [133023.850083] CPU1 is up
Nov 20 19:05:17 ace kernel: [133023.850812] ACPI: Waking up from system sleep state S3
Nov 20 19:05:17 ace kernel: [133023.985037] PM: early resume of devices complete after 1.353 msecs
Nov 20 19:05:17 ace kernel: [133023.986118] [drm:init_ring_common] *ERROR* render ring head not reset to zero ctl 00000000 head 02001000 tail 00000000 start 02001000
Nov 20 19:05:17 ace kernel: [133023.986126] [drm:init_ring_common] *ERROR* render ring head forced to zero ctl 


The system is not useable via mouse, all other functions seems fine.
Comment 4 Kyle McMartin 2010-11-20 21:00:40 EST
Hi, 

Can you paste in your /proc/interrupts? Sounds like a handler is racing on resume (so we see an interrupt without a handler that ACKs it.)

(Sorry it took so long to see your bug.)
Thanks!
 Kyle

(Also, it might be worth trying the kernel from
http://repos.fedorapeople.org/repos/kyle/kernel/fedora-kernel.repo which is the latest (more or less) 2.6.37 from git.)
Comment 5 Andy Lawrence 2010-11-20 21:34:47 EST
No Problem!

/proc/interrupts Before a suspend when all is working:

[andy@ace trunk]$ cat /proc/interrupts 
           CPU0       CPU1       
  0:     710007    1003868   IO-APIC-edge      timer
  1:       2039       2073   IO-APIC-edge      i8042
  8:          1          0   IO-APIC-edge      rtc0
  9:      13013        138   IO-APIC-fasteoi   acpi
 12:     209790       5740   IO-APIC-edge      i8042
 16:      89186       1962   IO-APIC-fasteoi   uhci_hcd:usb3, uhci_hcd:usb6
 19:       3215       3280   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
 23:         18         19   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb4
 40:      39891      41825   PCI-MSI-edge      ahci
 41:      10540      12272   PCI-MSI-edge      i915
 42:     583320        723   PCI-MSI-edge      iwlagn
 43:       4165       3223   PCI-MSI-edge      hda_intel
 44:          1          1   PCI-MSI-edge      eth0
NMI:        477        463   Non-maskable interrupts
LOC:    1223897    1211075   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:        477        463   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:      64187      72696   Rescheduling interrupts
CAL:       5887       5606   Function call interrupts
TLB:      19575      22862   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:         25         25   Machine check polls
ERR:          1
MIS:          0
[andy@ace trunk]$ 


/proc/interrupts After resume/suspend with choppy mouse:


[andy@ace trunk]$ cat /proc/interrupts 
           CPU0       CPU1       
  0:     725088    1013512   IO-APIC-edge      timer
  1:       2143       2092   IO-APIC-edge      i8042
  8:          1          0   IO-APIC-edge      rtc0
  9:      13318        138   IO-APIC-fasteoi   acpi
 12:     211150       5740   IO-APIC-edge      i8042
 16:     198039       1962   IO-APIC-fasteoi   uhci_hcd:usb3, uhci_hcd:usb6
 19:       3215       3280   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
 23:         35         19   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb4
 40:      40417      42285   PCI-MSI-edge      ahci
 41:      10830      12722   PCI-MSI-edge      i915@pci:0000:00:02.0
 42:     589730        723   PCI-MSI-edge      iwlagn
 43:        108        138   PCI-MSI-edge      hda_intel
 44:          0          0   PCI-MSI-edge      eth0
NMI:        482        468   Non-maskable interrupts
LOC:    1237947    1227637   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:        482        468   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:      64849      73477   Rescheduling interrupts
CAL:       5996       5750   Function call interrupts
TLB:      20123      23138   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:         26         26   Machine check polls
ERR:          3
MIS:          0
[andy@ace trunk]$ 

With 2.6.37-0.1.rc2.git0.fc15.x86_64, I've completed 3 suspend/resume cycles and so far no choppy mouse, so maybe already fixed in GIT.  Please give me a day or two to 100% confirm it's fixed and I'll report back.

Here is some output from messages on 2.6.37-0.1.rc2.git0.fc15.x86_64 after the suspend/resume cycles, just in-case it helps, but like I said, so far so good!

Thanks Kyle!

Nov 20 21:26:08 ace kernel: [  247.712154] uhci_hcd 0000:00:1d.1: PCI INT B disabled
Nov 20 21:26:08 ace kernel: [  247.712279] uhci_hcd 0000:00:1a.0: PCI INT A disabled
Nov 20 21:26:08 ace kernel: [  247.712370] uhci_hcd 0000:00:1d.0: PCI INT A disabled
Nov 20 21:26:08 ace kernel: [  247.723113] ehci_hcd 0000:00:1a.7: PCI INT D disabled
Nov 20 21:26:09 ace kernel: [  248.022993] input: Microsoft Bluetooth Notebook Mouse 5000 as /devices/pci0000:00/0000:00:1d.2/usb6/6-2/6-2:1.0/bluetooth/hci0/hci0:11/input12
Nov 20 21:26:09 ace kernel: [  248.025176] generic-bluetooth 0005:045E:0700.0003: input,hidraw0: BLUETOOTH HID v1.00 Mouse [Microsoft Bluetooth Notebook Mouse 5000] on 00:14:A4:FC:19:3A
Nov 20 21:26:09 ace kernel: [  248.715080] ehci_hcd 0000:00:1d.7: PCI INT A disabled
Nov 20 21:26:19 ace kernel: [  258.705682] btusb_intr_complete: hci0 urb ffff88009f76e900 failed to resubmit (1)
Nov 20 21:26:19 ace kernel: [  258.706679] btusb_bulk_complete: hci0 urb ffff88009f76e600 failed to resubmit (1)
Nov 20 21:26:19 ace kernel: [  258.707679] btusb_bulk_complete: hci0 urb ffff88009f76e780 failed to resubmit (1)
Nov 20 21:26:22 ace kernel: [  261.712198] uhci_hcd 0000:00:1d.2: PCI INT D disabled
Nov 20 21:26:23 ace kernel: [  261.807150] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
Nov 20 21:26:23 ace kernel: [  261.807217] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Nov 20 21:26:23 ace kernel: [  261.807277] uhci_hcd 0000:00:1a.0: PCI INT A disabled
Nov 20 21:26:39 ace kernel: [  278.709832] btusb_intr_complete: hci0 urb ffff8800971f80c0 failed to resubmit (1)
Nov 20 21:26:39 ace kernel: [  278.710828] btusb_bulk_complete: hci0 urb ffff8800971f8300 failed to resubmit (1)
Nov 20 21:26:39 ace kernel: [  278.711829] btusb_bulk_complete: hci0 urb ffff8800971f8f00 failed to resubmit (1)
Nov 20 21:26:42 ace kernel: [  281.712159] pci_pm_runtime_suspend(): hcd_pci_runtime_suspend+0x0/0x43 returns -16
Comment 6 Andy Lawrence 2010-11-21 17:29:59 EST
Looks like spoke too soon, on the 4th suspend/resume cycle the choppy mouse came back.

/proc/interrupts from 2.6.37-0.1.rc2.git0.fc15.x86_64

           CPU0       CPU1       
  0:    2728964       7904   IO-APIC-edge      timer
  1:       2023         62   IO-APIC-edge      i8042
  8:          0          1   IO-APIC-edge      rtc0
  9:      15203        596   IO-APIC-fasteoi   acpi
 12:     100774       1884   IO-APIC-edge      i8042
 16:     197912       2089   IO-APIC-fasteoi   uhci_hcd:usb3, uhci_hcd:usb6
 19:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
 23:         82         45   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb4
 40:     117322       5669   PCI-MSI-edge      ahci
 41:       7556      21168   PCI-MSI-edge      i915@pci:0000:00:02.0
 42:     779653        657   PCI-MSI-edge      iwlagn
 43:        117        171   PCI-MSI-edge      hda_intel
 44:          0          0   PCI-MSI-edge      eth0
NMI:         12        524   Non-maskable interrupts
LOC:    1337511    2071033   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:         12        524   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:     177447     168409   Rescheduling interrupts
CAL:       7878       7200   Function call interrupts
TLB:      27494      76154   TLB shootdowns
TRM:     613395     613395   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:         39         37   Machine check polls
ERR:         11
MIS:          0

I'm happy to boot a debug kernel or whatever is necessary to help.

Thanks
Comment 7 Kyle McMartin 2010-11-24 15:41:28 EST
Thanks!

Can you attach the output of lsusb -t/lsusb -v as root from before a suspend?

I suspect one of your USB drivers is probably not disabling interrupts correctly, so the handler goes away, but the interrupt is still occurring.

regards, Kyle
Comment 8 Kyle McMartin 2010-11-24 16:14:32 EST
Oh, hmm, I wonder if it's bluetooth. Can you try disabling bluetooth before you suspend and let me know if that helps any?
Comment 9 Andy Lawrence 2010-11-24 20:16:04 EST
See attached lsusb.log for output requested above.

I'm about 99.999% sure this was also happening on my USB mouse, I'll confirm.

me, leaves to suspend/resume with bluetooth disabled.
Comment 10 Andy Lawrence 2010-11-24 20:16:32 EST
Created attachment 462791 [details]
lsusb output
Comment 11 Andy Lawrence 2010-11-24 20:28:14 EST
Created attachment 462792 [details]
lsusb with bluetooth enabled this time
Comment 12 Andy Lawrence 2010-11-24 20:31:16 EST
I don't normally use the trackpad.  But since filling this bug I've been using it more.  I can say 100% that they trackpad does not work well, even before a suspend/resume on 2.6.36 and above.  Boot back to 2.6.35 and it works great everytime.
Comment 13 Andy Lawrence 2010-11-25 20:08:42 EST
Took 10 or cycles but it finally happened with bluetooth disabled and a normal USB mouse.

Incase this helps again, this is without bluetooth:

[root@ace tools]# cat /proc/interrupts 
           CPU0       CPU1       
  0:    6231250      64117   IO-APIC-edge      timer
  1:       4126        466   IO-APIC-edge      i8042
  8:          1          0   IO-APIC-edge      rtc0
  9:      33373         98   IO-APIC-fasteoi   acpi
 12:     205083         78   IO-APIC-edge      i8042
 16:     299901        100   IO-APIC-fasteoi   uhci_hcd:usb3, uhci_hcd:usb6
 19:          3          1   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
 23:     240318       5216   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb4
 40:     213811       8476   PCI-MSI-edge      ahci
 41:      15301      35795   PCI-MSI-edge      i915@pci:0000:00:02.0
 42:    1740951        752   PCI-MSI-edge      iwlagn
 43:        119        292   PCI-MSI-edge      hda_intel
 44:          0          0   PCI-MSI-edge      eth0
NMI:         35       1342   Non-maskable interrupts
LOC:    3429414    5044312   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:         35       1342   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:     274323     351095   Rescheduling interrupts
CAL:      16559      17229   Function call interrupts
TLB:     102183     131453   TLB shootdowns
TRM:     338174     338170   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:         82         78   Machine check polls
ERR:         21
MIS:          0
[root@ace tools]# 

Thanks
Comment 14 Kyle McMartin 2010-11-26 09:04:02 EST
Damn, thanks, I'm not sure what to make of it.
Comment 15 Andy Lawrence 2010-11-26 09:20:51 EST
Yeah!  I'm here to help, not sure what else to provide.  I can get you in while its happening if that would do any good?
Comment 16 Kyle McMartin 2010-11-26 12:00:54 EST
I think there's something deeply wrong with how uhci is handling itself... Building a test kernel for you now, will update with the URL in a bit. (There's a similar thread about this issue (well, an IRQ flapping on UHCI that's causing a hang on shutdown, which would be similar to suspend/resume) on linux-kernel that has me curious.
Comment 17 Kyle McMartin 2010-11-26 12:08:34 EST
http://kyle.fedorapeople.org/kernel/2.6.36.1-9.bz645968.1.fc15/

Does this one help at all?
Comment 18 Andy Lawrence 2010-11-26 21:02:43 EST
Thanks Kyle, currently loaded and testing.  When/if the choppyness occurs, any particular output would you like to see?

Thanks
Comment 19 Kyle McMartin 2010-11-26 23:19:00 EST
Honestly I don't know, I'm hoping it won't occur. :)
Comment 20 Andy Lawrence 2010-11-27 17:10:45 EST
No such luck!  Here are a few logs that might help, these are post resume when choppy, kernel 2.6.36.1-9.bz645968.1.fc15.x86_64.

Thanks Kyle.

Nov 27 17:02:17 ace kernel: [37582.320520] uhci_hcd 0000:00:1d.2: PCI INT D disabled
Nov 27 17:02:17 ace kernel: [37582.320557] uhci_hcd 0000:00:1d.1: PCI INT B disabled
Nov 27 17:02:17 ace kernel: [37582.320571] uhci_hcd 0000:00:1d.0: PCI INT A disabled
Nov 27 17:02:17 ace kernel: [37582.320691] ehci_hcd 0000:00:1a.7: PCI INT D disabled
Nov 27 17:02:17 ace kernel: [37582.320706] uhci_hcd 0000:00:1a.0: PCI INT A disabled
Nov 27 17:02:17 ace kernel: [37582.331077] ehci_hcd 0000:00:1d.7: PCI INT A disabled
Nov 27 17:02:17 ace kernel: [37582.422119] HDA Intel 0000:00:1b.0: PCI INT A disabled
Nov 27 17:02:17 ace kernel: [37582.764049] PM: suspend of devices complete after 500.406 msecs
Nov 27 17:02:17 ace kernel: [37582.786100] PM: late suspend of devices complete after 22.043 msecs
Nov 27 17:02:17 ace kernel: [37582.786529] ACPI: Preparing to enter system sleep state S3
Nov 27 17:02:17 ace kernel: [37582.786676] PM: Saving platform NVS memory
Nov 27 17:02:17 ace kernel: [37582.815649] Disabling non-boot CPUs ...
Nov 27 17:02:17 ace kernel: [37582.817047] CPU 1 is now offline
Nov 27 17:02:17 ace kernel: [37582.817051] SMP alternatives: switching to UP code
Nov 27 17:02:17 ace kernel: [37582.823653] PM: Restoring platform NVS memory
Nov 27 17:02:17 ace kernel: [37582.823653] Enabling non-boot CPUs ...
Nov 27 17:02:17 ace kernel: [37582.823653] SMP alternatives: switching to SMP code
Nov 27 17:02:17 ace kernel: [37582.829891] Booting Node 0 Processor 1 APIC 0x1
Nov 27 17:02:17 ace kernel: [37582.900297] NMI watchdog enabled, takes one hw-pmu counter.
Nov 27 17:02:17 ace kernel: [37582.901072] CPU1 is up
Nov 27 17:02:17 ace kernel: [37582.901800] ACPI: Waking up from system sleep state S3
Nov 27 17:02:17 ace kernel: [37583.035976] PM: early resume of devices complete after 1.324 msecs
Nov 27 17:02:17 ace kernel: [37583.037192] [drm:init_ring_common] *ERROR* render ring head not reset to zero ctl 00000000 head 02001000 tail 00000000 start 02001000
Nov 27 17:02:17 ace kernel: [37583.037200] [drm:init_ring_common] *ERROR* render ring head forced to zero ctl 00000000 head 00000000 tail 00000000 start 02001000
Nov 27 17:02:17 ace kernel: [37583.054447] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Nov 27 17:02:17 ace kernel: [37583.054489] usb usb3: root hub lost power or was reset
Nov 27 17:02:17 ace kernel: [37583.054508] ehci_hcd 0000:00:1a.7: PCI INT D -> GSI 19 (level, low) -> IRQ 19
Nov 27 17:02:17 ace kernel: [37583.054564] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
Nov 27 17:02:17 ace kernel: [37583.055205] pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
Nov 27 17:02:17 ace kernel: [37583.055226] pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
Nov 27 17:02:17 ace kernel: [37583.055244] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
Nov 27 17:02:17 ace kernel: [37583.055285] usb usb4: root hub lost power or was reset
Nov 27 17:02:17 ace kernel: [37583.055302] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
Nov 27 17:02:17 ace kernel: [37583.055342] usb usb5: root hub lost power or was reset
Nov 27 17:02:17 ace kernel: [37583.055358] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
Nov 27 17:02:17 ace kernel: [37583.055397] usb usb6: root hub lost power or was reset
Nov 27 17:02:17 ace kernel: [37583.055413] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
Nov 27 17:02:17 ace kernel: [37583.055927] sd 0:0:0:0: [sda] Starting disk
Nov 27 17:02:17 ace kernel: [37583.184096] atl1c 0000:01:00.0: MAC state machine can't be idle since disabled for 10ms second
Nov 27 17:02:17 ace kernel: [37583.184153] atl1c 0000:01:00.0: Error get phy ID
Nov 27 17:02:17 ace kernel: [37583.278046] usb 2-5: reset high speed USB device using ehci_hcd and address 2
Nov 27 17:02:17 ace kernel: [37583.362054] ata2: SATA link down (SStatus 0 SControl 300)
Nov 27 17:02:17 ace kernel: [37583.364045] ata5: SATA link down (SStatus 0 SControl 300)
Nov 27 17:02:17 ace kernel: [37583.366054] ata6: SATA link down (SStatus 0 SControl 300)
Nov 27 17:02:17 ace kernel: [37583.644037] usb 3-1: reset low speed USB device using uhci_hcd and address 2
Nov 27 17:02:17 ace kernel: [37585.657137] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov 27 17:02:17 ace kernel: [37585.663079] ata1.00: configured for UDMA/133
Nov 27 17:02:17 ace kernel: [37585.694024] PM: resume of devices complete after 2657.955 msecs
Nov 27 17:02:17 ace kernel: [37585.694572] Restarting tasks ... done.


[root@ace andy]# cat /proc/interrupts 
           CPU0       CPU1       
  0:   10318896      25625   IO-APIC-edge      timer
  1:       5584         18   IO-APIC-edge      i8042
  8:          1          0   IO-APIC-edge      rtc0
  9:      56820         96   IO-APIC-fasteoi   acpi
 12:      81702         87   IO-APIC-edge      i8042
 16:     299984         17   IO-APIC-fasteoi   uhci_hcd:usb3, uhci_hcd:usb6
 19:          1          1   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
 23:      36732         22   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb4
 40:     143256       5478   PCI-MSI-edge      ahci
 41:      10617      41510   PCI-MSI-edge      i915@pci:0000:00:02.0
 42:        107        379   PCI-MSI-edge      hda_intel
 43:    2739358        645   PCI-MSI-edge      iwlagn
 44:          0          0   PCI-MSI-edge      eth0
NMI:         18        913   Non-maskable interrupts
LOC:    1906411    5049332   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:         18        913   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:     144250     182852   Rescheduling interrupts
CAL:      19833      18614   Function call interrupts
TLB:      33940      82035   TLB shootdowns
TRM:     197146     197140   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        133        129   Machine check polls
ERR:         13
MIS:          0
[root@ace andy]# 

[37582.320520] uhci_hcd 0000:00:1d.2: PCI INT D disabled
[37582.320557] uhci_hcd 0000:00:1d.1: PCI INT B disabled
[37582.320571] uhci_hcd 0000:00:1d.0: PCI INT A disabled
[37582.320691] ehci_hcd 0000:00:1a.7: PCI INT D disabled
[37582.320706] uhci_hcd 0000:00:1a.0: PCI INT A disabled
[37582.331077] ehci_hcd 0000:00:1d.7: PCI INT A disabled
[37582.422119] HDA Intel 0000:00:1b.0: PCI INT A disabled
[37582.422172] ACPI handle has no context!
[37582.764049] PM: suspend of devices complete after 500.406 msecs
[37582.786100] PM: late suspend of devices complete after 22.043 msecs
[37582.786529] ACPI: Preparing to enter system sleep state S3
[37582.786676] PM: Saving platform NVS memory
[37582.815649] Disabling non-boot CPUs ...
[37582.817047] CPU 1 is now offline
[37582.817051] SMP alternatives: switching to UP code
[37582.823653] Back to C!
[37582.823653] PM: Restoring platform NVS memory
[37582.823653] Enabling non-boot CPUs ...
[37582.823653] SMP alternatives: switching to SMP code
[37582.829891] Booting Node 0 Processor 1 APIC 0x1
[37582.900297] NMI watchdog enabled, takes one hw-pmu counter.
[37582.901072] CPU1 is up
[37582.901800] ACPI: Waking up from system sleep state S3
[37583.034685] agpgart-intel 0000:00:00.0: restoring config space at offset 0x1 (was 0x20900006, writing 0x30900006)
[37583.034712] i915 0000:00:02.0: restoring config space at offset 0x6 (was 0xc, writing 0xc000000c)
[37583.034722] i915 0000:00:02.0: restoring config space at offset 0x1 (was 0x900007, writing 0x900407)
[37583.034797] uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2900005, writing 0x2900001)
[37583.034855] ehci_hcd 0000:00:1a.7: restoring config space at offset 0x1 (was 0x2900006, writing 0x2900002)
[37583.034916] HDA Intel 0000:00:1b.0: restoring config space at offset 0x4 (was 0x4, writing 0xd4500004)
[37583.034924] HDA Intel 0000:00:1b.0: restoring config space at offset 0x3 (was 0x0, writing 0x10)
[37583.034933] HDA Intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100004, writing 0x100002)
[37583.034973] pci 0000:00:1c.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff)
[37583.034989] pci 0000:00:1c.0: restoring config space at offset 0x9 (was 0x10001, writing 0xd131d041)
[37583.034997] pci 0000:00:1c.0: restoring config space at offset 0x8 (was 0x0, writing 0xd440d350)
[37583.035015] pci 0000:00:1c.0: restoring config space at offset 0x7 (was 0x20000000, writing 0x20002020)
[37583.035028] pci 0000:00:1c.0: restoring config space at offset 0x3 (was 0x810000, writing 0x810010)
[37583.035038] pci 0000:00:1c.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007)
[37583.035080] pci 0000:00:1c.3: restoring config space at offset 0xf (was 0x400, writing 0x4ff)
[37583.035096] pci 0000:00:1c.3: restoring config space at offset 0x9 (was 0x10001, writing 0xd231d141)
[37583.035104] pci 0000:00:1c.3: restoring config space at offset 0x8 (was 0x0, writing 0xd340d250)
[37583.035112] pci 0000:00:1c.3: restoring config space at offset 0x7 (was 0x20000000, writing 0x20001010)
[37583.035125] pci 0000:00:1c.3: restoring config space at offset 0x3 (was 0x810000, writing 0x810010)
[37583.035135] pci 0000:00:1c.3: restoring config space at offset 0x1 (was 0x100000, writing 0x100007)
[37583.035193] uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2900005, writing 0x2900001)
[37583.035242] uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2900005, writing 0x2900001)
[37583.035291] uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2900005, writing 0x2900001)
[37583.035347] ehci_hcd 0000:00:1d.7: restoring config space at offset 0x1 (was 0x2900006, writing 0x2900002)
[37583.035389] pci 0000:00:1e.0: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1)
[37583.035397] pci 0000:00:1e.0: restoring config space at offset 0x8 (was 0x0, writing 0xfff0)
[37583.035405] pci 0000:00:1e.0: restoring config space at offset 0x7 (was 0x22800000, writing 0x228000f0)
[37583.035421] pci 0000:00:1e.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007)
[37583.035533] ahci 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00007, writing 0x2b00407)
[37583.035699] atl1c 0000:01:00.0: restoring config space at offset 0xf (was 0x100, writing 0x10b)
[37583.035739] atl1c 0000:01:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x180000)
[37583.035810] iwlagn 0000:02:00.0: restoring config space at offset 0xf (was 0x100, writing 0x10b)
[37583.035851] iwlagn 0000:02:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xd2500004)
[37583.035868] iwlagn 0000:02:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100406)
[37583.035976] PM: early resume of devices complete after 1.324 msecs
[37583.036384] i915 0000:00:02.0: setting latency timer to 64
[37583.037192] [drm:init_ring_common] *ERROR* render ring head not reset to zero ctl 00000000 head 02001000 tail 00000000 start 02001000
[37583.037200] [drm:init_ring_common] *ERROR* render ring head forced to zero ctl 00000000 head 00000000 tail 00000000 start 02001000
[37583.054447] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[37583.054458] uhci_hcd 0000:00:1a.0: setting latency timer to 64
[37583.054489] usb usb3: root hub lost power or was reset
[37583.054508] ehci_hcd 0000:00:1a.7: PCI INT D -> GSI 19 (level, low) -> IRQ 19
[37583.054518] ehci_hcd 0000:00:1a.7: setting latency timer to 64
[37583.054564] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[37583.054574] HDA Intel 0000:00:1b.0: setting latency timer to 64
[37583.054628] HDA Intel 0000:00:1b.0: irq 42 for MSI/MSI-X
[37583.055205] pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[37583.055213] pci 0000:00:1c.0: setting latency timer to 64
[37583.055226] pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
[37583.055233] pci 0000:00:1c.3: setting latency timer to 64
[37583.055244] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
[37583.055255] uhci_hcd 0000:00:1d.0: setting latency timer to 64
[37583.055285] usb usb4: root hub lost power or was reset
[37583.055302] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[37583.055312] uhci_hcd 0000:00:1d.1: setting latency timer to 64
[37583.055342] usb usb5: root hub lost power or was reset
[37583.055358] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[37583.055368] uhci_hcd 0000:00:1d.2: setting latency timer to 64
[37583.055397] usb usb6: root hub lost power or was reset
[37583.055413] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
[37583.055423] ehci_hcd 0000:00:1d.7: setting latency timer to 64
[37583.055489] pci 0000:00:1e.0: setting latency timer to 64
[37583.055506] ahci 0000:00:1f.2: setting latency timer to 64
[37583.055927] sd 0:0:0:0: [sda] Starting disk
[37583.184096] atl1c 0000:01:00.0: MAC state machine can't be idle since disabled for 10ms second
[37583.184153] atl1c 0000:01:00.0: Error get phy ID
[37583.278046] usb 2-5: reset high speed USB device using ehci_hcd and address 2
[37583.362054] ata2: SATA link down (SStatus 0 SControl 300)
[37583.364045] ata5: SATA link down (SStatus 0 SControl 300)
[37583.366054] ata6: SATA link down (SStatus 0 SControl 300)
[37583.644037] usb 3-1: reset low speed USB device using uhci_hcd and address 2
[37585.657137] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[37585.663079] ata1.00: configured for UDMA/133
[37585.694024] PM: resume of devices complete after 2657.955 msecs
[37585.694569] PM: Finishing wakeup.
[37585.694572] Restarting tasks ... done.
[37585.695157] video LNXVIDEO:00: Restoring backlight state
[37586.142798] atl1c 0000:01:00.0: irq 44 for MSI/MSI-X
[37586.239244] atl1c 0000:01:00.0: Error get phy ID
Comment 21 Andy Lawrence 2010-11-27 17:12:13 EST
Sorry, one more thing, blueetooth was disabled on comment 20 above.

Thanks
Comment 22 Kyle McMartin 2010-11-29 09:06:09 EST
Hi Andy,

Can you try powertop2 and let me know if the USB devices are at 100% in the display? I have a feeling this is going to be related to bug https://bugzilla.redhat.com/show_bug.cgi?id=638912.

http://kyle.fedorapeople.org/powertop-1.13-1.20101126.1.fc15.x86_64.rpm

THanks!
 Kyle
Comment 23 Kyle McMartin 2010-11-29 09:12:07 EST
Can you

watch /sys/firmware/acpi/interrupts/gpe_all

and see how fast it is rising?

If its rising very quickly at runtime, poke at the gpe_ files and try to figure out which one it is?

regards, Kyle
Comment 24 Andy Lawrence 2010-11-29 16:31:40 EST
Created attachment 463586 [details]
powertop screenshot 1
Comment 25 Andy Lawrence 2010-11-29 16:32:03 EST
Created attachment 463587 [details]
powertop screenshot 2
Comment 26 Andy Lawrence 2010-11-29 16:35:39 EST
Kyle, the powertop screenshots were done with the system running normally, but they do show USB at 100%, let me know if you need the same after a choppy suspend/resume.

gpe_all is increasing right at 90 per minute. I will continue to poke around and see if I can figure out which one is doing it.

Also, this was done with kernel 2.6.36.1-9.bz645968.1.fc15.x86_64, let me know if it needs to be something else.

Thanks
Comment 27 Andy Lawrence 2010-11-29 16:42:33 EST
Looks like the only ones that are increasing are gpe18 and sci?
Comment 28 Kyle McMartin 2010-11-29 19:34:33 EST
Hrm, 90 per minute doesn't really quality as 'screaming' I don't think. :(
Darn.
Comment 29 Andy Lawrence 2010-12-02 17:23:05 EST
Just in case it helps, Dave asked for this output on Bug 638912, attached is for normal operation before suspend/resume.  I'll attach the same choppy next time it happens.

perf record -a -e workqueue:workqueue_execute_start sleep 1

Thanks
Comment 30 Andy Lawrence 2010-12-02 17:23:55 EST
Created attachment 464392 [details]
perf output normal no chop
Comment 31 Kyle McMartin 2010-12-02 23:48:08 EST
Thanks for being proactive, I was going to ask you for this next time I passed over my bugzilla browser tabs. :)
Comment 32 Andy Lawrence 2010-12-03 16:21:35 EST
Created attachment 464655 [details]
perf output while choppy
Comment 33 Kyle McMartin 2010-12-09 13:02:55 EST
Andy,

Any chance you could try booting with pcie_ports=compat and letting me know if it fixes it? It sounds somewhat related to #635813 and #638912.

regards, Kyle
Comment 34 Andy Lawrence 2010-12-09 18:58:36 EST
(In reply to comment #33)
> Andy,
> 
> Any chance you could try booting with pcie_ports=compat and letting me know if
> it fixes it? It sounds somewhat related to #635813 and #638912.
> 
> regards, Kyle

No luck with pcie_ports=compat, however this was with kernel kernel-2.6.37-0.rc4.git0.1.fc15.x86_64, let me know if I need to retest with kernel-2.6.36.1-9.bz645968.1.fc15.x86_64.

Thanks!
Comment 35 Andy Lawrence 2010-12-11 07:29:53 EST
I also tried 2.6.37-0.rc5.git2.1.bz638912.fc15.x86_64 with no success.
Comment 36 Andy Lawrence 2011-01-24 19:39:35 EST
After jumping back on the rawhide wagon I'v been unable to reproduce this, and it's been 50+ suspend/resume cycles!

Kyle appreciate all the help, closing this bug as I was the only culprit and its now gone.

Thanks
Andy
Comment 37 Andy Lawrence 2011-02-11 16:28:07 EST
This either never went away or has returned with the 2.6.38 kernels, filed upstream.

Thanks!
Comment 38 Jonathan Protzenko 2011-04-20 18:35:36 EDT
Hi,

1) does unloading the ehci_hcd module and loading it again solve your issue (modprobe -r ehci_hcd && modprobe ehci_hcd) and,
2) do the symptoms change in 2.6.38 and,
3) your last comment seems to implied you filed upstream, could you give a link to the kernel bugzilla?

Thanks!
Comment 39 Andy Lawrence 2011-04-20 18:47:08 EDT
Hi Jonathan

I will try #1 next time it happens, but I now have a nice new Sandy Bride laptop and don't use the old one too much.

The symptoms didn't seem to change in .38.

If you scroll to the top of this bug you'll find the external tracker box pointing to the upstream kernel tracker.

Are you having the same issue?
Comment 40 Jonathan Protzenko 2011-04-22 18:12:37 EDT
Yes, and I was able to pinpoint it down to usb autosuspend issues, which unloading/reloading the module fixes, so I thought this might help. The newer versions of the powertop utility allow you to disable usb autosuspend for your mouse. Might help for other people :-).

Note You need to log in before you can comment on or make changes to this bug.