Bug 839733 - "IRQ 19 might be stuck. Polling" entries in /var/log/messages
Summary: "IRQ 19 might be stuck. Polling" entries in /var/log/messages
Keywords:
Status: CLOSED DUPLICATE of bug 755956
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-07-12 16:52 UTC by Paweł Brodacki
Modified: 2012-07-12 19:08 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-12 18:16:54 UTC
Type: Bug
Embargoed:


Attachments
Output of dmesg (89.44 KB, text/plain)
2012-07-12 16:52 UTC, Paweł Brodacki
Output of cat /proc/interrupts (1.66 KB, text/plain)
2012-07-12 16:53 UTC, Paweł Brodacki
Output of lspci (2.55 KB, text/plain)
2012-07-12 16:53 UTC, Paweł Brodacki

Description Paweł Brodacki 2012-07-12 16:52:21 UTC
Created attachment 597846
Output of dmesg

Description of problem:
It seems that I have been bitten by the ASM1083 bug (described e.g. in this thread: https://lkml.org/lkml/2012/1/30/216). I was unable to find a Bugzilla entry for this chip or for the Asus E45M1-M Pro board, which carries this chip and which I bought.

Within an hour of booting, lines like the following start appearing in /var/log/messages:
IRQ 19 might be stuck.  Polling
After the first occurrence they recur at intervals ranging from a couple of seconds to a couple of hours. The frequency of the log entries seems to correlate with the amount of network traffic, which is plausible, as IRQ 19 services a network card.
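
To put numbers on how often the message recurs, the log can be scanned for the quoted line and the gaps between occurrences computed. The following is only a minimal sketch: it assumes the classic syslog prefix ("Jul 12 16:52:21 host kernel: ..."), which carries no year, so the year is passed in by hand. The function names are hypothetical, and the message pattern is taken from the single line quoted in this report.

```python
import re
from datetime import datetime

# Pattern based on the line quoted in this report; the exact kernel wording
# may differ between versions (assumption).
STUCK_RE = re.compile(r"IRQ (\d+) might be stuck")


def stuck_events(lines, year=2012):
    """Return (timestamp, irq) pairs for every matching syslog line."""
    events = []
    for line in lines:
        match = STUCK_RE.search(line)
        if not match:
            continue
        # Assumed classic syslog prefix: the first 15 characters are
        # "Mon DD HH:MM:SS" with no year.
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        events.append((stamp, int(match.group(1))))
    return events


def gaps_in_seconds(events):
    """Return the time between consecutive events, in seconds."""
    times = [stamp for stamp, _ in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    with open("/var/log/messages", errors="replace") as log:
        found = stuck_events(log)
    print(len(found), "occurrences; gaps (s):", gaps_in_seconds(found))
```

Comparing the gaps against interface traffic counters for the same period would be one way to check the suspected correlation with network load.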

Version-Release number of selected component (if applicable):


How reproducible:
Over the last week the bug has occurred at least once every couple of hours.


Steps to Reproduce:
1. Install a NIC in a PCI slot of an Asus E45M1-M Pro board.
2. Pass traffic through the NIC.
  
Actual results:
IRQ 19 might be stuck.  Polling
entries in /var/log/messages

Expected results:
No stuck interrupts reported.

Additional info:

The LKML thread points at the problem within the ASM1083 chip itself, so I do not expect miracles, but I'm going to wait for one anyhow. ;)

I would also like to request confirmation, that ditching the PCI NIC and replacing it with one using PCI Express bus should eliminate the problem.

I'm creating this Bugzilla entry also to help people decide when choosing hardware to buy. Voting with money works, and Asus E45M1-M Pro currently uses a faulty chip. My recommendation is to avoid this board and any other that uses the problematic ASM1083 chip.

Comment 1 Paweł Brodacki 2012-07-12 16:53:20 UTC
Created attachment 597851
Output of cat /proc/interrupts

Comment 2 Paweł Brodacki 2012-07-12 16:53:47 UTC
Created attachment 597853
Output of lspci

Comment 3 Josh Boyer 2012-07-12 18:16:54 UTC
(In reply to comment #0)
> Actual results:
> IRQ 19 might be stuck.  Polling
> entries in /var/log/messages
> 
> Expected results:
> No stuck interrupts reported.
> 
> Additional info:
> 
> The LKML thread points at the problem within the ASM1083 chip itself, so I
> do not expect miracles, but I'm going to wait for one anyhow. ;)

You already have the closest thing we've come to a fix for the issue.  We carry a patch called unhandled-irqs-switch-to-polling.patch which does the automatic switching of stuck interrupts to polling mode for just a bit and then goes back to regular operation.  That is why you see polling messages.  Without that patch, the kernel would mark IRQ 19 as stuck entirely and render everything that has that interrupt assigned to it useless.

Not a miracle, but at least your box remains reasonably functional.
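
The difference between the two behaviours described above can be illustrated with a toy model. This is not the kernel's code or the patch itself: the class name, the thresholds, and the tick-based timer are all hypothetical, and resetting the counter on a handled interrupt is a simplification. It only shows the idea of falling back to polling for a while instead of disabling the line for good.

```python
class IrqLine:
    """Toy model of one interrupt line; not the kernel's actual logic."""

    def __init__(self, unhandled_limit=10, polling_ticks=5, fallback=True):
        self.unhandled_limit = unhandled_limit  # hypothetical threshold
        self.polling_ticks = polling_ticks      # hypothetical polling duration
        self.fallback = fallback                # True models the patched kernel
        self.mode = "interrupt"
        self.unhandled = 0
        self.ticks_left = 0
        self.log = []

    def interrupt(self, handled):
        """Record one interrupt that a driver did or did not claim."""
        if self.mode != "interrupt":
            return  # interrupts are ignored while polling or disabled
        if handled:
            self.unhandled = 0  # simplification for this sketch
            return
        self.unhandled += 1
        if self.unhandled < self.unhandled_limit:
            return
        self.unhandled = 0
        if self.fallback:
            # Patched behaviour: poll the handlers for a while.
            self.mode = "polling"
            self.ticks_left = self.polling_ticks
            self.log.append("might be stuck. Polling")
        else:
            # Unpatched behaviour: the line stays off.
            self.mode = "disabled"
            self.log.append("disabled")

    def tick(self):
        """Advance the timer; polling ends after polling_ticks ticks."""
        if self.mode == "polling":
            self.ticks_left -= 1
            if self.ticks_left == 0:
                self.mode = "interrupt"
```

In this model a line created with fallback=True returns to "interrupt" mode once the polling period has elapsed, while one created with fallback=False stays "disabled", which corresponds to every device on that interrupt becoming unusable.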

> I would also like to request confirmation, that ditching the PCI NIC and
> replacing it with one using PCI Express bus should eliminate the problem.

Quite possibly, yes.  I don't see anything else in your dmesg that would be behind the ASM bridge.
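
One way to double-check which devices sit behind a given bridge, besides reading dmesg, is to look at sysfs, where devices on a bridge's secondary bus appear as subdirectories of the bridge's own device directory. The sketch below assumes that layout; the function name is hypothetical, and the bridge address in the usage comment is a placeholder to be replaced with the one lspci reports for the ASM1083.

```python
import os
import re

# Full PCI address: domain:bus:device.function, e.g. 0000:04:01.0
PCI_ADDR_RE = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def devices_behind(bridge_addr, sysfs_root="/sys/bus/pci/devices"):
    """List PCI addresses found directly under a bridge's sysfs directory."""
    bridge_dir = os.path.join(sysfs_root, bridge_addr)
    return sorted(
        name for name in os.listdir(bridge_dir) if PCI_ADDR_RE.match(name)
    )


# Usage (placeholder address, not taken from this report):
# print(devices_behind("0000:03:00.0"))
```

If the only address listed is the PCI NIC, moving to a PCI Express card would take the bridge out of the interrupt path altogether.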

> I'm creating this Bugzilla entry also to help people decide when choosing
> hardware to buy. Voting with money works, and Asus E45M1-M Pro currently
> uses a faulty chip. My recommendation is to avoid this board and any other
> that uses the problematic ASM1083 chip.

We already have a bug that covered this and leaving this one open isn't really going to change anything.  We'll duplicate this bug to the original.  We appreciate the report though.

*** This bug has been marked as a duplicate of bug 755956 ***

Comment 4 Paweł Brodacki 2012-07-12 18:40:35 UTC
Thanks for the explanation.

