Bug 839733 - "IRQ 19 might be stuck. Polling" entries in /var/log/messages
Status: CLOSED DUPLICATE of bug 755956
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: x86_64 Linux
Priority: unspecified
Severity: low
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-07-12 12:52 EDT by Paweł Brodacki
Modified: 2012-07-12 15:08 EDT
Last Closed: 2012-07-12 14:16:54 EDT
Type: Bug


Attachments
Output of dmesg (89.44 KB, text/plain), 2012-07-12 12:52 EDT, Paweł Brodacki
Output of cat /proc/interrupts (1.66 KB, text/plain), 2012-07-12 12:53 EDT, Paweł Brodacki
Output of lspci (2.55 KB, text/plain), 2012-07-12 12:53 EDT, Paweł Brodacki

Description Paweł Brodacki 2012-07-12 12:52:21 EDT
Created attachment 597846
Output of dmesg

Description of problem:
It seems that I have been bitten by the ASM1083 bug (described, for example, in this thread: https://lkml.org/lkml/2012/1/30/216). I was unable to find a Bugzilla entry for this chip or for the Asus E45M1-M Pro board, which carries this chip and which I bought.

Within an hour of boot, lines like the following start appearing in /var/log/messages:
IRQ 19 might be stuck.  Polling
After the first occurrence they reappear at intervals ranging from a couple of seconds to a couple of hours. The frequency of the log entries seems to correlate with the amount of network traffic, which is plausible, as IRQ 19 services a network card.
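
To put numbers on that correlation, the gaps between consecutive entries can be measured. The following is a minimal sketch, assuming the traditional syslog timestamp format at the start of each line; the sample lines, the host name and the `stuck_irq_gaps` helper are made up for illustration and are not taken from the attached logs.

```python
import re
from datetime import datetime

# Message as reported above; \s+ tolerates the double space before "Polling".
PATTERN = re.compile(r"IRQ (\d+) might be stuck\.\s+Polling")


def stuck_irq_gaps(lines, irq=19, year=2012):
    """Return the gaps in seconds between consecutive 'stuck' entries for irq."""
    stamps = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None or int(match.group(1)) != irq:
            continue
        # Classic syslog lines start with e.g. "Jul 12 12:52:21" and carry no
        # year, so one is supplied (assumption: all entries are from one year).
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        stamps.append(stamp)
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# Hypothetical sample lines, not taken from the attached logs.
sample = [
    "Jul 12 10:00:00 host kernel: [ 3600.000000] IRQ 19 might be stuck.  Polling",
    "Jul 12 10:00:07 host kernel: [ 3607.000000] IRQ 19 might be stuck.  Polling",
    "Jul 12 10:01:00 host kernel: [ 3660.000000] usb 1-1: new device",
    "Jul 12 12:30:07 host kernel: [12607.000000] IRQ 19 might be stuck.  Polling",
]
gaps = stuck_irq_gaps(sample)
```

Comparing these gaps with traffic counters for the same period (for example from the NIC statistics) would show whether busy periods really produce more entries.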

Version-Release number of selected component (if applicable):


How reproducible:
For the last week the bug has occurred at least every couple of hours.


Steps to Reproduce:
1. Install a NIC into the PCI slot of an Asus E45M1-M Pro board.
2. Pass traffic through the NIC.

Actual results:
IRQ 19 might be stuck.  Polling
entries in /var/log/messages

Expected results:
No stuck interrupts reported.

Additional info:

The LKML thread points at the problem within the ASM1083 chip itself, so I do not expect miracles, but I'm going to wait for one anyhow. ;)

I would also like to request confirmation that ditching the PCI NIC and replacing it with one using the PCI Express bus should eliminate the problem.

I'm creating this Bugzilla entry also to help people decide when choosing hardware to buy. Voting with money works, and Asus E45M1-M Pro currently uses a faulty chip. My recommendation is to avoid this board and any other that uses the problematic ASM1083 chip.
Comment 1 Paweł Brodacki 2012-07-12 12:53:20 EDT
Created attachment 597851
Output of cat /proc/interrupts
Comment 2 Paweł Brodacki 2012-07-12 12:53:47 EDT
Created attachment 597853
Output of lspci
Comment 3 Josh Boyer 2012-07-12 14:16:54 EDT
(In reply to comment #0)
> Actual results:
> IRQ 19 might be stuck.  Polling
> entries in /var/log/messages
> 
> Expected results:
> No stuck interrupts reported.
> 
> Additional info:
> 
> The LKML thread points at the problem within the ASM1083 chip itself, so I
> do not expect miracles, but I'm going to wait for one anyhow. ;)

You already have the closest thing we've come to a fix for the issue.  We carry a patch called unhandled-irqs-switch-to-polling.patch, which automatically switches a stuck interrupt to polling mode for a short while and then goes back to regular operation.  That is why you see the polling messages.  Without that patch, the kernel would mark IRQ 19 as stuck for good, and every device assigned to that interrupt would become useless.

Not a miracle, but at least your box remains reasonably functional.
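
The behaviour described above can be pictured with a toy state machine. This is only an illustrative sketch and not the code of unhandled-irqs-switch-to-polling.patch: the `IrqLine` class, both thresholds and the tick-based timing are assumptions chosen to keep the example small.

```python
class IrqLine:
    """Toy model of one interrupt line; not the kernel's real data structure."""

    def __init__(self, stuck_threshold=3, poll_ticks=2):
        self.stuck_threshold = stuck_threshold  # hypothetical value
        self.poll_ticks = poll_ticks            # hypothetical value
        self.unhandled = 0      # consecutive interrupts nobody claimed
        self.polling_left = 0   # remaining ticks to spend in polling mode
        self.log = []

    @property
    def mode(self):
        return "polling" if self.polling_left > 0 else "interrupt"

    def interrupt(self, handled):
        """Record one interrupt; a run of unhandled ones triggers polling."""
        if self.mode == "polling":
            return  # in this model, interrupts are ignored while polling
        self.unhandled = 0 if handled else self.unhandled + 1
        if self.unhandled >= self.stuck_threshold:
            self.log.append("IRQ might be stuck.  Polling")
            self.polling_left = self.poll_ticks
            self.unhandled = 0

    def tick(self):
        """One timer tick: while polling, count down towards normal operation."""
        if self.polling_left > 0:
            self.polling_left -= 1


line = IrqLine()
for _ in range(3):
    line.interrupt(handled=False)   # a burst of unhandled interrupts
mode_after_burst = line.mode        # the line has fallen back to polling
line.tick()
line.tick()
mode_after_ticks = line.mode        # polling period over, interrupts again
```

In this model each fallback to polling leaves exactly one log entry, which matches the way one message appears per episode rather than one per interrupt.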

> I would also like to request confirmation that ditching the PCI NIC and
> replacing it with one using the PCI Express bus should eliminate the problem.

Quite possibly, yes.  I don't see anything else in your dmesg that would be behind the ASM bridge.
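
A related check is to see which drivers are registered on IRQ 19 in /proc/interrupts. Below is a minimal sketch, assuming the layout used by kernels of this era (per-CPU counters, then a one-token controller name such as IO-APIC-fasteoi, then a comma-separated list of names); the `devices_on_irq` helper and the sample text are hypothetical and are not copied from the attachment. It shows interrupt sharing only, not which devices sit behind the bridge.

```python
def devices_on_irq(proc_interrupts_text, irq):
    """Return the names registered on the given IRQ, or [] if it is absent."""
    lines = proc_interrupts_text.splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters and the controller name (assumed to be a
        # single token); what remains is the comma-separated list of names.
        names = " ".join(fields[cpu_count + 1:])
        return [name.strip() for name in names.split(",") if name.strip()]
    return []


# Hypothetical sample, not the content of the attached file.
sample = """\
           CPU0       CPU1
  0:         45          0   IO-APIC-edge      timer
 19:     123456          0   IO-APIC-fasteoi   ehci_hcd:usb2, eth1
NMI:          0          0   Non-maskable interrupts
"""
shared = devices_on_irq(sample, 19)
```

On a live system the text would come from reading /proc/interrupts; every name returned for IRQ 19 is affected while that line is being polled.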

> I'm creating this Bugzilla entry also to help people decide when choosing
> hardware to buy. Voting with money works, and Asus E45M1-M Pro currently
> uses a faulty chip. My recommendation is to avoid this board and any other
> that uses the problematic ASM1083 chip.

We already have a bug that covers this, and leaving this one open isn't really going to change anything.  We'll mark this bug as a duplicate of the original.  We appreciate the report, though.

*** This bug has been marked as a duplicate of bug 755956 ***
Comment 4 Paweł Brodacki 2012-07-12 14:40:35 EDT
Thanks for the explanation.
