Bug 433667 - [RHEL5 U2] Kernel forcedeth driver message
Summary: [RHEL5 U2] Kernel forcedeth driver message
Keywords:
Status: CLOSED DUPLICATE of bug 428696
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
: ---
Assignee: Andy Gospodarek
QA Contact: Martin Jenner
URL: http://rhts.redhat.com/cgi-bin/rhts/t...
Whiteboard:
Depends On:
Blocks:
Reported: 2008-02-20 19:11 UTC by Jeff Burke
Modified: 2014-06-29 22:59 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-09-10 16:48:46 UTC
Target Upstream Version:
Embargoed:


Attachments
Full log (639.75 KB, text/plain)
2008-02-20 19:11 UTC, Jeff Burke
RHEL5.2 Forcedeth Failure (157.89 KB, text/plain)
2008-07-18 09:29 UTC, Marcin Kowalski

Description Jeff Burke 2008-02-20 19:11:13 UTC
Description of problem:
 When running on hp-xw9400-01.rhts.redhat.com, the system spews these messages:
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.

Version-Release number of selected component (if applicable):
kernel-debug
2.6.18-81.el5
2.6.18-53.1.10.el5

How reproducible:
 Always

Steps to Reproduce:
1. Install RHEL5.U1 on hp-xw9400-01.rhts.redhat.com 
2. Install the kernel-debug variant and reboot
  
Actual results:
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.
eth0: too many iterations (6) in nv_nic_irq.
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Got tx_timeout. irq: 00000036
eth0: Ring at 36b4e000
eth0: Dumping tx registers
  0: 00002036 000000ff 00000003 007f03ca 00000000 00000000 00000000 00000000
 20: 00000000 00000000 00000000 00000000 00000001 00000100 00000000 00000000
 40: 0420e20e 0000a855 00002e20 00000000 00000000 00000000 00000000 00000000
 60: 00000000 00000000 00000000 0000ffff 0000ffff 0000ffff 0000ffff 00000000
 80: 003b0f3c 40000001 00000000 007f0088 0000061c 00000001 00000000 00007fa3
 a0: 0014050f 00000016 26fe1800 000040bd 00000001 00000000 00000000 00000000
 c0: 10000002 00000001 00000001 00000001 00000001 00000001 00000001 00000001
 e0: 00000001 00000001 00000001 00000001 00000001 00000001 00000001 00000001
100: 36b4e800 36b4e000 007f00ff 00008000 00010032 00000000 0000001b 36b4f110
120: 36b4e530 2c254dc0 a000ffef 00000000 00000000 36b4f11c 36b4e530 0fe08000
140: 00304120 80c02600 00000000 00000000 00000000 00000000 00000000 00000000
160: 00000000 00000000 00000000 00000000 01ff0080 0000c000 00000000 00000000
180: 00000006 00000008 00947969 00008103 0000000a 00003800 00000080 0000b983
1a0: 0000000e 00000008 0094796d 00008103 0000000a 00003800 000000b0 0000b9a3
1c0: 0000000e 00000008 0094796d 00008103 0000000a 00003800 000000b0 0000b9a3
1e0: 0000000e 00000008 0094796d 00008103 0000000a 00003800 000000b0 0000b9a3
200: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
220: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
240: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
260: 00000000 00000000 fe027001 00000100 00000011 000000a3 fe027011 000001a3
280: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
2a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
2c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
2e0: 00000000 00000000 00000000 00000000 00000000 00000001 00000001 00000001
300: 80212000 00000000 00000000 00000000 00000000 00002000 00000000 00000000
320: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
340: 00000000 00000000 00000000 00000000 00000000 00000020 01442646 00000000
360: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
380: 00000000 00000000 00000000 00000000 00000000 00000000 00000002 00000000
3a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3e0: 06255300 00701365 00000000 00000000 00000032 00000000 00000000 00000000
400: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
420: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
440: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
460: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
480: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
4a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
4c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
4e0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
500: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
520: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
540: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
560: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
580: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5e0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
600: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
eth0: Dumping tx ring
000: 00000000 36d5492e 20000040 // 00000000 36d57456 20000040 // 00000000
1fd5a662 20000040 // 00000000 36d7434a 20000040
004: 00000000 36d5e96e 20000040 // 00000000 0a09e4d6 20000040 // 00000000
1df4a762 20000040 // 00000000 36d72d06 20000040
008: 00000000 36d57662 20000040 // 00000000 36d54516 20000040 // 00000000
36d59d06 20000040 // 00000000 36d524d6 20000040
00c: 00000000 36d4a03e 20000040 // 00000000 36d7496e 20000040 // 00000000
129bb30a 20000040 // 00000000 0a09e6e2 20000040
010: 00000000 36d7a722 20000040 // 00000000 36d7ab3a 20000040 // 00000000
1fd5a456 20000040 // 00000000 11fba34a 20000040
014: 00000000 36d5807e 20000040 // 00000000 0a09e8ee 20000040 // 00000000
129bb92e 20000040 // 00000000 0a09eafa 20000040
018: 00000000 36d58aba 20000040 // 00000000 129bbd46 20000040 // 00000000
129bb516 20000040 // 00000000 129bb722 20000040
01c: 00000000 129bb0fe 20000040 // 00000000 37d66762 20000040 // 00000000
36d5786e 20000040 // 00000000 36d778ae 20000040
020: 00000000 36d55d86 20000040 // 00000000 36d5b722 20000040 // 00000000
1df4ab7a 20000040 // 00000000 1df4a34a 20000040
024: 00000000 129bbb3a 20000040 // 00000000 1df4a556 20000040 // 00000000
36d594d6 20000040 // 00000000 1df4ad86 20000040
028: 00000000 36d5e13e 20000040 // 00000000 36d5eb7a 20000040 // 00000000
36d4e96e 20000040 // 00000000 36d4eb7a 20000040
02c: 00000000 36d7b96e 20000040 // 00000000 0a09e2ca 20000040 // 00000000
323aacc6 20000040 // 00000000 323aa6a2 20000040
030: 00000000 36d516a2 20000040 // 00000000 0cb3524a 20000040 // 00000000
0cb35662 20000040 // 00000000 36d55556 20000040
034: 00000000 36d720be 20000040 // 00000000 36d728ee 20000040 // 00000000
323aa496 20000040 // 00000000 36d73516 20000040
038: 00000000 36d7392e 20000040 // 00000000 36d5534a 20000040 // 00000000
36d5513e 20000040 // 00000000 323aa28a 20000040
03c: 00000000 12137c86 20000040 // 00000000 2113dd06 20000040 // 00000000
2113dafa 20000040 // 00000000 0cb3586e 20000040
040: 00000000 2113d6e2 20000040 // 00000000 2113d4d6 20000040 // 00000000
2113d8ee 20000040 // 00000000 36d5024a 20000040
044: 00000000 2113d0be 20000040 // 00000000 36d50c86 20000040 // 00000000
323aa07e 20000040 // 00000000 1213703e 20000040
048: 00000000 2113d2ca 20000040 // 00000000 29fc3b3a 20000040 // 00000000
29fc392e 20000040 // 00000000 29fc3d46 20000040
04c: 00000000 29fc3516 20000040 // 00000000 29fc330a 20000040 // 00000000
1fd5aa7a 20000040 // 00000000 1fd5ac86 20000040
050: 00000000 29fc3722 20000040 // 00000000 1fd5a86e 20000040 // 00000000
10a11d86 20000040 // 00000000 10a11b7a 20000040
054: 00000000 136c3c86 20000040 // 00000000 29fc30fe 20000040 // 00000000
1fd5a03e 20000040 // 00000000 10a11762 20000040
058: 00000000 136c386e 20000040 // 00000000 10a11556 20000040 // 00000000
10a1196e 20000040 // 00000000 10a1113e 20000040
05c: 00000000 136c3a7a 20000040 // 00000000 136c3456 20000040 // 00000000
136c324a 20000040 // 00000000 10a1134a 20000040
060: 00000000 2a987cc6 20000040 // 00000000 136c3662 20000040 // 00000000
13a50d06 20000040 // 00000000 136c303e 20000040
064: 00000000 13a508ee 20000040 // 00000000 13a506e2 20000040 // 00000000
13a50afa 20000040 // 00000000 13a502ca 20000040
068: 00000000 13a500be 20000040 // 00000000 1c3a4d46 20000040 // 00000000
1c3a4b3a 20000040 // 00000000 13a504d6 20000040
06c: 00000000 1c3a492e 20000040 // 00000000 2a98728a 20000040 // 00000000
2a987496 20000040 // 00000000 2a98707e 20000040
070: 00000000 2a9878ae 20000040 // 00000000 2a9876a2 20000040 // 00000000
1c3a430a 20000040 // 00000000 1c3a40fe 20000040
074: 00000000 1c3a4722 20000040 // 00000000 2a987aba 20000040 // 00000000
34e9ec86 20000040 // 00000000 34e9ea7a 20000040
078: 00000000 34e9e86e 20000040 // 00000000 34e9e662 20000040 // 00000000
34e9e03e 20000040 // 00000000 0a265aba 20000040
07c: 00000000 0a2658ae 20000040 // 00000000 0a2656a2 20000040 // 00000000
0a265cc6 20000040 // 00000000 0a26528a 20000040
080: 00000000 0a26507e 20000040 // 00000000 0a265496 20000040 // 00000000
1c3a4516 20000040 // 00000000 34e9e24a 20000040
084: 00000000 34e9e456 20000040 // 00000000 271f5afa 20000040 // 00000000
271f58ee 20000040 // 00000000 05b4fb3a 20000040
088: 00000000 05b4f92e 20000040 // 00000000 05b4f722 20000040 // 00000000
05b4fd46 20000040 // 00000000 05b4f0fe 20000040
08c: 00000000 05b4f30a 20000040 // 00000000 271f5d06 20000040 // 00000000
271f52ca 20000040 // 00000000 271f54d6 20000040
090: 00000000 2c254d86 20000040 // 00000000 0c128000 2000048e // 00000000
36d5e7fe 20000046 // 00000000 36d60346 00000000
094: 00000000 0c12810c 20000bdc // 00000000 36d5979e 00000000 // 00000000
0c128c5c 00000000 // 00000000 32f87000 200011ca
098: 00000000 37c8f9ea 00000000 // 00000000 32f87d54 00000000 // 00000000
334c1000 2000048e // 00000000 36d52386 00000000
09c: 00000000 334c119c 20000bdc // 00000000 36d58346 00000000 // 00000000
334c1cec 00000000 // 00000000 0783e000 200011ca
0a0: 00000000 36d7481e 00000000 // 00000000 0783ede4 00000000 // 00000000
33ca8000 2000048e // 00000000 36d59bb6 00000000
0a4: 00000000 33ca822c 20000bdc // 00000000 36d57b36 00000000 // 00000000
33ca8d7c 00000000 // 00000000 1790a000 200011ca
0a8: 00000000 36d51346 00000000 // 00000000 1790ae74 00000000 // 00000000
19c43000 2000048e // 00000000 36d4ab36 00000000
0ac: 00000000 19c432bc 20000bdc // 00000000 37d9b13a 00000000 // 00000000
19c43e0c 00000000 // 00000000 16eb3000 200011ca
0b0: 00000000 11fbaa2a 00000000 // 00000000 16eb3f04 00000000 // 00000000
12a6d000 2000048e // 00000000 36d4e406 00000000
0b4: 00000000 0c12810c 200005ee // 00000000 37c8fbf6 00000000 // 00000000
0c12810c 200005ee // 00000000 0cb35456 2000004f
0b8: 00000000 36d4ad42 00000000 // 00000000 0c12810c 200005ee // 00000000
36d50b36 00000000 // 00000000 0c12810c 200005ee
0bc: 00000000 36d520be 2000004f // 00000000 36d76512 00000000 // 00000000
0c12810c 200005ee // 00000000 12137456 20000040
0c0: 00000000 36d77cc6 2000004f // 00000000 36d77aba 20000040 // 00000000
36d57306 00000000 // 00000000 0c12810c 200005ee
0c4: 00000000 11fbad86 20000040 // 00000000 3738a516 20000040 // 00000000
36d50456 20000040 // 00000000 36d5bb3a 20000040
0c8: 00000000 36d7bb7a 20000066 // 00000000 36d5003e 2000005e // 00000000
36d588ae 20000040 // 00000000 37d66d86 20000040
0cc: 00000000 36d5fc86 20000040 // 00000000 36d4a662 20000066 // 00000000
37c5c07e 20000040 // 00000000 37d66556 20000040
0d0: 00000000 36d7707e 20000040 // 00000000 36d5e34a 20000040 // 00000000
36d5b92e 20000040 // 00000000 36d7bd86 20000040
0d4: 00000000 36d7b556 20000040 // 00000000 11fba13e 20000040 // 00000000
36d7686e 20000040 // 00000000 3738a30a 20000040
0d8: 00000000 36d55762 20000040 // 00000000 36d7b13e 20000040 // 00000000
0cb35a7a 20000040 // 00000000 36d60cc6 20000040
0dc: 00000000 021d5d46 20000040 // 00000000 021d5516 20000040 // 00000000
11fba762 20000040 // 00000000 0cb3503e 20000040
0e0: 00000000 11fba612 00000000 // 00000000 12a6d34c 200005ee // 00000000
36d4a86e 20000040 // 00000000 36d54722 20000040
0e4: 00000000 36d5430a 20000040 // 00000000 12137662 20000040 // 00000000
37d7c4d6 20000040 // 00000000 323aaaba 20000040
0e8: 00000000 36d7a92e 20000040 // 00000000 36d608ae 20000040 // 00000000
36d74d86 20000040 // 00000000 0a09e0be 20000040
0ec: 00000000 36d592ca 20000040 // 00000000 12137a7a 20000040 // 00000000
36d540fe 20000040 // 00000000 1213786e 20000040
0f0: 00000000 36d5596e 20000040 // 00000000 323aa8ae 20000040 // 00000000
1213724a 20000040 // 00000000 36d5086e 20000040
0f4: 00000000 0cb35c86 20000040 // 00000000 1df4a13e 20000040 // 00000000
36d7624a 20000040 // 00000000 36d5f456 20000040
0f8: 00000000 36d73d46 20000040 // 00000000 36d7330a 20000040 // 00000000
36d5bd46 20000040 // 00000000 11fbab7a 20000040
0fc: 00000000 1df4a96e 20000040 // 00000000 0a09ed06 20000040 // 00000000
0213cafa 20000040 // 00000000 3738ab3a 20000040
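
For anyone triaging a similar log, here is a minimal Python sketch that counts the warnings and watchdog timeouts per interface. The two patterns are copied from the messages quoted above; the function name `summarize` and the sample list are illustrative only and are not part of any Red Hat tooling.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this report.
ITER_RE = re.compile(r"(\w+): too many iterations \(\d+\) in nv_nic_irq\.")
TIMEOUT_RE = re.compile(r"NETDEV WATCHDOG: (\w+): transmit timed out")


def summarize(lines):
    """Count forcedeth warnings and watchdog timeouts per interface."""
    warnings = Counter()
    timeouts = Counter()
    for line in lines:
        m = ITER_RE.search(line)
        if m:
            warnings[m.group(1)] += 1
            continue
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group(1)] += 1
    return dict(warnings), dict(timeouts)


# Sample input: the first lines of "Actual results" above.
sample = [
    "eth0: too many iterations (6) in nv_nic_irq.",
    "eth0: too many iterations (6) in nv_nic_irq.",
    "eth0: too many iterations (6) in nv_nic_irq.",
    "eth0: too many iterations (6) in nv_nic_irq.",
    "eth0: too many iterations (6) in nv_nic_irq.",
    "NETDEV WATCHDOG: eth0: transmit timed out",
]
print(summarize(sample))
```

Because the patterns use `search` rather than `match`, the same function should also work on syslog lines that carry a timestamp and host prefix.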

Additional info:

Comment 1 Jeff Burke 2008-02-20 19:11:13 UTC
Created attachment 295441
Full log

Comment 2 Jeff Burke 2008-02-20 19:20:59 UTC
Reverse Engineered nForce ethernet driver
RHEL driver based on upstream driver version 0.60
Also includes additional upstream commits:
3ba4d093fe8a26f5f2da94411bf8732fa6e9da86 forcedeth: fix tx timeout
fcc5f2665c81e087fb95143325ed769a41128d50 forcedeth: fix nic poll
6fedae1f6e66ab5f169bf58064e23e015fc1307d forcedeth: fix checksum feature in mcp65
caf96469e8ab57170cc8ca9c59809132d38e529e forcedeth: disable msix
e0379a14fc80cb98978fa86989dab77b522a8106 forcedeth: fixed missing call in napi poll
a7475906bc496456ded9e4b062f94067fb93057a forcedeth: msi bugfix

Comment 3 Andy Gospodarek 2008-02-20 20:23:26 UTC
Can I have access to the machine?  Of the patches included in this forcedeth
update, this is the one that was designed to fix this problem upstream:

a7475906bc496456ded9e4b062f94067fb93057a forcedeth: msi bugfix

What's interesting is that on rhel5 it doesn't have the desired effect -- that
interrupts are correctly disabled when we hope they are.  I've been looking at
another interesting forcedeth problem that seems to be related to this, so I'd
like to see if this can be tested with pci=nomsi on the kernel command line. 
I'm guessing it's enabled right now.
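
A minimal sketch of how one might confirm both facts on the test machine: whether `pci=nomsi` is on the kernel command line, and whether the interface's interrupt is currently an MSI one. It only parses text in the usual layout of `/proc/cmdline` and `/proc/interrupts`. The helper names and the sample table are hypothetical, and the exact interrupt-type strings differ between kernel versions, so the check is a loose substring match on "MSI".

```python
def msi_disabled_on_cmdline(cmdline):
    """True if pci=nomsi appears as a separate kernel command line option."""
    return "pci=nomsi" in cmdline.split()


def uses_msi(interrupts_text, ifname):
    """True if a /proc/interrupts row naming ifname has an MSI interrupt type."""
    for line in interrupts_text.splitlines():
        # Shared IRQs list several devices separated by commas.
        tokens = [t.rstrip(",") for t in line.split()]
        if ifname in tokens and any("MSI" in t for t in tokens if t != ifname):
            return True
    return False


# Hypothetical sample; real counts and IRQ numbers will differ.
SAMPLE_INTERRUPTS = """\
           CPU0       CPU1
 58:     123456          0        PCI-MSI  eth0
 66:       1024          0  IO-APIC-level  ohci_hcd:usb1
"""

print(uses_msi(SAMPLE_INTERRUPTS, "eth0"))
print(msi_disabled_on_cmdline("ro root=/dev/VolGroup00/LogVol00 pci=nomsi"))
```

On a live system the inputs would be `open("/proc/interrupts").read()` and `open("/proc/cmdline").read()`.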


Comment 4 Andy Gospodarek 2008-02-21 03:34:52 UTC
As I suspected, the patch that was added in 2.6.18-53.1.3 for the forcedeth msi
bugfix seems to be giving us problems here.  NFS connectathon test results:

2.6.18-53.1.2 -- pass
2.6.18-53.1.3 -- fail
2.6.18-53.1.3, with pci=nomsi on kernel cmd line -- pass

I'm hoping I can do some work on the forcedeth driver to resolve this since I'm
a bit worried that trying to pull all the MSI fixes from the latest upstream
will be too much.

Comment 5 Andy Gospodarek 2008-02-22 21:08:40 UTC
I would strongly encourage us to NOT revert this patch from rhel5.  I would
rather see us apply a patch on top to resolve this issue.  Without this there
will be problems since we are not really enabling and disabling the correct
interrupts.  I would rather correct the issue than paper over a new problem by
removing the needed patch.

Comment 6 Andy Gospodarek 2008-02-29 20:22:14 UTC
I've started to notice what I feel are problems with enable_irq and disable_irq
calls in the forcedeth driver.  I recently patched the ethtool_set_settings
function because I determined that writing to the BMCR register while interrupts
were disabled resulted in no interrupts ever coming back out of the hardware. 
My guess is that upstream changes to interrupt handling, such as how pending
interrupts are dropped (or saved so they can be posted later), may be somewhat
related, but this is just a hunch based on what I've observed.
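
To illustrate why an interrupt source that is not really masked produces the repeated "too many iterations (6)" message, here is a simplified Python model of a bounded interrupt-service loop. It is a sketch, not driver code. The limit of 5 is an assumption based on my reading of the upstream driver's `max_interrupt_work` default, and `service_irq` is a hypothetical name.

```python
import itertools

MAX_INTERRUPT_WORK = 5  # assumption: default work limit in the driver


def service_irq(pending, max_work=MAX_INTERRUPT_WORK):
    """Model one interrupt handler invocation.

    pending: iterator yielding True while the NIC still reports events.
    Returns (iteration index at exit, whether the work limit was hit).
    """
    i = 0
    while True:
        if not next(pending, False):
            return i, False  # nothing left to do: normal exit
        # tx/rx completion handling would happen here
        if i > max_work:
            return i, True  # the driver would log "too many iterations (i)"
        i += 1


print(service_irq(iter([True, True, True])))  # events drain after three passes
print(service_irq(itertools.repeat(True)))    # events never stop
```

In this model the first call exits normally, while the second hits the limit at index 6, which matches the number in the logged message. If the hardware keeps raising events because the mask never takes effect, every handler invocation ends this way.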

Comment 7 Andy Gospodarek 2008-03-07 15:48:08 UTC
After a small patch to the MSI subsystem I can now run the NFS connectathon
tests on the same system used in the original test
(hp-xw9400-01.rhts.boston.redhat.com) and it appears that no tests failed. 
There isn't any great output indicating that, but I see nothing but 'PASS'
messages on the screen and none of the original messages:

eth0: too many iterations (6) in nv_nic_irq.

There were a few messages in test output like this:

Mar  7 09:23:29 hp-xw9400-01 kernel: nfs: server sol9-nfs not responding, still trying
Mar  7 09:23:29 hp-xw9400-01 kernel: nfs: server sol9-nfs OK
Mar  7 09:23:34 hp-xw9400-01 kernel: nfs: server sol9-nfs not responding, still trying
Mar  7 09:23:34 hp-xw9400-01 kernel: nfs: server sol9-nfs not responding, still trying
Mar  7 09:23:34 hp-xw9400-01 kernel: nfs: server sol9-nfs OK
Mar  7 09:23:34 hp-xw9400-01 kernel: nfs: server sol9-nfs OK
Mar  7 09:23:39 hp-xw9400-01 kernel: nfs: server sol9-nfs not responding, still trying
but I don't know if that was caused by the test or not.

I also looked at
/mnt/tests/kernel/filesystems/nfs/connectathon/cthon04/result.txt and it appears
to be zero length -- hopefully that's good.



Comment 8 Andy Gospodarek 2008-03-07 16:15:22 UTC
Oh yeah, test kernels available here:

http://people.redhat.com/agospoda/

Comment 9 Andy Gospodarek 2008-03-07 17:16:30 UTC
This patch (or something similar) is what I would like to consider for rhel5.2
(if possible).

http://people.redhat.com/agospoda/rhel5/irq-msi-upstream-fixes.patch

The problem I currently see is that this is only a few (5-6) of the patches
needed to make this work, whereas there are close to a dozen patches in the
original upstream set.  I can look over the changes, but am not an expert on
this, so I will probably need to get someone to at least keep me in check
(whether they are an expert or not).



Comment 11 RHEL Program Management 2008-07-14 22:04:11 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 13 Marcin Kowalski 2008-07-18 09:29:13 UTC
Created attachment 312109
RHEL5.2 Forcedeth Failure

Comment 14 Marcin Kowalski 2008-07-18 09:32:41 UTC
We have experienced a similar NIC crash with RHEL 5.2 running on the Nvidia chipset:

00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:09.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)

The full log details have been attached to this bugreport.

Has any progress been made on further testing and integrating the suggested
fixes in RHEL5.2?
Is it known whether RHEL5.0 was prone to the same bug?


Comment 15 Andy Gospodarek 2008-07-18 18:12:33 UTC
RHEL5.0 should not be affected, but later kernels will have problems.
Unfortunately, a patch that fixed problems all users would have hit on 5.1
introduced new problems for a small set of users on 5.2.

The root of the 5.2 issues is some MSI problems in 2.6.18 that were fixed in
2.6.19 and later.  Those patches will soon be added to my test kernels and will
appear in the kernel version:

2.6.18-94.el5.gtest.50

that will appear here:

http://people.redhat.com/agospoda/#rhel5

later today.


Comment 27 Andy Gospodarek 2008-09-10 16:48:46 UTC

*** This bug has been marked as a duplicate of bug 428696 ***

