After an upgrade to kernel-2.6.19-1.2288.fc5 the eth0 interface sometimes stops responding. I am not sure whether this is a kernel problem or not.

The computer has two active ethernet interfaces: eth0 and eth1. After some time (20 min to 4 hours) eth0 stops responding. Running "ifdown eth0 ; ifup eth0" fixes this. There are no error messages in dmesg or /var/log/messages.

# lspci
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a3)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:06.0 PCI bridge: nVidia Corporation Unknown device 0370 (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:09.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:0a.0 PCI bridge: nVidia Corporation Unknown device 0376 (rev a3)
00:0b.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a3)
00:0c.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a3)
00:0d.0 PCI bridge: nVidia Corporation Unknown device 0378 (rev a3)
00:0e.0 PCI bridge: nVidia Corporation Unknown device 0375 (rev a3)
00:0f.0 PCI bridge: nVidia Corporation Unknown device 0377 (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
03:00.0 VGA compatible controller: Matrox Graphics, Inc. Unknown device 0522 (rev 02)
04:00.0 PCI bridge: NEC Corporation Unknown device 0125 (rev 06)
04:00.1 PCI bridge: NEC Corporation Unknown device 0125 (rev 06)
05:01.0 RAID bus controller: 3ware Inc 9550SX SATA-RAID
80:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
80:01.0 RAM memory: nVidia Corporation MCP55 LPC Bridge (rev a3)
80:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a3)
80:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
80:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
80:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
80:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
80:09.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
80:0a.0 PCI bridge: nVidia Corporation Unknown device 0376 (rev a3)
80:0b.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a3)
80:0c.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a3)
80:0d.0 PCI bridge: nVidia Corporation Unknown device 0378 (rev a3)
80:0e.0 PCI bridge: nVidia Corporation Unknown device 0375 (rev a3)
80:0f.0 PCI bridge: nVidia Corporation Unknown device 0377 (rev a3)

# uname -a
Linux creditcardserver 2.6.19-1.2288.fc5 #1 Sat Feb 10 14:52:17 EST 2007 i686 athlon i386 GNU/Linux
Created attachment 148290 [details] dmesg
eth0 stopped responding again. Downgraded to kernel-2.6.18-1.2257.fc5.i686.rpm. This is a mission-critical server. Also see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=221951
# lsmod
Module                    Size  Used by
ipv6                    245985  28
autofs4                  21573  1
ip_conntrack_netbios_ns   3393  0
ipt_REJECT                5697  1
xt_tcpudp                 3521  7
xt_state                  2625  8
ip_conntrack             52085  2 ip_conntrack_netbios_ns,xt_state
nfnetlink                 7513  1 ip_conntrack
iptable_filter            3393  1
ip_tables                13065  1 iptable_filter
x_tables                 14405  4 ipt_REJECT,xt_tcpudp,xt_state,ip_tables
cpufreq_ondemand          6577  1
dm_mirror                29073  0
dm_mod                   57433  1 dm_mirror
video                    17221  0
sbs                      16257  0
i2c_ec                    5569  1 sbs
container                 4801  0
button                    7249  0
battery                  10565  0
asus_acpi                16857  0
ac                        5701  0
lp                       13065  0
parport_pc               27493  0
parport                  37001  2 lp,parport_pc
ehci_hcd                 31693  0
ohci_hcd                 21341  0
serio_raw                 7493  0
sg                       34653  0
ide_cd                   38625  2
cdrom                    34913  1 ide_cd
pcspkr                    3521  0
i2c_nforce2               7617  0
k8_edac                  14209  0
forcedeth                42949  0
edac_mc                  23369  1 k8_edac
sata_nv                  11845  0
i2c_core                 21697  2 i2c_ec,i2c_nforce2
libata                   99161  1 sata_nv
3w_9xxx                  32709  10
sd_mod                   20929  53
scsi_mod                134121  4 sg,libata,3w_9xxx,sd_mod
By the way, when eth0 locks up I can still ping it locally: "ping ip.of.eth0" works from the local computer (the one with IP ip.of.eth0). At the same time, "ping ip.of.eth0" does not work from another computer on the same subnet. Running "ifdown eth0; ifup eth0" fixes this and eth0 becomes visible from the subnet again.
People who are having this problem: try adding pci=nomsi to the kernel command line. Edit /etc/grub.conf so the line for kernel 2288 looks similar to the one below, then reboot.

kernel /vmlinuz-2.6.19-1.2288.fc5 ro root=LABEL=/ pci=nomsi rhgb quiet
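
After rebooting, it may be worth confirming that the option actually reached the running kernel. The following is a minimal sketch, not part of the original suggestion: it assumes the running kernel's command line can be read from /proc/cmdline, and the helper name pci_option_set is made up for illustration. It also accepts the comma-separated form (for example pci=nomsi,noacpi).

```python
from pathlib import Path


def pci_option_set(cmdline: str, option: str = "nomsi") -> bool:
    """Return True if `option` is listed in any pci= argument of a kernel command line.

    Hypothetical helper for illustration; handles both "pci=nomsi" and
    comma-separated forms such as "pci=nomsi,noacpi".
    """
    for token in cmdline.split():
        if token.startswith("pci="):
            if option in token[len("pci="):].split(","):
                return True
    return False


if __name__ == "__main__":
    # Assumption: /proc/cmdline holds the command line of the running kernel.
    running = Path("/proc/cmdline").read_text().strip()
    if pci_option_set(running):
        print("pci=nomsi is active on the running kernel")
    else:
        print("pci=nomsi is NOT on the running kernel command line")
```

If the second message is printed after a reboot, the edited grub.conf entry is probably not the one that was booted.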
I will try this next weekend with 2.6.19; I cannot do any tests during business days. Also, was MSI enabled by default in 2.6.19-1.2288?
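
One way to see whether a given interface is actually using MSI on a running system is to look at its line in /proc/interrupts. The sketch below is only an illustration under assumptions: it assumes MSI interrupts are labelled with a chip name containing "MSI" (such as PCI-MSI), the helper name uses_msi is made up, and the sample text is invented for the example rather than taken from the affected server.

```python
from typing import Optional


def uses_msi(interrupts_text: str, device: str) -> Optional[bool]:
    """Report whether `device` is listed on an MSI interrupt line.

    Returns True if any line naming the device mentions MSI, False if the
    device is listed only on non-MSI lines, and None if it is not listed.
    Hypothetical helper; assumes the /proc/interrupts layout of
    "IRQ: counts... chip-name device[, device]".
    """
    found = None
    for line in interrupts_text.splitlines():
        fields = line.replace(",", " ").split()
        if device in fields[1:]:
            if any("MSI" in field for field in fields[1:]):
                return True
            found = False
    return found


# Invented sample in the style of /proc/interrupts (NOT from the reporter's machine).
SAMPLE = """\
           CPU0       CPU1
 16:      12345          0   IO-APIC-fasteoi   eth1
218:      67890          0   PCI-MSI-edge      eth0
"""

if __name__ == "__main__":
    with open("/proc/interrupts") as handle:
        print("eth0 uses MSI:", uses_msi(handle.read(), "eth0"))
```

Comparing the result with and without pci=nomsi on the kernel command line would show whether the option changed how the interface's interrupt is delivered.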
I was experiencing this problem to a maddening degree on an Asus M2N-E mainboard with the nVidia nForce 570 Ultra MCP chipset (MCP55, forcedeth driver). The eth0 drops were unpredictable but frequent, often requiring three or more down/up cycles of either ifdown/ifup or 'service network restart' to transfer a CD .ISO, for example. No errors were logged in these instances; the interface simply stopped working. Worse, frequently when eth0 stopped working, *the entire LAN subnet* off that switch also stopped working, in a fashion which appeared similar to a chattering NIC in the network. Currently, with the 2.6.19-1.2911.fc6 kernel, and using Chuck Ebbert's suggestion to add 'pci=nomsi' to the grub.conf kernel command line, the interface has not stopped or exhibited any other nonworking behavior for a full day now, including more than ten hours of actual use. Thanks to Mr. Ebbert for what appears to be a reliable workaround.
*** This bug has been marked as a duplicate of 222556 ***
Confirmed. The pci=nomsi option fixes this bug 229111