From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.6) Gecko/20040116

Description of problem:
I have a couple of Dell PowerEdge servers with dual Broadcom BCM5704 gigabit network cards in them running Fedora Core 2. On both machines, if I wire the card to a particular switch and use the standard "tg3" driver, traffic intermittently stalls (for anything between 10 seconds and 10 minutes) and then resumes, and "ifconfig" reports lots of receive errors (no transmit errors though). If I wire the cards to a different switch, there are no network errors! A different card (an Intel one) in the same machines works fine with any switch it's plugged into, suggesting some interaction between the tg3 driver and certain brands of switch.

Version-Release number of selected component (if applicable):
kernel-2.6.5-1.358smp

How reproducible:
Always

Steps to Reproduce:
1. Install a Broadcom NetXtreme BCM5704 gigabit card.
2. Make sure it uses the FC2 "tg3" driver (/etc/modprobe.conf).
3. Plug the network connection into a switch (brand will matter - more info to follow).
4. Get a large file (hundreds of MB) from another machine onto the machine with the Broadcom card.
5. Check "ifconfig <device>" (e.g. eth0) to see if there are any errors.

Actual Results:
"ifconfig <device>" reported a lot of errors (inc. frame problems):

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3101000 errors:377040 dropped:0 overruns:0 frame:5231
TX packets:3946056 errors:0 dropped:0 overruns:0 carrier:0

Note the lack of transmit errors as well. However, an identical card attached to a different switch works fine (as do other cards - e.g. using the "e1000" driver - attached to any of the switches I've tried). My guess is that the combination of the tg3 driver and particular brands of switch just doesn't talk correctly.

Expected Results:
It should report no errors from "ifconfig <device>".
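For anyone comparing switches, the counters quoted in the report can be reduced to a single figure per direction. The following is only a minimal sketch: it assumes the old net-tools "ifconfig" layout shown in the report, and the names error_rates, COUNTER_LINE and SAMPLE are made up for illustration. The figure is simply the errors counter divided by the packets counter, not an official metric.

```python
import re

# Matches the RX/TX counter lines in the old net-tools ifconfig layout,
# e.g. "RX packets:3101000 errors:377040 dropped:0 overruns:0 frame:5231".
COUNTER_LINE = re.compile(r"\b(RX|TX) packets:(\d+) errors:(\d+)")


def error_rates(ifconfig_text):
    """Return a dict mapping "RX"/"TX" to errors divided by packets."""
    rates = {}
    for direction, packets, errors in COUNTER_LINE.findall(ifconfig_text):
        packets, errors = int(packets), int(errors)
        # Avoid dividing by zero on an interface that has seen no traffic.
        rates[direction] = errors / packets if packets else 0.0
    return rates


# Counter lines copied from the report (hypothetical variable name).
SAMPLE = """\
RX packets:3101000 errors:377040 dropped:0 overruns:0 frame:5231
TX packets:3946056 errors:0 dropped:0 overruns:0 carrier:0
"""

if __name__ == "__main__":
    for direction, rate in error_rates(SAMPLE).items():
        print(f"{direction} error rate: {rate:.2%}")
```

Running the same check before and after moving the cable to another switch gives a like-for-like comparison, provided a similar amount of traffic is transferred each time.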
Additional info:
I've been trying all sorts of params to "ethtool" to see if anything helps, and nothing does. My only saving grace here is that there's a third card in the two machines ("Intel Corp. 82545GM Gigabit Ethernet Controller") which works perfectly (with the "e1000" driver this time) on the same switches that I had trouble with when using the "tg3" driver. I'll see if I can dig out some info on the brand of switch that's causing the problem and add it to this bug report (I'm not at work at the moment, so I can't post that up). It's bad enough that I don't think I can recommend Broadcom cards to anyone using FC2...
Just a note that the BCM5704 card has no errors when attached to an Edimax switch (but wrongly auto-negotiates with it - I've added a comment about this to another bug that was closed, but which may need to be reopened). The problems I reported seem to occur only between that card and a 3Com switch, so we're now using the Intel card I mentioned with the 3Com switch instead, and that works with no errors.
I get a similar problem on a Dell PowerEdge 2550 (933MHz PIII w 1.5GB RAM). With a bunch of network activity, the network drops out after 5-10 seconds on the gigabit ethernet port. With little to no activity, it still happens, but takes 10-20 minutes. `ifdown eth1 ; ifup eth1` works to bring the network back up for a short while.

The ethernet chipset that exhibits the problem is "Broadcom Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 10)" from lspci. The tg3 module is loaded. The ethernet port is plugged into a gigabit switch (3Com) and negotiates at 1Gb.

Let me know if I should provide more information.

-Ryan

dmesg reports:

irq 11: nobody cared! (screaming interrupt?)
Call Trace:
 [<021070c9>] __report_bad_irq+0x2b/0x67
 [<02107161>] note_interrupt+0x43/0x66
 [<02107327>] do_IRQ+0x109/0x169
 [<0223007b>] sock_ioctl+0x13e/0x280
 [<0211af64>] __do_softirq+0x2c/0x73
 [<021078f5>] do_softirq+0x46/0x4d
 =======================
 [<0210737b>] do_IRQ+0x15d/0x169
 [<0210403b>] default_idle+0x23/0x26
 [<0210408c>] cpu_idle+0x1f/0x34
 [<02318612>] start_kernel+0x174/0x176

handlers:
[<62c7da6a>] (tg3_interrupt+0x0/0xe8 [tg3])
Disabling IRQ #11
tg3: tg3_stop_block timed out, ofs=3400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=1400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=c00 enable_bit=2
tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
eth1: no IPv6 routers present
irq 11: nobody cared! (screaming interrupt?)
[repeating ....]
Another solution I've found (if you've got only BCM57XX cards and no alternative) is to switch to the BCM5700 "official" driver from http://www.broadcom.com/drivers/downloaddrivers.php but remember that you'll need the kernel source and gcc toolchain installed before you try to build that driver from source on FC2.

Other things you'll need to know about the BCM5700 driver (version 7.1.22):

* You need to change line 1763 of src/b57um.c to read:
    dev->name, smp_processor_id());
  [i.e. change hard_smp_processor_id() to smp_processor_id()]
* "make install" the driver so that it's installed in the right place and a "depmod -a" command is correctly issued.
* Edit /etc/modprobe.conf to change tg3 references to bcm5700.
* At this point, the easiest thing to do is to reboot to pick up the new driver, but you might be able to pick it up with something like "init 1" followed by "init 3", but no guarantees (cos I haven't tried that - I chickened out and rebooted). A simple "/etc/init.d/network restart" may not be good enough, BTW.

I've done this on a couple of PowerEdges and the errors with the tg3 driver went away when the bcm5700 driver was used and things are looking good now. You should note that (now correctly closed) bug #124857 has more info about why you have to edit the driver source - see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=124857
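The /etc/modprobe.conf edit mentioned above could be scripted roughly as follows if you have several machines to convert. This is just a sketch, not something taken from the Broadcom instructions: switch_driver and SAMPLE_CONF are made-up names, the sample "alias" lines are an assumed example of what the file might contain, and the function only transforms text (it never writes to /etc), so check the result before putting it in place.

```python
import re


def switch_driver(conf_text, old="tg3", new="bcm5700"):
    """Replace whole-word uses of the old module name on non-comment lines."""
    pattern = re.compile(r"\b%s\b" % re.escape(old))
    out = []
    for line in conf_text.splitlines(keepends=True):
        if line.lstrip().startswith("#"):
            # Leave comment lines exactly as they are.
            out.append(line)
        else:
            out.append(pattern.sub(new, line))
    return "".join(out)


# Hypothetical modprobe.conf contents, for illustration only.
SAMPLE_CONF = "alias eth0 tg3\nalias eth1 tg3\nalias eth2 e1000\n"

if __name__ == "__main__":
    print(switch_driver(SAMPLE_CONF), end="")
```

Matching on whole words means an unrelated module whose name merely contains "tg3" is left alone, and the e1000 line for the Intel card is untouched.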
Fedora Core 2 has now reached end of life, and no further updates will be provided by Red Hat. The Fedora legacy project will be producing further kernel updates for security problems only. If this bug has not been fixed in the latest Fedora Core 2 update kernel, please try to reproduce it under Fedora Core 3, and reopen if necessary, changing the product version accordingly. Thank you.