I have four machines that are experiencing intermittent network card/driver lockups. The machines are dual 600 MHz P3s. lsmod shows that the eepro100.o driver is loaded. When a lockup occurs, the machine cannot be pinged. I was able to confirm that running ifconfig eth0 down followed by ifconfig eth0 up resets the card/driver and restores connectivity. The driver logs the following for the card:

eth0: OEM i82557/i82558 10/100 Ethernet at 0x9400, 00:D0:B7:20:37:BB, IRQ 19.
  Receiver lock-up bug exists -- enabling work-around.
  Board assembly 733470-004, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
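Until the root cause is found, the down/up workaround could be automated. What follows is only a rough sketch, not a fix: it assumes Python 3 is available, that ping accepts the Linux iputils options -c and -w, and that it runs as root. The function names, the 60-second interval, and the ping target are all made up for illustration. It simply repeats the ifconfig down/up sequence described above whenever a single ping goes unanswered.

```python
import subprocess
import time


def link_is_up(target, run=subprocess.call):
    """Return True if a single ping to target is answered."""
    # -c 1: send one packet; -w 5: give up after 5 seconds
    # (options as in Linux iputils ping; an assumption here).
    status = run(["ping", "-c", "1", "-w", "5", target],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return status == 0


def bounce(iface, run=subprocess.call):
    """Take the interface down and bring it back up."""
    run(["ifconfig", iface, "down"])
    run(["ifconfig", iface, "up"])


def check_once(iface, target, run=subprocess.call):
    """Bounce iface if target does not answer; return True if bounced."""
    if link_is_up(target, run):
        return False
    bounce(iface, run)
    return True


def watch(iface, target, interval=60):
    """Check the link every `interval` seconds, forever."""
    while True:
        if check_once(iface, target):
            # Timestamped record so lockups can be matched against other logs.
            print(time.ctime(), "no reply from", target, "- bounced", iface)
        time.sleep(interval)
```

Calling watch("eth0", "192.0.2.1") with the address of a nearby host that always answers pings (the address shown is a placeholder) would run the loop. The timestamps it prints may also help show how often the lockups happen. If the default route does not come back after the interface is brought up, it would have to be re-added as well.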
More info: I found out that the actual chipset on the cards is in fact the Intel 82559. The driver reports it as i82557/i82558, so it seems to misidentify this chipset and possibly mishandles it as a result. For now I have replaced 6 NICs with 3C905B cards, and things seem to be working well. It looks like the eepro100.c driver may need a tweak.
The lockups have continued even with the 3Com 3C905C network cards. We now suspect an SMP problem and have booted the non-SMP kernel on one machine to test.
OK, so now we know that SMP is NOT the problem. We continue to have lockups with both the SMP and the non-SMP kernel.
Did the lockups stop with the newer kernels in 6.2?