Bug 98767

Summary: (NET 3C59X) TCP Transmit errors on 3C905TX cards
Product: [Retired] Red Hat Raw Hide
Reporter: Dan Egli <dan>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Version: 1.0
CC: alexl, barryn, davej, dgenn, edwinh, eep2, emmanuel, hps, hugh_caley, iainr, jim, jlcthibo, kevymac, misek, mkanat, nbryant, pavelr, r.pallucchini, tmokros, yaoz
Hardware: i586
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-10-30 03:39:03 UTC
Bug Blocks: 100643

Description Dan Egli 2003-07-08 17:40:36 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.4) Gecko/20030624

Description of problem:
Just upgraded two different machines from RedHat Linux to RedHat RawHide. One
was running RH 8.0, the other 9. RH8 box, we will call Gateway. RedHat 9 we will
call BreakMe.

Gateway hardware configuration:
AMD Thunderbird 1.33GHz CPU
512MB Physical Ram. 
2x swap partitions, total of 2GB of Swap
2 NICS, Belkin FastEthernet Express (8139too driver)
        3Com 905TX (yes, 905A).
Asus A7V133 MBD

BreakMe hardware configuration:
AMD K6-300 MHz
192MB Ram
Tyan Trinity MBD
1 NIC: 3COM 905TX.

On EACH box, the 3Com cards worked fine until the reboot after
installing RawHide. The following syslog extract is from Gateway after that reboot:

Jul  7 16:12:21 shortcircuit kernel: 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
Jul  7 16:12:21 shortcircuit kernel: See Documentation/networking/vortex.txt
Jul  7 16:12:21 shortcircuit kernel: 00:09.0: 3Com PCI 3c905 Boomerang 100baseTx at 0xa400. Vers LK1.1.18-ac
Jul  7 16:12:21 shortcircuit kernel:  00:60:97:c5:70:8b, IRQ 15
Jul  7 16:12:21 shortcircuit kernel:   product code 4848 rev 00.0 date 04-11-97
Jul  7 16:12:21 shortcircuit kernel:   64K word-wide RAM 1:1 Rx:Tx split, autoselect/10baseT interface.
Jul  7 16:12:21 shortcircuit kernel:   Enabling bus-master transmits and whole-frame receives.
Jul  7 16:12:21 shortcircuit kernel: 00:09.0: scatter/gather enabled. h/w checksums disabled
Jul  7 16:12:21 shortcircuit kernel: eth0: Transmit error, Tx status register d0.
Jul  7 16:12:21 shortcircuit kernel:   Flags; bus-master 1, dirty 1(1) current 1(1)
Jul  7 16:12:21 shortcircuit kernel:   Transmit list 00000000 vs. deeec240.
Jul  7 16:12:21 shortcircuit kernel:   0: @deeec200  length 8000002a status 8000002a
Jul  7 16:12:21 shortcircuit kernel:   1: @deeec240  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   2: @deeec280  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   3: @deeec2c0  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   4: @deeec300  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   5: @deeec340  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   6: @deeec380  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   7: @deeec3c0  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   8: @deeec400  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   9: @deeec440  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   10: @deeec480  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   11: @deeec4c0  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   12: @deeec500  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   13: @deeec540  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   14: @deeec580  length 00000000 status 00000000
Jul  7 16:12:21 shortcircuit kernel:   15: @deeec5c0  length 00000000 status 00000000

ifconfig shows 0 transmitted packets, and several error TX packets.
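
For anyone trying to read the "Tx status register" value in the log, here is a minimal Python sketch that splits it into individual bits. The bit names and the 0x30 reset mask are assumptions based on a reading of the 3c59x driver source and 3Com's register descriptions; nothing in this report states them, so treat the table as illustrative only. The values d0 and 82 are the two that appear in logs in this thread.

```python
# Assumed meaning of each bit in the 8-bit Tx status register.
# These names are NOT taken from this bug report; verify against the driver.
TX_STATUS_BITS = {
    0x02: "reclaim error",
    0x04: "status overflow",
    0x08: "max collisions",
    0x10: "underrun",
    0x20: "jabber",
    0x40: "interrupt requested",
    0x80: "complete",
}


def decode_tx_status(value):
    """Return the (assumed) names of the bits set in a Tx status value."""
    return [name for mask, name in sorted(TX_STATUS_BITS.items()) if value & mask]


def needs_tx_reset(value):
    """Assumed driver behaviour: jabber or underrun (mask 0x30) forces a Tx reset."""
    return bool(value & 0x30)


if __name__ == "__main__":
    for raw in ("d0", "82"):
        value = int(raw, 16)
        action = "reset transmitter" if needs_tx_reset(value) else "re-enable transmitter"
        print(raw, decode_tx_status(value), action)
```

Under these assumptions, d0 includes the underrun bit while 82 does not, which would be consistent with the driver printing its "duplex mismatch" hint only for 82.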


It is NOT kernel related because I have tried three different kernels, including
the stock RedHat 8.0 kernel. Same result. Replacing the 3com card with an e100
card (Intel In-Business network adapter) made errors go away on Gateway. After
this writing I'm going to replace the 3com card with a NetGear (Tulip chipset)
FA310TX card I have and I suspect it will make the errors on breakme go away.




Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install RedHat Linux (any ver) on a machine with a 3C905TX NIC
2. Upgrade to RedHat RawHide 1.0
3. Reboot
    

Actual Results:  The 3Com card does not initialize properly and logs the errors shown above.

Expected Results:  System should boot normally. 

Additional info:

I was thinking perhaps the 3com card had gone bad, until the exact same results
occurred on a separate machine after installing RawHide. I cannot believe two
different NICs in two different machines would go bad at the same time, and not
until a reboot. The odds are extremely against that happening.

Comment 1 Bill Nottingham 2003-08-01 02:01:05 UTC
*** Bug 101427 has been marked as a duplicate of this bug. ***

Comment 2 Dan Egli 2003-08-03 19:21:32 UTC
Now I'm confused. I specifically stated that this should NOT be a kernel bug
since three different kernels (Two Stock Kernels from distributions, one
downloaded from kernel.org and compiled) all do the exact same thing. So given
that fact, perhaps someone can explain why this was then re-flagged as a kernel bug?

Comment 3 Christian Thibodeau 2003-10-01 22:00:23 UTC
This problem occurs as well on a clean install of Fedora Core test2.

Comment 4 Iain Rae 2003-10-04 13:28:25 UTC
I've seen the same thing with two 3c905 cards on an old PII that I use for testing.
After swapping in a 3c905b everything seems to be OK, which seems a bit odd.

Comment 5 Mike Levine 2003-10-04 15:35:26 UTC
I swap between two HDs, one with RH9 and the other with RH Rawhide.
The 3c905 works on RH9 but not on Rawhide.
I, too, am using an AMD.
Athlon XP 1700+
1 GB of RAM

Oct  4 01:21:37  kernel: 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
Oct  4 01:21:37  kernel: See Documentation/networking/vortex.txt
Oct  4 01:21:37  kernel: 00:09.0: 3Com PCI 3c905 Boomerang 100baseTx at 0xd000. Vers LK1.1.18-ac
Oct  4 01:21:37  kernel:  00:60:08:19:c5:a4, IRQ 10
Oct  4 01:21:37  kernel:   product code 4b4b rev 00.0 date 06-13-97
Oct  4 01:21:37  kernel:   64K word-wide RAM 1:1 Rx:Tx split, autoselect/10baseT interface.
Oct  4 01:21:37  kernel:   Enabling bus-master transmits and whole-frame receives.
Oct  4 01:21:37  kernel: 00:09.0: scatter/gather enabled. h/w checksums disabled
Oct  4 01:21:37  kernel: eth0: Dropping NETIF_F_SG since no checksum feature.
Oct  4 01:21:37  kernel: eth0: Transmit error, Tx status register d0.
Oct  4 01:21:37  kernel:   Flags; bus-master 1, dirty 1(1) current 1(1)
Oct  4 01:21:37  kernel:   Transmit list 00000000 vs. f2916240.
Oct  4 01:21:37  kernel:   0: @f2916200  length 8000002a status 8000002a
Oct  4 01:21:37  kernel:   1: @f2916240  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   2: @f2916280  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   3: @f29162c0  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   4: @f2916300  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   5: @f2916340  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   6: @f2916380  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   7: @f29163c0  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   8: @f2916400  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   9: @f2916440  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   10: @f2916480  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   11: @f29164c0  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   12: @f2916500  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   13: @f2916540  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   14: @f2916580  length 00000000 status 00000000
Oct  4 01:21:37  kernel:   15: @f29165c0  length 00000000 status 00000000
Oct  4 01:21:39  network: Bringing up interface eth0:  succeeded
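
The descriptor dump above is easier to compare across machines once it is parsed. The following is a rough Python sketch, not part of any Red Hat tool, that pulls the index, address, length and status words out of lines in this format. The function names are made up for illustration, and the idea that the low 16 bits of the length word hold the byte count (with the top bit acting as a flag) is an assumption about the driver's descriptor layout.

```python
import re

# Matches descriptor lines such as
#   "0: @f2916200  length 8000002a status 8000002a"
# \s+ also tolerates a line wrap between "status" and its value.
ENTRY_RE = re.compile(
    r"(\d+): @([0-9a-f]+)\s+length\s+([0-9a-f]+)\s+status\s+([0-9a-f]+)"
)


def parse_tx_ring(log_text):
    """Return a list of (index, address, length, status) tuples as integers."""
    entries = []
    for match in ENTRY_RE.finditer(log_text):
        index = int(match.group(1))
        address, length, status = (int(g, 16) for g in match.groups()[1:])
        entries.append((index, address, length, status))
    return entries


def queued(entries):
    """Entries whose length word is non-zero, i.e. something was queued."""
    return [entry for entry in entries if entry[2] != 0]


# Two lines copied from the log above.
SAMPLE = """\
Oct  4 01:21:37  kernel:   0: @f2916200  length 8000002a status 8000002a
Oct  4 01:21:37  kernel:   1: @f2916240  length 00000000 status 00000000
"""

if __name__ == "__main__":
    for index, address, length, status in queued(parse_tx_ring(SAMPLE)):
        # Assumption: low 16 bits are the byte count.
        print(index, hex(address), length & 0xFFFF, hex(status))
```

In both dumps posted so far only descriptor 0 is populated, which fits the reports that the very first transmit after boot is the one that fails.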


Comment 6 Pavel 2003-10-08 16:21:40 UTC
Same here with Fedora Core test2 + rawhide. Bug doesn't happen with
2.6.0-test6.1.49 kernel.

Comment 7 Kevin 2003-10-08 23:33:35 UTC
Can you please try using mii-tool to force the link mode instead of using
autonegotiation, and see if that helps at all? I had a similar problem and this
fixed it.

Comment 8 Alexander Larsson 2003-10-09 08:23:55 UTC
Any chance this is related to bug 98832? Does removing kudzu from the boot fix
anything?


Comment 9 Barry K. Nathan 2003-10-12 21:52:13 UTC
Dan: All of the kernels you compared are too similar to say for sure that this
isn't a kernel bug.

Everyone: Is anyone experiencing this problem *without* rhgb? If you're using
rhgb, try booting with "nogui" and see if the problem goes away.

Comment 10 Dan Egli 2003-10-12 22:30:28 UTC
I have no idea what rhgb is, but the results of booting BREAKME with kudzu not
loading and with the nogui kernel parameter are as follows:

kernel command line: linux
kudzu status: on
result: same errors

kernel command line: linux
kudzu status: off
result: seems fine

kernel command line: linux nogui
kudzu status: on
result: same errors

kernel command line: linux nogui
kudzu status: off
result: seems fine

So it appears that, for whatever reason, kudzu is the culprit here.
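
The four boots above form a small two-factor test, and the conclusion can be checked mechanically. This is only a sketch of that reasoning in Python; the data is transcribed from the results listed in this comment ("same errors" recorded as not working, "seems fine" as working), and the helper name is made up.

```python
# (nogui passed on kernel command line, kudzu enabled) -> NIC worked
runs = {
    (False, True): False,   # linux,       kudzu on  -> same errors
    (False, False): True,   # linux,       kudzu off -> seems fine
    (True, True): False,    # linux nogui, kudzu on  -> same errors
    (True, False): True,    # linux nogui, kudzu off -> seems fine
}


def factor_decides(runs, position):
    """True if the outcome is fully determined by the factor at `position`."""
    outcome_by_level = {}
    for factors, worked in runs.items():
        level = factors[position]
        # Remember the first outcome seen for this level; any later
        # disagreement means this factor alone does not decide the result.
        if outcome_by_level.setdefault(level, worked) != worked:
            return False
    # The factor must also actually change the outcome between its levels.
    return len(set(outcome_by_level.values())) > 1


if __name__ == "__main__":
    print("nogui decides:", factor_decides(runs, 0))
    print("kudzu decides:", factor_decides(runs, 1))
```

On this one machine the kudzu setting alone predicts the outcome; it says nothing about other hardware, where nogui may matter.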

Comment 11 Barry K. Nathan 2003-10-13 00:44:45 UTC
rhgb = Red Hat Graphical Boot. It's what you get by default if you do not
specify "nogui".

In my case, disabling kudzu doesn't make a difference, but "nogui" does...

Comment 12 Pavel 2003-10-13 08:28:24 UTC
I see it with nogui boot. Also, I don't start the interface on boot, but start
it manually later instead.
P.S. 2.6.0 kernels do not have this problem, so it probably IS a kernel bug.

Comment 13 Erik Petersen 2003-10-20 03:00:30 UTC
Same problem with Fedora 0.95 (test 3) on 3c905-TX

nogui had no effect but turning kudzu off fixes the problem

Comment 14 Bill Nottingham 2003-10-21 20:11:56 UTC
*** Bug 100470 has been marked as a duplicate of this bug. ***

Comment 15 Bill Nottingham 2003-10-21 20:24:53 UTC
*** Bug 105684 has been marked as a duplicate of this bug. ***

Comment 16 Max Kanat-Alexander 2003-10-29 22:09:04 UTC
Donald Becker, the maintainer of the 3c905 (3c59x) driver, has a mailing list
for it. The archives are over at:

http://www.scyld.com/pipermail/vortex/

It might be good to search the archives and try out some of the various fixes
for problems other people have had. The 3c905TX is mentioned pretty frequently.
I would have done more research myself, but I don't have the card to test the
solutions on.

Somebody with the card could also email the list and Donald would probably be
able to figure out what's going on.

-M

Comment 17 Iain Rae 2003-11-03 02:36:55 UTC
OK, Yesterday I reinstalled the PC with one of the 905's installed
using the current version of fedora-test, the initial install went
fine until the first reboot. 

The PC hung at the detecting new hardware message and I  hit
ctrl-alt-del and then the reset button when it was still sitting there
20 minutes later.

next time round it managed to get up to running X but the ethernet
card wasn't seeing the network and I was getting error messages
similar to above. I swapped the 905 for the 905b, kudzu detected the
change and fedora came up with a working ethernet card.

I then installed all the updates, shut the machine down again and
swapped the 905 back in. Graphical boot has disappeared, kudzu again
detected the change and we don't have a working ethernet card.

The machine also has redhat 9 so reboot the PC into rh9, same problem,
card can't see the network and error messages as above.

shut the machine down (and power off) then boot into rh9, card works
fine, reboot into fedora: dead card.

try running mii-tool --force=100baseTx-FD, doesn't work, try
100baseTx-HD still doesn't work, just running mii-tool reports that
there is
"No MII transceiver present!"

move /etc/rc.d/init.d/kudzu out of the way and reboot.


working network card.

Run /usr/sbin/kudzu and we still have a working network card,
re-enable /etc/rc.d/init.d/kudzu reboot and dead network card again.

So it seems that I have a working 905 if kudzu isn't run during
startup/shutdown.


If there's any other info I can generate to shed light on this feel
free to ask.

Comment 18 Yao Zhang 2003-11-03 21:00:02 UTC
I can confirm Iain Rae's finding.

I've filed a similar bug#105684 which is marked as duplicate of this
one.  I just tested like this:

1.  Boot to 2.6.0-0.test9.1.67, the network works fine.  Then
    /sbin/chkconfig --level 5 kudzu off

2.  Reboot to 2.4.22-1.2115.nptl.  The network still works.

What I can tell is that under 2.6, when kudzu is on, there is no
problem with the network.  But under Fedora Core's 2.4.22, when
kudzu is on, the 3C90x is not working.  After turning off kudzu under
Fedora Core's 2.4.22, the NIC works fine.

Comment 19 Max Kanat-Alexander 2003-11-03 21:58:11 UTC
alexl --

Bug 98832 gives me a "You are not authorized to access bug #98832". It
seems that it is related, since all reporters say that disabling kudzu
un-breaks it. If it's being worked on, and its resolution would
resolve this, could you tell us? :-)

I added you to the CC list so that you would get this message. Feel
free to remove yourself if you want.

-M

Comment 20 Dan Gennidakis 2003-11-04 04:15:40 UTC
I can confirm Iain Rae's finding. On my Asus P2B mainboard when Kudzu
is turned off for the boot runlevel my 3c905 initializes ok. Shutting
down Kudzu post boot does not allow the network card to ifup properly.
The only solution is to have Kudzu off at the runlevel boot. I have
had the same problem since test 1 and I am using test 3 now. Never had
problems with redhat 7 to 9.

Comment 21 Alexander Larsson 2003-11-04 08:17:42 UTC
I made bug 98832 readable. It doesn't contain much info though.

Comment 22 Max Kanat-Alexander 2003-11-04 08:36:19 UTC
For consistency, this bug should be in kudzu and notting should be the
assignee, since it's clearly a kudzu issue at this point. Also,
shouldn't platform be "all" -- it's been reported on an athlon as well
as an i586 (and a zillion other machines, apparently).

Could somebody with the appropriate permissions make these changes?

(Hahaha -- I just tried to commit the bug [having changed the
component to kudzu without the permissions, being used to having
Bugzilla access elsewhere] and it informed me that I was changing it
from BitchX to kudzu... Ah, if only.)

-M

Comment 23 Bill Nottingham 2003-11-05 14:13:24 UTC
This is not a kudzu issue. All it does is call ethtool ioctls, and
then the driver freaks out. All other drivers handle this fine...
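
To make "ethtool ioctls" concrete, here is a hedged Python sketch of one such call: asking a driver for link status through SIOCETHTOOL. This comment does not say which ethtool commands kudzu issues, so ETHTOOL_GLINK is picked purely as an illustration, and the function names are made up. The constants are the values defined in the Linux headers; the 40-byte padding assumes the size of struct ifreq on 64-bit Linux.

```python
import array
import fcntl
import socket
import struct

SIOCETHTOOL = 0x8946        # <linux/sockios.h>
ETHTOOL_GLINK = 0x0000000A  # <linux/ethtool.h>: get link status


def build_glink_request(ifname):
    """Build the ifreq bytes plus the ethtool_value buffer it points to."""
    # struct ethtool_value { __u32 cmd; __u32 data; }
    value = array.array("B", struct.pack("II", ETHTOOL_GLINK, 0))
    # struct ifreq: 16-byte interface name, then a pointer to the command
    # buffer. Padded to 40 bytes (assumed sizeof(struct ifreq), 64-bit Linux).
    ifreq = struct.pack("16sP", ifname.encode(), value.buffer_info()[0])
    return ifreq.ljust(40, b"\0"), value


def link_up(ifname):
    """Ask the driver whether the link is up; raises OSError on failure."""
    ifreq, value = build_glink_request(ifname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)
    # The kernel writes the answer into the buffer the ifreq points to.
    _cmd, data = struct.unpack("II", value.tobytes())
    return bool(data)
```

Calling link_up("eth0") on a Linux machine exercises the same kernel entry point that ethtool uses, so a call like this made before the interface is brought up may be a way to poke the driver without kudzu in the picture. The interface name is only an example.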

Comment 24 Barry K. Nathan 2003-11-05 15:27:06 UTC
And in my case (a 3C905C-based Cardbus card) the problem seems to be
triggered by rhgb, not kudzu. (I haven't tried recent rhgb releases,
i.e. the exact revision shipping with the Fedora Core 1 release, so I
don't know if they happen to fix anything somehow. I'll see if I can
test this again soon.)

Comment 25 chip piller 2003-11-05 16:50:20 UTC
For me, running Fedora Core Test3, kernel 2.4.22-1.2115.nptl, Asus P2B
mainboard and 3C905B I get no errors; with or without rhgb, with or
without kudzu.  

Comment 26 Dan Gennidakis 2003-11-05 18:30:51 UTC
For me, running Fedora Core Test3, kernel 2.4.22-1.2115.nptl, Asus 
P2B mainboard and 3C905 I get errors still; with kudzu on at the 
boot runlevel. I have applied all the rawhide updates and have tried 
both the nptl kernels and the same problems exist. I also tried the 
Redhat 9 kernel 2.4.20-9 and the same problem exists. The nic 
initializes properly only when kudzu is off at boot time.  


Comment 27 Dan Gennidakis 2003-11-07 18:16:17 UTC
I downgraded Kudzu to ver 0.99 (from RH9) then booted with kudzu on 
at the runlevel and my 3c905 initialized without problems. As soon 
as I go to a version of kudzu after RH9's version (Fedora test1 - 
yarrow) the NIC does not initialize properly and mii-tool reports 
missing MII. When the NIC is initialized properly at boot time 
(kudzu off) mii-tool reports the negotiated link speed and status 
properly.

Comment 28 roby palluk 2003-11-09 08:32:22 UTC
I have Fedora Core 1 with a 3c905b (Cyclone) on a PIII 550 with an Asus P3B-F.
The interface is configured to get its IP address from DHCP, and at startup I
get an error message saying the link is not available and to check the cable.
To start the NIC I need to run:
ifconfig eth0 up
ifup eth0
After that the NIC works, but I sometimes get network errors (I think): when I
download programs from an internal FTP server the transfer sometimes fails, and
the hub shows a lot of collisions. I then tried an SMB connection and it works
fine.
This is my situation at the moment. I want to try recompiling the kernel (I
need to for the NTFS driver), and if nothing changes I intend to update to a
2.6 kernel.

Comment 29 Jason Montleon 2003-11-28 02:58:54 UTC
I have three of these cards that do this, all 3c905-TX.  If you need 
one to work with email me, happy to donate.

Comment 30 Dan Egli 2003-12-23 05:54:36 UTC
Anyone got a current status on this one? I'd really like to see it
resolved. It's 5 1/2 months old now. 

Comment 31 manuel wolfshant 2003-12-31 19:23:57 UTC
For what it's worth: we are 2.5 hours before switching to 2004 and the
bug is still there. Just installed fedora (and I was hit by the
bug..). Applied all available patches (including a switch from stock
-2115nptl kernel to -2135nptl) but still the only way to have a
functional network is disabling kudzu at startup.
I have also performed different variants of rmmod 3c59x/mii-tool -r /
mii-tool -R/ mii-tool -F but to no avail. The driver seems to enter
into a neverending play-dead state after kudzu's magical touch.

Comment 32 Henning Schmiedehausen 2004-01-02 11:19:23 UTC
I can confirm this behaviour on a 3C905 (without any A, B or C) with
current Fedora Core 1 (all upgrades installed, kernel 2135).

Removing kudzu from the startup fixed the problem completely.

This machine has been network-kickstarted, so installing (and moving
700 MBytes over the NIC while installing) did work, but kudzu then
screws up the NIC. Maybe this is a driver issue, where loading the
driver itself doesn't reset the NIC or MII correctly.

Comment 33 Jerome Walter 2004-04-01 01:32:35 UTC
I can confirm the bug is still present in the latest Fedora Core 2 Test2,
just released.

The NIC works perfectly when booted from CD-ROM (ssh even continues to work
throughout the hour of installation), but not after reboot.

Does a developer read the bugzilla? ;)

Comment 34 O.K. 2004-04-28 21:57:02 UTC
IT WORKS!
Just installed 2.6.5-1.327smp
(http://download.fedora.redhat.com/pub/fedora/linux/core/test/1.92/i386/os/Fedora/RPMS/kernel-smp-2.6.5-1.327.i686.rpm)
on my fc2-test2 machine and my 3com 3c905 started to work with tcp again.
the nic was re-detected as "3Com 3c590/3c595/3c90x/3cx980" instead of
"3c905 100BaseTX [Cyclone]". 


Comment 35 Hugh Caley 2004-06-23 18:24:43 UTC
I can confirm that this problem still occurs with Fedora-Core-1 and
the 2.4.22-1.2188.nptl kernel and the 3Com Corporation 3c905B
100BaseTX [Cyclone] card.  Many problems with this card, including
autonegotiation to 100 Mb FD not working, lockups under nfs load and
errors such as the following:

Jun 21 16:24:36 pinzolo kernel: eth0: Transmit error, Tx status register 82.
Jun 21 16:24:36 pinzolo kernel: Probably a duplex mismatch.  See Documentation/networking/vortex.txt
Jun 21 16:24:36 pinzolo kernel:   Flags; bus-master 1, dirty 996(4) current 997(5)
Jun 21 16:24:36 pinzolo kernel:   Transmit list 1ea51300 vs. dea51300.
Jun 21 16:24:36 pinzolo kernel:   0: @dea51200  length 800000be status 000100be
Jun 21 16:24:36 pinzolo kernel:   1: @dea51240  length 800000be status 000100be

Turning off kudzu fixes it.

Comment 36 Xabier Yeregi 2005-08-06 21:48:37 UTC
I was able to reproduce the same problem with another distro, Gentoo. I
booted a P3 667MHz computer with a 3c905tx-nm from the Gentoo x86 Minimal
Installation CD (downloaded 8/6/2005): 2.6.11-gentoo-r3 kernel, with mii and
3c59x as modules loaded by kudzu. I assigned an IP address to eth0 and could not
ping another computer on the same Ethernet segment. lsmod showed that both mii
and 3c59x were properly loaded. I tested the network cables and the Ethernet
switch, and I was sure that the NIC had to work as I had been using it with
another OS.

Following the comments in this thread, I disabled kudzu at Installation CD
startup by passing the nodetect option to the kernel. I loaded 3c59x with
modprobe, assigned an IP address, and now it can correctly ping other hosts.



Comment 37 Xabier Yeregi 2005-08-06 21:53:13 UTC
I think this link might shed some light:
http://www.scyld.com/pipermail/vortex/2000-June/000425.html