Bug 155010

Summary: 3com 3c905C network does not work
Product: [Fedora] Fedora
Reporter: Orion Poplawski <orion>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 4
CC: davej, henryhartley
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-06-13 15:06:16 UTC
Type: ---
Attachments:
Description              Flags
dmesg output from boot   none
sysreport output         none

Description Orion Poplawski 2005-04-15 16:40:35 UTC
Description of problem:

System works fine with FC3.  Installed FC4t2 via network just fine.  On boot,
ethernet will not come up.  No errors, just no traffic.  tcpdump will see the
dhcp queries going out, but nothing coming in.

Version-Release number of selected component (if applicable):
kernel-2.6.11-1.1226_FC4

Comment 1 Orion Poplawski 2005-04-15 16:40:36 UTC
Created attachment 113235 [details]
dmesg output from boot

Comment 2 Orion Poplawski 2005-04-25 21:28:38 UTC
Appears to be fixed in 1.1261_FC4

Comment 3 Orion Poplawski 2005-04-27 17:22:40 UTC
Appears to be back in 1.1268_FC4

Comment 4 John W. Linville 2005-04-27 17:42:49 UTC
Dumb question: do you have any indication that DHCP replies are actually being 
sent to you? 
 
Either way, please attach the output of running the "sysreport" command.  
Also, traces from tcpdump and/or ethereal may be interesting to see as well. 
 
Thanks! 

Comment 5 Orion Poplawski 2005-04-27 18:00:53 UTC
I think I've found the culprit.  When I first ran into this problem I added
"pci=routeirq" to my boot line.  Now, when this is present, I have problems. 
When I remove it, it's okay.

The dhcp server saw the dhcp requests in either case, but did not appear to see
the replies when pci=routeirq was set.
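
One way to confirm whether a running kernel was actually booted with pci=routeirq is to look at /proc/cmdline. The following is a minimal Python sketch and not something taken from this report; the helper name has_param is hypothetical.

```python
def has_param(cmdline, param):
    """Return True if param appears as a whole word on the kernel command line."""
    # Kernel parameters are separated by whitespace, so a plain split is enough
    # for a simple presence check like this one.
    return param in cmdline.split()


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the command line the running kernel was booted with.
        with open("/proc/cmdline") as f:
            current = f.read()
    except OSError:
        # Not on Linux, or /proc is not mounted: treat as an empty command line.
        current = ""
    print("pci=routeirq set:", has_param(current, "pci=routeirq"))
```

Running this once on a boot with the option and once without it would confirm that the boot loader entry was really changed between tests.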

Comment 6 John W. Linville 2005-04-27 19:00:28 UTC
"When I remove it, it's okay" -- sounds great!  I'm closing on that basis.  
Please re-open if I have misinterpreted something.  Thanks! 

Comment 7 Orion Poplawski 2005-04-28 18:00:49 UTC
Okay, something is definitely screwed up.  System can get dhcp address, but
basically all network operations are running very poorly to the point of
complete failure.  This is now with kernel 1275.

See the following in the kernel log:

eth1: Too much work in interrupt, status 8401

ifconfig reports lots of errors:

eth1      Link encap:Ethernet  HWaddr 00:B0:D0:0D:F2:A3
          inet addr:192.168.0.72  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:412011 errors:11941303 dropped:0 overruns:1168 frame:3291969
          TX packets:233 errors:0 dropped:0 overruns:0 carrier:1
          collisions:0 txqueuelen:1000
          RX bytes:54952 (53.6 KiB)  TX bytes:15268 (14.9 KiB)
          Interrupt:11 Base address:0xec00

Again, this system runs fine with FC3.
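
To put the counters above in perspective, here is a rough Python sketch that pulls the RX and TX counters out of ifconfig-style text and compares errors to packets. The function names and the "errors per packet" metric are illustrative assumptions; the sample lines are copied from the output in this comment.

```python
import re

# Lines copied from the ifconfig output in this report.
SAMPLE = """\
eth1      Link encap:Ethernet  HWaddr 00:B0:D0:0D:F2:A3
          RX packets:412011 errors:11941303 dropped:0 overruns:1168 frame:3291969
          TX packets:233 errors:0 dropped:0 overruns:0 carrier:1
"""


def parse_counters(text, direction):
    """Return the counters on the RX or TX line as a dict of ints."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(direction + " packets:"):
            # Each counter has the form name:value, e.g. errors:11941303.
            return {k: int(v) for k, v in re.findall(r"(\w+):(\d+)", line)}
    raise ValueError("no %s packets line found" % direction)


def error_ratio(counters):
    """Errors per counted packet (an illustrative metric, not a standard one)."""
    packets = counters["packets"]
    return counters["errors"] / packets if packets else float("inf")


rx = parse_counters(SAMPLE, "RX")
tx = parse_counters(SAMPLE, "TX")
print("RX errors per packet: %.1f" % error_ratio(rx))
print("TX errors per packet: %.1f" % error_ratio(tx))
```

With these numbers the receive side shows far more errors than good packets while the transmit side shows none, which matches the symptom of requests going out but nothing usable coming back.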

Comment 8 Orion Poplawski 2005-04-28 18:02:55 UTC
Created attachment 113805 [details]
sysreport output

Comment 9 Orion Poplawski 2005-05-02 17:31:11 UTC
Still happens with 1276.  System is unusable with this.

Comment 10 John W. Linville 2005-05-04 16:48:46 UTC
Current issue sounds like 155837... 

*** This bug has been marked as a duplicate of 155837 ***

Comment 11 Orion Poplawski 2005-05-13 17:46:20 UTC
This isn't a duplicate.  The network device does not function at all.  tcpdump
reports no traffic at all.  Lots of errors.

Comment 12 Orion Poplawski 2005-05-13 17:48:01 UTC
Just tested with FC4t3 and kernel 1286 and still present.  FC4 is unusable on my
hardware.

Comment 13 Henry Hartley 2005-05-17 20:37:37 UTC
I'm having this problem also with a fresh FC4t3 install.  In fact, when I try to
start the network it says the hardware is not there.  Again, it worked fine with
FC3.

[root@localhost /]# /etc/sysconfig/network-scripts/ifup eth0
3c59x device eth0 does not seem to be present, delaying initialization.




Comment 14 John W. Linville 2005-05-19 20:25:35 UTC
I'm not sure the problem in comment 13 is actually the same thing.  Please 
open a new bugzilla for that issue, and include the output of running 
'sysreport' on that system.  Thanks! 

Comment 15 Henry Hartley 2005-05-23 13:28:59 UTC
I looked at it a bit more and agree that it's probably not the same bug.  In
fact, I'm no longer convinced comment 13 is a bug at all.  I rebooted the
machine and the problem seems to have gone away.  The only other change I made
before rebooting was to fix a typo in the boot string related to the mouse, but
I doubt that had any relation to my problem.

Comment 16 Orion Poplawski 2005-05-23 15:06:43 UTC
Well, the problem appears to have gone away again.  Installed from rawhide and
with kernel 2.6.11-1.1319_FC4 I'm not seeing any trouble.  Hopefully it will
stay this way.

Comment 17 Orion Poplawski 2005-05-23 17:25:44 UTC
And is broken again with 1.1323.  This is reproducible.  I see problems every
time I boot to 1.1323, works fine when I boot 1.1319.

Problem symptoms:

dhcp works, but is quite slow to start.

After that, services that use the network are *very* slow to start.  The system
never manages to fully boot.

ifconfig during boot shows many errors:

eth1      Link encap:Ethernet  HWaddr 00:B0:D0:0D:F2:A3
          inet addr:192.168.0.72  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:68099 errors:2270184 dropped:0 overruns:400 frame:274673
          TX packets:120 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:15130 (14.7 KiB)  TX bytes:11655 (11.3 KiB)
          Interrupt:11 Base address:0xec00

tcpdump, when I ran it, didn't show any traffic.

Don't see any obvious differences in dmesg output.

Comment 18 Dave Jones 2005-05-23 21:46:01 UTC
I don't think this is due to anything that changed in the kernel; it is just
down to random 'position of the moon' type failures. The changelog between
the working/non-working kernel doesn't touch this driver, or anything that could
(or should) affect it.

revision 1.1323
add a check for policy ver
----------------------------
revision 1.1322
debugfs/usbmon can come back.
----------------------------
revision 1.1321
missing symbols.
----------------------------
revision 1.1320
ipw updates.


Given the failure moves around at random but is repeatable on multiple boots of
the same kernel, I'm wondering if this is some alignment problem?
(or *gulp*, a compiler problem)


Comment 19 Orion Poplawski 2005-05-23 23:02:43 UTC
How hard is it to build a FC4 kernel with the 3.4.3 compiler?  Barring that, I'm
willing to compile a debug kernel to test.  Let me know what would be useful. 

Comment 20 John W. Linville 2005-05-27 18:04:31 UTC
There was a recent upstream patch (included in latest FC4 kernel) that fixed a 
problem related to power management on some cards using the 3c59x.  I wonder 
if that could be at play here...  Have you tried kernel-2.6.11-1.1363_FC4?  If 
so, did it work any better w/ the 3c905C card? 

Comment 21 Orion Poplawski 2005-05-27 20:29:32 UTC
Not sure it has relevance, as the problem is "fixed" in at least one earlier
kernel.  What do you think of Dave's comment (#18)?

Comment 22 John W. Linville 2005-05-27 20:40:14 UTC
I think comment 18 is relevant, but I'm not sure exactly when this change made 
it into FC4.  The patch was pretty recent, so I'm not sure if it was in any of 
the previous FC4 kernels.  (Dave took a late update from upstream...) 
 
As I understand it, the problem only manifests if the card has been previously 
set a certain way.  So, rebooting a box may get a different reaction than a 
cold boot, rebooting between OSes may have different effects than rebooting 
the same OS, etc.  Anyway, I think it _might_ account for some of the 
weirdness here (and a couple of other 3c59x-related bugs), so I figured it 
would be worthwhile to ask you to try the latest kernel.  It's worth a shot? 

Comment 23 Orion Poplawski 2005-05-27 22:29:29 UTC
Well, 1363 does indeed work.  Unfortunately, I'm not sure this proves anything.
 However, I will continue to test new kernels to see if it fails again.

Comment 24 John W. Linville 2005-05-31 13:14:51 UTC
Please do keep me informed.  If I don't hear from you in the next few weeks, 
I'll presume that the problem has been fixed.  Thanks! 

Comment 25 John W. Linville 2005-06-13 15:06:16 UTC
Closing based on comment 24...