Bug 154512
Summary: | b44 driver constantly restarting when using the network
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 4
Hardware: | i386
OS: | Linux
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | medium
Reporter: | Mary Ellen Foster <mefoster>
Assignee: | John W. Linville <linville>
QA Contact: | Brian Brock <bbrock>
CC: | davej, sergey_udaltsov
Doc Type: | Bug Fix
Last Closed: | 2005-06-07 19:07:40 UTC

Description
Mary Ellen Foster
2005-04-12 12:45:06 UTC
Possibly relevant facts: the log messages when starting at home (which works) and at school (which doesn't) are different. At home, the only b44-related messages I can see in /var/log/messages are the following. I suspect the "NETDEV WATCHDOG" part is a symptom of the problem.

    Apr 11 21:53:43 floopy kernel: b44.c:v0.95 (Aug 3, 2004)
    Apr 11 21:53:43 floopy kernel: ACPI: PCI interrupt 0000:02:01.0[A] -> GSI 17 (level, low) -> IRQ 177
    Apr 11 21:53:43 floopy kernel: eth0: Broadcom 4400 10/100BaseT Ethernet 00:11:43:67:8a:09
    [...]
    Apr 11 21:53:43 floopy kernel: b44: eth0: Link is down.
    Apr 11 21:53:43 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 11 21:53:43 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 11 21:53:44 floopy kernel: i2c /dev entries driver
    [... and then everything works fine.]

At school, the messages look like this:

    Apr 12 11:25:29 floopy kernel: b44.c:v0.95 (Aug 3, 2004)
    Apr 12 11:25:29 floopy kernel: ACPI: PCI Interrupt 0000:02:01.0[A] -> GSI 17 (level, low) -> IRQ 177
    Apr 12 11:25:29 floopy kernel: eth0: Broadcom 4400 10/100BaseT Ethernet 00:11:43:67:8a:09
    [...]
    Apr 12 11:25:30 floopy kernel: b44: eth0: Link is down.
    Apr 12 11:25:30 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:25:30 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 12 11:25:30 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:25:30 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 12 11:25:31 floopy kernel: i2c /dev entries driver
    Apr 12 11:25:35 floopy kernel: NETDEV WATCHDOG: eth0: transmit timed out
    Apr 12 11:25:35 floopy kernel: b44: eth0: transmit timed out, resetting
    Apr 12 11:25:35 floopy kernel: b44: eth0: Link is down.
    Apr 12 11:25:38 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:25:38 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    [...]
    Apr 12 11:27:44 floopy kernel: b44: eth0: Link is down.
    Apr 12 11:27:47 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:27:47 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 12 11:27:51 floopy kernel: b44: eth0: Link is down.
    Apr 12 11:27:54 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:27:54 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 12 11:27:58 floopy kernel: b44: eth0: Link is down.
    Apr 12 11:28:01 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:28:01 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 12 11:28:05 floopy kernel: b44: eth0: Link is down.
    Apr 12 11:28:08 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.
    Apr 12 11:28:08 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.
    Apr 12 11:28:13 floopy kernel: b44: eth0: Link is down.
    [ ... and so on. ]

Is there any further information I can give to help debug this? It continues to happen with the 1253 kernel, and it's REALLY annoying ...

Is there anywhere I can still get the default FC4T1 kernel from, so I can confirm my recollection that booting with "acpi=off" eliminated this issue with that kernel? If that's true, the changelog from there to now might point at where the problem is coming from ...

Same error messages when I connect to my broadband router at home. With the latest kernel for FC3 actually, not FC4T

I have test kernels w/ a minor update to the b44 driver here:

    http://people.redhat.com/linville/kernels/fc3/

Please give them a try and post your results here. Thanks!

Those kernels seem to require a "kernel-utils" package -- where should that come from?

Okay, "kernel-utils" is an FC3 package and I'm running FC4.
I think I've managed to get the same effect via:

    yum install smartmontools microcode_ctl cpuspeed readahead \
        longrun irqbalance x86info rng-utils

Had to install the kernel with "--nodeps --oldpackage" too, of course; I'll test it tomorrow.

Hmmm...maybe I need to start building FC4 test kernels too... :-) Thanks for your efforts. Let me know if you can't get that FC3 kernel to work (other than the previous b44 problems), and I'll do an FC4 kernel.

The kernel-utils dependency got changed to the hardlink package. As long as you have that installed, you should be safe to --nodeps install it.

Okay, I've been doing some experimentation with kernels and Grub command-line parameters. I removed all my third-party kernel modules (nVidia, ndiswrapper, ntfs), just in case, although I doubt that would have had any effect. Here are the results; note that my machine has a "hyperthreaded" processor, so I tested both the UP and SMP version of each kernel (and saw no difference between them in any case).

2.6.11-1.1177_FC4 (initial FC4T1 kernel; found on planetmirror.com)
- Bug present when booted normally
- Bug *ABSENT* when booted with "acpi=off" appended to Grub cmd line

2.6.11-1.1275_FC4 (current Rawhide kernel)
- Bug present when booted normally
- Bug present when booted with "acpi=off"

2.6.11-1.19_FC3.jwltest.7 (John Linville's test FC3 kernel with b44 patch)
- Bug present when booted normally
- Bug *ABSENT* when booted with "acpi=off"

Hopefully this info helps in tracking down what the problem is. It would be nice if it weren't necessary to use "acpi=off" in the first place, of course. :)

This bug still happens with 1286_FC4 (the FC4T3 kernel). Does the fact that "acpi=off" makes it work with 1177 but nothing higher help to track down where the problem is likely to be?

It may be useful, but so far it hasn't enlightened anything for me... :-( Have you tried testing with "noapic" either by itself or w/ "acpi=off" as well?
Those two seem to commonly go together, and success with one or both of them usually indicates a flaky BIOS -- which raises the question: have you looked for a BIOS update for your motherboard?

There are only very minor differences between the b44 driver in my FC3 test kernels (which still work w/ "acpi=off") and the b44 driver in the current rawhide (i.e. FC4testX):

    --- jwltest-fc3-9/kernel/kernel-2.6.11/linux-2.6.11/drivers/net/b44.c 2005-05-05 16:28:28.000000000 -0400
    +++ kernel-rawhide-today/kernel/kernel-2.6.11/linux-2.6.11/drivers/net/b44.c 2005-05-13 10:30:44.977495537 -0400
    @@ -1910,7 +1910,7 @@ static void __devexit b44_remove_one(str
         }
     }
     
    -static int b44_suspend(struct pci_dev *pdev, u32 state)
    +static int b44_suspend(struct pci_dev *pdev, pm_message_t state)
     {
         struct net_device *dev = pci_get_drvdata(pdev);
         struct b44 *bp = netdev_priv(dev);

So, this doesn't look like it is related to the b44 driver per se. Please do tests w/ "noapic" and post the results here. Please also investigate the possibility of a BIOS upgrade for your motherboard. Thanks!

Did some more testing, as requested ... it's kind of confusing, but there seem to be configurations that work and ones that don't, so I can live with that. I checked, and according to the Dell site I'm already running the newest BIOS for my machine.

With hyperthreading disabled in the BIOS and running kernel 2.6.11-1.1290 (non-SMP):
- Works only if I add "acpi=off" to the grub line
- "noapic" doesn't seem to make a difference either way

With hyperthreading enabled in the BIOS:
- The 2.6.11-1.1290 UP kernel works with the default grub line (?!?!)
- The 2.6.11-1.1290 SMP kernel doesn't work, even with "acpi=off" and/or "noapic"

And, of course, everything works fine when I plug into my router at home; this is all only with the network at work. Very, very weird.

Mary, I got a couple of notes from someone that seemed to be having similar issues to what you are seeing.
He advises using "acpi=noirq" rather than "acpi=off". Would you mind giving that a try as well? Thanks!

"acpi=noirq" doesn't change any of my above results; sorry. (I'm now testing with the 1303 kernel, but the results are the same.)

Could you attach the output of running "sysreport"? Thanks!

I'm now testing with kernel 1369 -- I haven't actually had the laptop downtown since I last updated this bug (13 May), so I haven't been able to check this recently. But as far as I can tell, networking is now happy even with the SMP kernel and hyperthreading, without any need to add any command-line arguments. Unfortunately, I don't think I'll be able to track the changes backwards to see when it got fixed ... I'll attach the result of running "sysreport" regardless (I ctrl-C'd the RPM query because it was taking forever). Should I set this bug WORKSFORME now?

Created attachment 115196
Result of running "sysreport" on my computer

Is testing only in your current location sufficient to pronounce the problem solved? If you are comfortable closing, that's fine by me...

I'll go ahead and close it as CURRENTRELEASE. Feel free to reopen if the problem returns. Thanks!

p.s. -- I suspect that this may have been different symptoms of the same underlying problem as https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=156261 -- certainly, it's since the TPM driver (whatever that is?) was disabled in the kernel that my issue has also gone away.

I suspected the same thing...maybe you should send me your resume... :-) Seriously, if you'd like to test w/ the kernels from bug 156261 comment 7 and let me know the results, that would be great...thanks!
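
For anyone comparing a working and a failing boot on similar hardware, the link-flap pattern quoted in the description can be measured rather than read by eye. The following is only a rough sketch, not part of the original report: the helper names `parse_events` and `flap_intervals` are hypothetical, the message strings are matched exactly as they appear in the quoted /var/log/messages excerpt, and the year is an assumption because syslog timestamps omit it.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Apr 12 11:27:44 floopy kernel: b44: eth0: Link is down.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: (?P<msg>.*)$"
)

# Assumption: syslog omits the year, so take it from the dates in this report.
ASSUMED_YEAR = 2005


def parse_events(lines, iface="eth0"):
    """Return (timestamp, kind) tuples for link and watchdog messages."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        msg = m.group("msg")
        if msg == f"b44: {iface}: Link is down.":
            kind = "down"
        elif msg.startswith(f"b44: {iface}: Link is up"):
            kind = "up"
        elif msg == f"NETDEV WATCHDOG: {iface}: transmit timed out":
            kind = "watchdog"
        else:
            continue  # flow-control and unrelated kernel messages
        stamp = datetime.strptime(
            f"{ASSUMED_YEAR} {m.group('stamp')}", "%Y %b %d %H:%M:%S"
        )
        events.append((stamp, kind))
    return events


def flap_intervals(events):
    """Seconds between consecutive 'Link is down' events."""
    downs = [t for t, kind in events if kind == "down"]
    return [(b - a).total_seconds() for a, b in zip(downs, downs[1:])]


# Lines copied from the failing ("school") log in the description.
SAMPLE = [
    "Apr 12 11:27:44 floopy kernel: b44: eth0: Link is down.",
    "Apr 12 11:27:47 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.",
    "Apr 12 11:27:47 floopy kernel: b44: eth0: Flow control is off for TX and off for RX.",
    "Apr 12 11:27:51 floopy kernel: b44: eth0: Link is down.",
    "Apr 12 11:27:54 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.",
    "Apr 12 11:27:58 floopy kernel: b44: eth0: Link is down.",
    "Apr 12 11:28:01 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.",
    "Apr 12 11:28:05 floopy kernel: b44: eth0: Link is down.",
    "Apr 12 11:28:08 floopy kernel: b44: eth0: Link is up at 100 Mbps, full duplex.",
    "Apr 12 11:28:13 floopy kernel: b44: eth0: Link is down.",
]

if __name__ == "__main__":
    print(flap_intervals(parse_events(SAMPLE)))  # [7.0, 7.0, 7.0, 8.0]
```

On the quoted excerpt the link drops roughly every 7 seconds. To use this on a real machine, pass the lines of the log file to `parse_events` in place of `SAMPLE`; a healthy boot like the "home" log should yield an empty interval list.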