Bug 173018
Summary: | No network after yum kernel upgrade installing 3c59x | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Jon D. Slater <jon.slater> |
Component: | kernel | Assignee: | Chris Lalancette <clalance> |
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Priority: | medium |
Version: | 4 | CC: | clalance, davej, wtogami |
Hardware: | i386 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2006-02-06 12:43:17 UTC |
Please attach the output of running sysreport on the box in question (while booted under the newer kernel)...thanks!

What exactly do you want attached? 'sysreport' generates a .tar.bz2 file. Is that what you want attached?

Yes, please!

How, exactly, do I "attach" the file? Or do I e-mail it to you directly?

https://bugzilla.redhat.com/bugzilla/attachment.cgi?bugid=173018&action=enter
Click the link above, or search for "Create a New Attachment" below... :-)

Okay... First things first... I inadvertently updated to version kernel-2.6.14-1.1644_FC4, but the behaviour is exactly the same. So I'm attaching the results of the sysreport from this new kernel.

Created attachment 121616
Result of sysreport command
This is the result of running sysreport on the kernel-2.6.14-1.1644_FC4 kernel. Although this is not the same kernel for which I reported this bug, the behavior and error messages are exactly the same.
Remove this line from your /etc/modprobe.conf file:

options 3c59x irq=5

Then reboot with the newer kernel. I think it will load fine. Please post the results here...thanks!

Nope, still no network. I removed the line as you suggested. I will re-run sysreport and attach it.

Created attachment 121640
Result of new sysreport command
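For anyone repeating the modprobe.conf change, here is a minimal sketch of removing the 'options 3c59x irq=5' line without editing the file by hand. The function name strip_3c59x_irq is hypothetical (it is not an existing tool), and the sketch only prints the filtered file so the result can be reviewed before anything is overwritten.

```shell
# Hypothetical helper: print a modprobe.conf-style file with any
# "options 3c59x ... irq=..." line removed. The input file is not modified.
strip_3c59x_irq() {
    sed -E '/^[[:space:]]*options[[:space:]]+3c59x[[:space:]].*irq=/d' "$1"
}

# Example (as root, after reviewing the output):
#   strip_3c59x_irq /etc/modprobe.conf > /tmp/modprobe.conf.new
#   cp /etc/modprobe.conf /etc/modprobe.conf.bak
#   cp /tmp/modprobe.conf.new /etc/modprobe.conf
```

Writing to a temporary file first keeps a copy of the old configuration in case the older kernel still needs it.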
I should also mention the 2.6.12-1.1456_FC4 kernel still works fine, without the 'options 3c59x irq=5' line.

Can you define "no network"? The sysreport from comment 10 indicates that the module loaded, and the output of ifconfig shows that the device is up with an IP address assigned. Are you sure there are no problems with your network configuration?

If there is a problem with the network configuration, then the 2.6.12-1.1456_FC4 kernel handles it gracefully. If I boot using the 2.6.12-1.1456_FC4 kernel, everything works fine.

During boot, everything looks good until it tries to connect to the Time Server. (Which is the first indication that the network isn't running.) Looking at the Network configuration GUI, the eth0 device is listed as "inactive". But when I try to activate it, it just stays inactive. If there are further reports you'd like me to run, I'm happy to do it. But it takes me about 1/2 an hour to get to the machine's location.

Test kernels are available here:
http://people.redhat.com/linville/kernels/fc4/
They include the 3c59x driver from 2.6.12-1.1456_FC4 (renamed to 3c59x_old). You will need to modify /etc/modprobe.conf to change all references to "3c59x" to refer to "3c59x_old" instead. Please give that a try and post the results...thanks!

BTW, you may also want to try the fedora-netdev kernels:
http://people.redhat.com/linville/kernels/fedora-netdev/
You will have to undo the 3c59x->3c59x_old changes in /etc/modprobe.conf if you try these kernels after trying the kernels in comment 15...

I just installed the latest kernel (kernel-2.6.14-1.1653_FC4) and all my network problems have gone away. I don't know what you did, but it's working again! Thank you for all your hard work!!!

This defect was corrected as of kernel: kernel-2.6.14-1.1653_FC4
It has returned with kernel: kernel-2.6.14-1.1656_FC4

Did you ever try the fedora-netdev kernels (comment 16)? There are some big differences between the driver there and the one in the current FC4 kernels...
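The 3c59x to 3c59x_old rename in /etc/modprobe.conf (and undoing it for the fedora-netdev kernels) can be sketched as below. This is only an illustration: swap_driver_name is a hypothetical helper, it prints the rewritten file rather than changing it, and it assumes module names appear as whitespace-separated words, as in a typical "alias eth0 3c59x" line. Lines that are rewritten have their whitespace collapsed to single spaces.

```shell
# Hypothetical helper: print FILE with every whole-word occurrence of
# module name FROM replaced by TO. The input file is not modified.
# Usage: swap_driver_name FROM TO FILE
swap_driver_name() {
    awk -v from="$1" -v to="$2" '{
        for (i = 1; i <= NF; i++) if ($i == from) $i = to
        print
    }' "$3"
}

# Example:
#   swap_driver_name 3c59x 3c59x_old /etc/modprobe.conf   # for the test kernels
#   swap_driver_name 3c59x_old 3c59x /etc/modprobe.conf   # to undo it again
```

Matching whole words means a second run does not turn 3c59x_old into 3c59x_old_old.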
Still no go... :-( First I tried suggestion #15 (that didn't work). So, I put everything back the way it was (kernel-2.6.14-1.1653_FC4). Next, I tried suggestion #16 (that didn't work either). Any other suggestions before I re-format and start over? Thanks!

Hi there,
Just to be 100% clear, when you say "no network", you mean you can't ping out of the box, or anything like that? It's not just that the time service doesn't work? If I read the above comments correctly, you are saying that kernel-2.6.14-1.1653_FC4 worked properly, but kernel-2.6.14-1.1656_FC4 does not? If that is the case, it is very strange; very little (especially having to do with networking and the 3c59x driver) changed between those two kernels. However, if that is indeed the case, one of the things that did change has to do with ACPI. Could you try booting kernel-2.6.14-1.1656_FC4 with pci=noacpi added to the kernel command line, and see if that makes a difference?
Thanks

Still not working... But I got a new error message. Here's what I did: I re-formatted my hard drive and re-installed from my FC4 install disks. This gave me kernel-2.6.11-1.1369_FC4, which works fine. After installing from scratch and testing my network connection, I typed "yum update". This took about two hours but eventually gave me kernel-2.6.14-1.1656_FC4. No network. But this time as the machine boots, right after starting eth0, I get the message: "Disabling IRQ #5". When I check the network status from the network configuration screen, the screen "claims" that eth0 is active. As soon as I reboot using kernel-2.6.11-1.1369_FC4, the network comes back. Does this help?

I should mention I tried suggestion 21 again (add pci=noacpi) and it still didn't work.

Right now I'm running kernel-2.6.11-1.1369_FC4. I have the machine with me at work, so I can try anything you need me to do with a delay.

I mean "without" a delay.

OK. Let's try this. Boot up with the 2.6.14-1.1656_FC4 kernel. Unload the 3c59x driver. Load the 3c59x driver with:
# modprobe 3c59x debug=6
Then try to bring up the interface. Please attach a full copy of dmesg, so I can get a little better look at what is going on. Thanks!

How do I unload the driver?

Make sure the interface is stopped:
# ifdown eth0
Unload the driver:
# rmmod 3c59x
Load the driver:
# modprobe 3c59x debug=6
Then attach the full output of dmesg. Thanks!

Created attachment 123488
Output after running dmesg
Here is the output from dmesg
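The reload steps above can be collected into one small function. This is a sketch, not a tested procedure for this machine: reload_3c59x_debug and the RUN variable are hypothetical names introduced here, and the ifup eth0 step is an assumption about how "bring up the interface" would be done on this system. Setting RUN=echo prints the commands instead of running them.

```shell
# Hypothetical helper: reload the 3c59x driver with debug logging enabled.
# Run as root. Set RUN=echo for a dry run that only prints the commands.
reload_3c59x_debug() {
    ${RUN-} ifdown eth0                # make sure the interface is stopped
    ${RUN-} rmmod 3c59x                # unload the driver
    ${RUN-} modprobe 3c59x debug=6     # load it again with debug output
    ${RUN-} ifup eth0                  # try to bring the interface up (assumed step)
}

# Example (as root):
#   reload_3c59x_debug
#   dmesg > dmesg-3c59x.txt    # attach this file to the bug
```

The commands are deliberately not chained with &&, so the kernel log can still be captured when bringing the interface up fails.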
OK. Looks like you are having IRQ routing issues. A couple of questions:
1. What is the motherboard, and what is the BIOS version?
2. Are you doing anything funky in the BIOS, i.e. forcing particular interrupts?
Since the BIOS version is so old, ACPI is not being used to route the IRQs, and it seems like it is running into problems. If it is available, you might want to update your BIOS; but if you don't want to do that, or are uncomfortable doing that, I do have a few suggestions.
1. Remove the pci=noacpi option from the kernel command line; ACPI is already being disabled because of the BIOS's age anyway.
2. Try adding "acpi=force" to the kernel command line. I don't necessarily expect this to work, but it is worth a shot.
3. If 2 doesn't work, remove "acpi=force" from the kernel command line, and add "pci=usepirqmask". This is supposed to work around certain bugs in buggy BIOSes, which it seems you might have.
4. If 3 doesn't work, leave the "pci=usepirqmask", but also add "irqpoll" to the kernel command line.
Please test 2, 3, and 4, and attach dmesg outputs. Thanks!

1) I can't get a clear look at the motherboard. Is there a command to determine what it is?
2) No, I'm not doing anything in BIOS.
So, I've only tried the first suggestion, "acpi=force". During boot I see this message:
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.14-1.1656_FC4/kernel/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.ko): No such device
BUT THEN THE MACHINE BOOTS AND THE NETWORK IS PRESENT!!!!!!!!! I didn't try the other two suggestions. Is this considered a permanent fix? Thanks so much!

The error you listed above about acpi_cpufreq is just a problem with the cpuspeed daemon; if you don't want to see it anymore, just run "chkconfig cpuspeed off". Assuming that your network continues running, and everything else on the machine is OK, that will probably be the fix for you. The only other option is to possibly upgrade the BIOS (like I mentioned above), but given that this is working for you I wouldn't really recommend it.

The next time I run "yum update" and get a new kernel, will I have to manually add the "acpi=force"? Or will the update somehow figure it out for me?

As far as I know, when a kernel update happens, it takes the kernel command line from the previous kernel that is in GRUB and copies it; so the next kernel upgrade should automatically get the "acpi=force". I won't guarantee it, but I am pretty sure you won't need to do it again.

Well, before you close this, I just wanted to thank you for all of your patience and support!

What we can do is whitelist your system so that it uses ACPI by default, so that the addition isn't necessary. Can you attach the output of 'dmidecode' (run it as root) please? Thanks.

This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks' time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you.

So, here's the latest...
1) I updated to the new 2.6.15-1.1830_FC4 kernel.
2) Rebooted.
3) No network...
4) Drove 1/2 hour across town to the machine's location.
5) Tried all three variations from #29 (above).
6) Still no network.
What I see happen during boot, right after the line "Starting eth0", is a message "Disabling IRQ 5". If I look at the network configuration screen, it "claims" that eth0 is active. But I can't get in or out.

SOLVED!!! Thank you Christopher Lalancette! The one suggestion (that I was most afraid to try) was to update the BIOS. So, having tried everything else you suggested, I broke down and visited the HP web site, and found the latest and greatest BIOS. After upgrading the BIOS, all the problems went away. I have been able to remove all of the extra kernel options that you suggested in #29 above. This problem is now solved! Thank you so much Chris!!!!
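When adding or removing the kernel options discussed in this bug (acpi=force, pci=noacpi, pci=usepirqmask, irqpoll), it can help to confirm what the running kernel was actually booted with. A minimal sketch, assuming a Linux system where /proc/cmdline is readable; has_kernel_opt is a hypothetical helper name, not an existing command.

```shell
# Hypothetical helper: succeed if OPT appears as a whole word on a kernel
# command line. The second argument is optional; without it the running
# kernel's command line is read from /proc/cmdline.
# Usage: has_kernel_opt OPT [CMDLINE]
has_kernel_opt() {
    opt=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $opt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   for o in acpi=force pci=noacpi pci=usepirqmask irqpoll; do
#       has_kernel_opt "$o" && echo "booted with $o"
#   done
```

Padding both strings with spaces keeps an option such as acpi=force from matching a longer word that merely contains it.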
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7

Description of problem:
>>>Today I did a "yum update" and got kernel-2.6.14-1.1637_FC4 (updated
>>>from kernel-2.6.12-1.1456_FC4).
>>>
>>>Rebooted, and I lost my network connection.
>>>
>>>During the boot to '1637' and from the network admin screen, it claims
>>>to not be able to find my eth0 device (3Com 3c905).
>>>
>>>But when I boot using '1456', it finds eth0 just fine.
>>>
>>>I'm running on an HP Pavilion 8150.
>>>
>>>Thanks!
>>>
>>>Jon
>>
>>I've got the same problem, with a message saying that the setup of eth0 is delayed because of a lack of a module (?)
>>forcedeth that should work with NVidia chipsets.
>>But I can't understand the steps to include this module I've found here
>>and there.

Here's a bit more information from the 'messages' file:
Nov 11 09:29:59 lambdacenter ntpd[1794]: sendto(66.187.233.4): Network is unreachable
Nov 11 09:30:25 lambdacenter kernel: 3c59x: Unknown parameter `irq'
Nov 11 09:30:25 lambdacenter modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)
< removed a bunch of repeated messages here >
Nov 11 09:30:25 lambdacenter kernel: 3c59x: Unknown parameter `irq'
Nov 11 09:30:25 lambdacenter modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Nov 11 09:30:25 lambdacenter modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Version-Release number of selected component (if applicable):
kernel-2.6.14-1.1637_FC4

How reproducible:
Always

Steps to Reproduce:
1. Upgrade from kernel-2.6.12-1.1456_FC4 to kernel-2.6.14-1.1637_FC4
2. Reboot
3. modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Actual Results:
No access to the network. Eth0 not found.

Expected Results:
Eth0 should have been found, and allowed access to the network.

Additional info:
The problem does not exist in kernel-2.6.12-1.1456_FC4
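The "Unknown parameter" lines in the log excerpt name the module option that the newer driver rejects. A small sketch for pulling those module and parameter pairs out of a saved log follows; unknown_module_params is a hypothetical helper, and the pattern assumes the exact message format quoted in this report.

```shell
# Hypothetical helper: list unique "module parameter" pairs from kernel
# "Unknown parameter" messages in a syslog-style file.
unknown_module_params() {
    sed -n 's/.*kernel: \([^:]*\): Unknown parameter .\([[:alnum:]_]*\).*/\1 \2/p' "$1" | sort -u
}

# Example:
#   unknown_module_params /var/log/messages
```

Each pair it prints points at an "options" line in /etc/modprobe.conf that passes a parameter the installed driver does not accept.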