Bug 173018 - No network after yum kernel upgrade installing 3c59x
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Chris Lalancette
Brian Brock
Reported: 2005-11-12 11:08 EST by Jon D. Slater
Modified: 2007-11-30 17:11 EST (History)
Last Closed: 2006-02-06 07:43:17 EST
Attachments:
Result of sysreport command (1.23 MB, application/octet-stream), 2005-11-29 17:48 EST, Jon D. Slater
result of new sysreport command (1.60 MB, application/octet-stream), 2005-11-30 12:47 EST, Jon D. Slater
Output after running dmesg (14.78 KB, text/plain), 2006-01-20 10:39 EST, Jon D. Slater

Description Jon D. Slater 2005-11-12 11:08:32 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7

Description of problem:
>>>Today I did a "yum update" and got kernel-2.6.14-1.1637_FC4 (updated 
>>>from  kernel-2.6.12-1.1456_FC4).
>>>
>>>Rebooted, and I lost my network connection.
>>>
>>>During the boot to '1637' and from the network admin screen, it claims 
>>>to not be able to find my eth0 device (3Com 3c905).
>>>
>>>But when I boot using '1456', it finds eth0 just fine.
>>>
>>>I'm running on an HP Pavilion 8150.
>>>
>>>Thanks!
>>>
>>>Jon
>>>    
>>>
>>
>>I've got the same problem, with a message saying that the setup of eth0 is delayed because of a lack of a module (?)
>>forcedeth that should work with NVidia chipsets.
>>But I can't understand the steps to include this module I've found here
>>and there.
>>

Here's a bit more information from the 'messages' file:

Nov 11 09:29:59 lambdacenter ntpd[1794]: sendto(66.187.233.4): Network is unreachable
Nov 11 09:30:25 lambdacenter kernel: 3c59x: Unknown parameter `irq'
Nov 11 09:30:25 lambdacenter modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)

< removed a bunch of repeated messages here >

Nov 11 09:30:25 lambdacenter kernel: 3c59x: Unknown parameter `irq'
Nov 11 09:30:25 lambdacenter modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Nov 11 09:30:25 lambdacenter modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Version-Release number of selected component (if applicable):
kernel-2.6.14-1.1637_FC4

How reproducible:
Always

Steps to Reproduce:
1. Upgrade from kernel-2.6.12-1.1456_FC4 to kernel-2.6.14-1.1637_FC4
2. Reboot
3. modprobe: FATAL: Error inserting 3c59x (/lib/modules/2.6.14-1.1637_FC4/kernel/drivers/net/3c59x.ko): Unknown symbol in module, or unknown parameter (see dmesg)
  

Actual Results:  No access to the network.  Eth0 not found.

Expected Results:  Eth0 should have been found, and allowed access to the network.

Additional info:

The problem does not exist in kernel-2.6.12-1.1456_FC4
Comment 1 John W. Linville 2005-11-29 15:15:10 EST
Please attach the output of running sysreport on the box in question (while 
booted under the newer kernel)...thanks! 
Comment 2 Jon D. Slater 2005-11-29 16:36:35 EST
What exactly do you want attached?  'sysreport' generates a .tar.bz2 file.

Is that what you want attached?
Comment 3 John W. Linville 2005-11-29 16:38:39 EST
Yes, please! 
Comment 4 Jon D. Slater 2005-11-29 16:40:08 EST
How, exactly, do I "attach" the file?  Or, do I e-mail it to you directly?
Comment 5 John W. Linville 2005-11-29 17:01:09 EST
https://bugzilla.redhat.com/bugzilla/attachment.cgi?bugid=173018&action=enter  
  
Click the link above, or search for "Create a New Attachment" below... :-)  
Comment 6 Jon D. Slater 2005-11-29 17:44:09 EST
Okay... First things first...  I inadvertently updated to version
kernel-2.6.14-1.1644_FC4, but the behaviour is exactly the same.  So I'm
attaching the results of the sysreport from this new kernel.
Comment 7 Jon D. Slater 2005-11-29 17:48:18 EST
Created attachment 121616
Result of sysreport command

This is the result of running sysreport on the kernel-2.6.14-1.1644_FC4 kernel.
 Although this is not the same kernel for which I reported this bug, the
behavior and error messages are exactly the same.
Comment 8 John W. Linville 2005-11-30 10:26:55 EST
Remove this line from your /etc/modprobe.conf file:  
  
   options 3c59x  irq=5  
 
Then, reboot with the newer kernel.  I think it will load fine.  Please post 
the results here...thanks! 
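
A minimal sketch of that edit, for anyone who would rather script it than open an editor. The function name `disable_module_options` is made up for illustration; the only details taken from this report are the file `/etc/modprobe.conf` and the `options 3c59x irq=5` line. It comments the line out instead of deleting it, and keeps a `.bak` copy.

```shell
#!/bin/sh
# Comment out every "options <module> ..." line in a modprobe.conf-style file.
# Usage: disable_module_options <module> [conf-file]
# (hypothetical helper; a backup is written to <conf-file>.bak first)
disable_module_options() {
    module=$1
    conf=${2:-/etc/modprobe.conf}
    cp "$conf" "$conf.bak" || return 1
    # Match "options", whitespace, the exact module name, then either
    # whitespace plus the rest of the line or the end of the line.
    sed "s/^[[:space:]]*options[[:space:]][[:space:]]*$module\([[:space:]].*\)\{0,1\}\$/# &/" \
        "$conf.bak" > "$conf"
}

# Example (as root): disable_module_options 3c59x
```

Lines for other modules, including names that merely start with the same text, are left alone.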
Comment 9 Jon D. Slater 2005-11-30 12:09:22 EST
Nope, still no network.  I removed the line as you suggested.  I will re-run
sysreport and attach it.
Comment 10 Jon D. Slater 2005-11-30 12:47:19 EST
Created attachment 121640
result of new sysreport command
Comment 11 Jon D. Slater 2005-11-30 13:12:57 EST
I should also mention the 2.6.12-1.1456_FC4 kernel still works fine, without the
'options 3c59x irq=5' line.
Comment 12 John W. Linville 2005-12-02 12:58:19 EST
Can you define "no network"?  The sysreport from comment 10 indicates that the 
module loaded, and the output of ifconfig shows that the device is up with an 
IP address assigned.  Are you sure there are no problems with your network 
configuration? 
Comment 13 Jon D. Slater 2005-12-02 13:01:48 EST
If there is a problem with the network configuration then the 2.6.12-1.1456_FC4
kernel handles it gracefully.  If I boot using the 2.6.12-1.1456_FC4 kernel,
everything works fine.
Comment 14 Jon D. Slater 2005-12-02 13:05:32 EST
During boot, everything looks good until it tries to connect to the time server.
 (Which is the first indication that the network isn't running.)

Looking at the Network configuration gui, the eth0 device is listed as
"inactive".  But when I try to activate it, it just stays inactive.

If there are further reports you'd like me to run I'm happy to do it.  But,
it takes me about 1/2 an hour to get to the machine's location.
Comment 15 John W. Linville 2005-12-08 16:13:21 EST
Test kernels are available here:

   http://people.redhat.com/linville/kernels/fc4/

They include the 3c59x driver from 2.6.12-1.1456_FC4 (renamed to 3c59x_old). 
You will need to modify /etc/modprobe.conf to change all references to "3c59x"
to refer to "3c59x_old" instead.

Please give that a try and post the results...thanks!
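
One way the rename could be scripted, as a sketch only. `rename_module_refs` is a hypothetical helper, not something shipped with the test kernels. It swaps whole-word references, so running it twice does not produce `3c59x_old_old`, and calling it with the arguments reversed undoes the change.

```shell
#!/bin/sh
# Replace whole-word references to one module name with another in a
# modprobe.conf-style file. Usage: rename_module_refs <old> <new> [conf-file]
# (hypothetical helper; a backup is written to <conf-file>.bak first)
rename_module_refs() {
    old=$1
    new=$2
    conf=${3:-/etc/modprobe.conf}
    cp "$conf" "$conf.bak" || return 1
    # Compare field by field so only exact matches are rewritten.
    # Note: awk re-joins a modified line with single spaces.
    awk -v old="$old" -v new="$new" '
        { for (i = 1; i <= NF; i++) if ($i == old) $i = new; print }
    ' "$conf.bak" > "$conf"
}

# Example (as root): rename_module_refs 3c59x 3c59x_old
# To undo:           rename_module_refs 3c59x_old 3c59x
```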
Comment 16 John W. Linville 2005-12-08 16:16:56 EST
BTW, you may also want to try the fedora-netdev kernels:

   http://people.redhat.com/linville/kernels/fedora-netdev/

You will have to undo the 3c59x->3c59x_old changes in /etc/modprobe.conf if you
try these kernels after trying the kernels in comment 15...
Comment 17 Jon D. Slater 2005-12-18 20:20:13 EST
I just installed the latest kernel (kernel-2.6.14-1.1653_FC4) and all my network
problems have gone away.

I don't know what you did, but it's working again!

Thank you for all your hard work!!!
Comment 18 Jon D. Slater 2006-01-09 16:34:01 EST
This defect was corrected as of kernel: kernel-2.6.14-1.1653_FC4

It has returned with kernel: kernel-2.6.14-1.1656_FC4
Comment 19 John W. Linville 2006-01-10 15:58:46 EST
Did you ever try the fedora-netdev kernels (comment 16)?  There are some big 
differences between the driver there and the one in the current FC4 kernels... 
Comment 20 Jon D. Slater 2006-01-19 15:32:38 EST
Still no go...  :-(

First I tried suggestion #15 (that didn't work).

So, I put everything back the way it was (kernel-2.6.14-1.1653_FC4).

Next, I tried suggestion #16 (that didn't work either).

Any other suggestions before I re-format and start over?

Thanks!
Comment 21 Chris Lalancette 2006-01-19 17:02:14 EST
Hi there,
     Just to be 100% clear, when you say "no network", you mean you can't ping
out of the box, or anything like that?  It's not just that the time service
doesn't work?
     If I read the above comments correctly, you are saying that
kernel-2.6.14-1.1653_FC4 worked properly, but kernel-2.6.14-1.1656_FC4 does not?
 If that is the case, it is very strange; very little (especially having to do
with networking, and the 3c59x driver) changed between those two kernels. 
However, if that is indeed the case, one of the things that did change has to do
with ACPI.  Could you try booting kernel-2.6.14-1.1656_FC4 with adding pci=noacpi
to the kernel command line, and see if that makes a difference?

Thanks
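
For a one-off test the option can be typed at the GRUB menu by editing the kernel line before booting. To make it stick, the kernel line in the GRUB config has to change. The sketch below assumes GRUB legacy with its config at `/boot/grub/grub.conf`, which is an assumption about this machine; `add_kernel_arg` is a hypothetical helper name.

```shell
#!/bin/sh
# Append one argument to the "kernel" line of a given kernel version in a
# GRUB-legacy config, unless the argument is already there.
# Usage: add_kernel_arg <version> <arg> [grub-conf]
# (hypothetical helper; a backup is written to <grub-conf>.bak first)
add_kernel_arg() {
    version=$1
    arg=$2
    conf=${3:-/boot/grub/grub.conf}
    cp "$conf" "$conf.bak" || return 1
    awk -v ver="vmlinuz-$version" -v arg="$arg" '
        # Only touch kernel lines whose image path ends in the requested version.
        $1 == "kernel" && substr($2, length($2) - length(ver) + 1) == ver {
            present = 0
            for (i = 3; i <= NF; i++) if ($i == arg) present = 1
            if (!present) $0 = $0 " " arg
        }
        { print }
    ' "$conf.bak" > "$conf"
}

# Example (as root): add_kernel_arg 2.6.14-1.1656_FC4 pci=noacpi
```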
Comment 22 Jon D. Slater 2006-01-19 21:42:06 EST
Still not working...  But I got a new error message

Here's what I did:

I re-formatted my hard drive and re-installed from my FC4 install disks.
This gave me kernel-2.6.11-1.1369_FC4 which works fine.

After installing from scratch and testing my network connection, I typed "yum
update"

This took about two hours but eventually gave me kernel-2.6.14-1.1656_FC4.

No network.

But this time as the machine boots, right after starting eth0, I get the message:
"Disabling IRQ #5"

When I check the network status from the network configuration screen, the
screen "claims" that eth0 is active.

As soon as I reboot using kernel-2.6.11-1.1369_FC4, the network comes back.

Does this help?
Comment 23 Jon D. Slater 2006-01-19 22:37:19 EST
I should mention I tried suggestion 21 again (add pci=noacpi) and it still
didn't work.

Right now I'm running kernel-2.6.11-1.1369_FC4.

I have the machine with me at work, so I can try anything you need me to do with
a delay.
Comment 24 Jon D. Slater 2006-01-20 00:00:34 EST
I mean "without" a delay.
Comment 25 Chris Lalancette 2006-01-20 09:45:40 EST
OK.  Let's try this.  Boot up with the 2.6.14-1.1656_FC4.  Unload the 3c59x
driver.  Load the 3c59x driver with:

# modprobe 3c59x debug=6

Then, try and bring up the interface.  Please attach a full copy of dmesg, so I
can get a little better look at what is going on.

Thanks!
Comment 26 Jon D. Slater 2006-01-20 10:08:44 EST
How do I unload the driver?
Comment 27 Chris Lalancette 2006-01-20 10:18:34 EST
Make sure the interface is stopped:

# ifdown eth0

Unload the driver:

# rmmod 3c59x

Load the driver:

# modprobe 3c59x debug=6

Then attach the full output of dmesg.

Thanks!
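
The same steps wrapped into one function, as a rough sketch. `reload_with_debug` and the output path in the example are made up for illustration; the commands inside are the ones listed above, plus an `ifup` for the "bring up the interface" step. It has to be run as root.

```shell
#!/bin/sh
# Reload a network driver with extra module options and save the kernel log.
# Usage: reload_with_debug <iface> <module> <outfile> [module options...]
# (hypothetical helper; run as root)
reload_with_debug() {
    iface=$1
    module=$2
    outfile=$3
    shift 3
    ifdown "$iface"                       # make sure the interface is stopped
    rmmod "$module" || return 1           # unload the driver
    modprobe "$module" "$@" || return 1   # load it again with the given options
    ifup "$iface"                         # try to bring the interface up
    dmesg > "$outfile"                    # full kernel log, ready to attach
}

# Example: reload_with_debug eth0 3c59x /tmp/dmesg-3c59x.txt debug=6
```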
Comment 28 Jon D. Slater 2006-01-20 10:39:05 EST
Created attachment 123488
Output after running dmesg

Here is the output from dmesg
Comment 29 Chris Lalancette 2006-01-20 10:59:45 EST
OK.  Looks like you are having IRQ routing issues.  A couple of questions:

1.  What is the motherboard, and what is the BIOS version?
2.  Are you doing anything funky in the BIOS; i.e. forcing particular interrupts?

Since the BIOS version is so old, ACPI is not being used to route the IRQs, and
it seems like it is running into problems.  If it is available, you might want
to update your BIOS; but if you don't want to do that, or are uncomfortable
doing that, I do have a few suggestions.

1.  Remove the pci=noacpi option from the kernel command line; ACPI is already
being disabled because of the age of the BIOS anyway.
2.  Try adding "acpi=force" to the kernel command-line.  I don't necessarily
expect this to work, but it is worth a shot.
3.  If 2 doesn't work, remove "acpi=force" from the kernel command-line, and add
"pci=usepirqmask".  This is supposed to work around certain bugs in buggy
BIOSes, which it seems you might have.
4.  If 3 doesn't work, leave the "pci=usepirqmask", but also add "irqpoll" to
the kernel command line.

Please test 2, 3, and 4, and attach dmesg outputs.

Thanks!
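
When several option combinations are tested in a row, it is easy to lose track of which one a given dmesg belongs to. A small sketch for checking what the running kernel was actually booted with, by reading `/proc/cmdline`; the helper name `booted_with` is hypothetical.

```shell
#!/bin/sh
# Succeed if the given argument appears as a whole word on the kernel
# command line. The optional second parameter lets the check run against
# a saved copy instead of /proc/cmdline.
# (hypothetical helper)
booted_with() {
    arg=$1
    cmdline=${2:-/proc/cmdline}
    for word in $(cat "$cmdline"); do
        [ "$word" = "$arg" ] && return 0
    done
    return 1
}

# Example: note the state of each option before saving dmesg.
#   for opt in pci=noacpi acpi=force pci=usepirqmask irqpoll; do
#       if booted_with "$opt"; then echo "$opt: set"; else echo "$opt: not set"; fi
#   done
```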
Comment 30 Jon D. Slater 2006-01-20 11:23:02 EST
1)  I can't get a clear look at the motherboard.  Is there a command to
determine what it is?

2)  No, I'm not doing anything in BIOS.

So, I've only tried the first suggestion "acpi=force".

During boot I see this message:
FATAL: Error inserting acpi_cpufreq
(/lib/modules/2.6.14-1.1656_FC4/kernel/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.ko):
No such device

BUT THEN THE MACHINE BOOTS AND THE NETWORK IS PRESENT!!!!!!!!!

I didn't try the other two suggestions.

Is this considered a permanent fix?

Thanks so much!
Comment 31 Chris Lalancette 2006-01-20 11:37:53 EST
The error you listed above about acpi_cpufreq is just a problem with the
cpuspeed daemon; if you don't want to see it anymore, just run "chkconfig
cpuspeed off".

Assuming that your network continues running, and everything else on the machine
is OK, that will probably be the fix for you.  The only other option is to
possibly upgrade the BIOS (like I mentioned above), but given that this is
working for you I wouldn't really recommend it.
Comment 32 Jon D. Slater 2006-01-20 11:44:04 EST
The next time I run "yum update" and get a new kernel, will I have to manually
add the "acpi=force"?

Or will the update some-how figure it out for me?
Comment 33 Chris Lalancette 2006-01-20 11:48:57 EST
As far as I know, when a kernel update happens, it takes the kernel command line
from the previous kernel that is in GRUB and copies it; so the next kernel
upgrade should automatically get the "acpi=force".  I won't guarantee it, but I
am pretty sure you won't need to do it again.
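
Since the carry-over is not guaranteed, a quick check after each kernel update is cheap. This sketch lists any kernel entry that lacks a given argument; it assumes GRUB legacy with its config at `/boot/grub/grub.conf`, and `kernels_missing_arg` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the image path of every GRUB "kernel" line that lacks the argument.
# Usage: kernels_missing_arg <arg> [grub-conf]
# (hypothetical helper; read-only, does not modify the config)
kernels_missing_arg() {
    arg=$1
    conf=${2:-/boot/grub/grub.conf}
    awk -v arg="$arg" '
        $1 == "kernel" {
            found = 0
            for (i = 3; i <= NF; i++) if ($i == arg) found = 1
            if (!found) print $2
        }
    ' "$conf"
}

# Example: kernels_missing_arg acpi=force
```

No output means every entry already carries the option.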
Comment 34 Jon D. Slater 2006-01-20 12:02:06 EST
Well, before you close this, I just wanted to thank you for all of your patience
and support!
Comment 35 Dave Jones 2006-01-24 01:48:58 EST
What we can do is whitelist your system so that it uses ACPI by default, so
that the addition isn't necessary.  Can you attach the output of 'dmidecode'
(run it as root), please?


Thanks.
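
The full dmidecode output is what should be attached, but the same output also answers the earlier motherboard and BIOS version question. A sketch for pulling one section out of it follows. The helper name `dmi_section` is hypothetical, and it assumes the usual layout where a section title such as "BIOS Information" is followed by indented fields and the next record starts with a "Handle" line.

```shell
#!/bin/sh
# Print the fields of one section of dmidecode output read from stdin.
# Usage (as root): dmidecode | dmi_section "BIOS Information"
# (hypothetical helper; the section layout is an assumption, see above)
dmi_section() {
    awk -v name="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }          # ignore indentation
        line == name                    { in_section = 1; next }
        line == "" || line ~ /^Handle / { in_section = 0 }  # next record begins
        in_section                      { print line }
    '
}

# Examples: dmidecode | dmi_section "BIOS Information"
#           dmidecode | dmi_section "Base Board Information"
```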
Comment 36 Dave Jones 2006-02-03 02:23:50 EST
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.
Comment 37 Jon D. Slater 2006-02-03 13:04:33 EST
So, here's the latest...

1)  I updated to the new 2.6.15-1.1830_FC4 kernel.

2)  Rebooted

3)  No network...

4)  Drove 1/2 hour across town to the machine's location.

5)  Tried all three variations from #29 (above)

6)  Still no network.

What I see happen during boot, right after the line "Starting eth0", is a
message "Disabling IRQ 5".

If I look at the network configuration screen, it "claims" that eth0 is active.

But, I can't get in or out.
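
A sketch for confirming that symptom from the kernel log instead of the boot screen. `disabled_irqs` is a hypothetical helper; it accepts the message both with and without the `#`, since both spellings appear in this report.

```shell
#!/bin/sh
# List the IRQ numbers the kernel reports having disabled, one per line.
# Reads a kernel log (for example the output of dmesg) on stdin.
# (hypothetical helper)
disabled_irqs() {
    sed -n 's/.*Disabling IRQ #\{0,1\}\([0-9][0-9]*\).*/\1/p' | sort -un
}

# Example: dmesg | disabled_irqs
#          grep -w eth0 /proc/interrupts   # shows which IRQ eth0 is using
```

If the number printed matches the IRQ that eth0 is using in /proc/interrupts, the interface can look active while no traffic gets through.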
Comment 38 Jon D. Slater 2006-02-03 20:06:32 EST
SOLVED!!!

Thank you Christopher Lalancette!

The one suggestion (that I was most afraid to try) was to update the bios.

So, having tried everything else you suggested, I broke down and visited the HP
web site, and found the latest and greatest BIOS.

After upgrading the bios, all the problems went away.  I have been able to
remove all of the extra kernel options that you suggested in #29 above.

This problem is now solved!

Thank you so much Chris!!!!
