Bug 186584 - Oops after network setup during boot
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-03-24 15:56 UTC by Chris Adams
Modified: 2015-01-04 22:26 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-28 01:42:25 UTC
Type: ---
Embargoed:


Attachments
boot log and oops (21.68 KB, text/plain)
2006-03-24 15:56 UTC, Chris Adams
log from 64 bit kernel (47.41 KB, text/plain)
2006-03-24 17:44 UTC, Chris Adams
dmesg output (20.73 KB, text/plain)
2006-03-25 22:03 UTC, Chris Adams

Description Chris Adams 2006-03-24 15:56:10 UTC
On a new FC5 system, I cannot boot into anything other than single user mode.
During a boot to runlevel 3 (without the "rhgb quiet" boot options), I get oopses
during or right after network setup.

This is an Abit AV8 socket 939 mboard.  I am using the built-in gigabit
(via_velocity, listed as eth2) as well as a Compaq dual 10/100 PCI card (e100,
creates eth0 and eth1).  I have only configured eth0 and eth2 at this point.

When I boot in single user mode, I can ifup eth0 and ifup eth2, but at that
point, pretty much anything causes an oops.  I captured one with the full boot
messages (will attach); it oopsed when I ran "ps".

I tried loading the updates/testing/5 kernel and got the same behavior.

Comment 1 Chris Adams 2006-03-24 15:56:11 UTC
Created attachment 126639
boot log and oops

Comment 2 Chris Adams 2006-03-24 17:44:28 UTC
Created attachment 126655
log from 64 bit kernel

Here's a boot log from a minimal 64 bit install.  The common thing I see is the
"last sysfs file: /class/net/eth2/address" (eth2 is the via_velocity
interface).

Comment 3 Dave Jones 2006-03-24 23:31:00 UTC
Can you give that box a going-over with memtest86?
The value it oopsed on is kind of strange, and could be a random bit flip that
would force us to bypass a null pointer check.


Comment 4 Chris Adams 2006-03-24 23:39:11 UTC
I'll give it a try after I post this.  However, this box has been running FC3
(32 bit) without problems for a while (I did the FC5 installs to alternate partitions).  I
also figured that if it was a RAM thing, I wouldn't have oopses at essentially
the same stage of boot on both 32 bit and 64 bit installs (since RAM use would
be significantly different).


Comment 5 Jay Cliburn 2006-03-25 02:43:13 UTC
I have the same motherboard, and hence the same onboard nic, and it seems to
work fine under my FC5 installation.

Comment 6 Chris Adams 2006-03-25 13:16:22 UTC
memtest86+ ran for 13 hours with no errors.

I'll try pulling the e100 NIC and see what happens.


Comment 7 Chris Adams 2006-03-25 22:03:58 UTC
Created attachment 126741
dmesg output

Okay, I pulled the e100 NIC.  The only card plugged in is the video (OEM ATI
Radeon 9200 AGP).  I moved the ifcfg-eth[01] out of the way and renamed
ifcfg-eth2 to eth0 (and modified the file), and modified modprobe.conf to only
reference eth0 (as via_velocity).

It still crashed during boot.  To get a log, I booted in single user mode and
started running S* scripts from rc3.d manually.  After starting S10network, I
got a line in dmesg output about sed segfaulting.  I continued to S12syslog and
got a general protection fault.
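
The step-by-step approach described above (start one rc3.d script, check
dmesg, start the next) could be sketched roughly as follows.  This is only an
illustration, not what was actually run here: the helper names
start_scripts and run_one_at_a_time are hypothetical, /etc/rc3.d is assumed
as the runlevel 3 directory, and plain lexical sorting is assumed to match the
order in which init would start the S* scripts.

```python
import os
import subprocess


def start_scripts(rc_dir):
    """Return the S* entries of rc_dir in lexical order (assumed start order)."""
    return sorted(name for name in os.listdir(rc_dir) if name.startswith("S"))


def run_one_at_a_time(rc_dir="/etc/rc3.d"):
    """Start each service by hand, pausing so dmesg can be checked in between."""
    for name in start_scripts(rc_dir):
        path = os.path.join(rc_dir, name)
        print("starting", name)
        # Init scripts take "start" as their first argument.
        subprocess.call([path, "start"])
        input("check dmesg, then press Enter to continue...")
```

Calling run_one_at_a_time() as root from single user mode would then stop
after every script, so the first one followed by a segfault or GPF in dmesg
is easy to spot.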

The system was still running at that point (I was able to copy off the dmesg
output), so I went on with no further problems until I started bluetooth.  At
that point, the kernel panicked (and it scrolled off the screen).

I'm attaching the log up through the syslog GPF.  If it is really needed, I can
try to get the later panic, but it'll take some time (the system has a serial
port, but nothing else here has one, so I'll have to find a USB adapter for
another system).  Let me know if you need it or if you can make some sense of
the attached dmesg output.

If there's anything else I can try, let me know.

Comment 8 Chris Adams 2006-03-27 00:12:58 UTC
Okay, I found the culprit.  The network bit was a red herring; the real problem
was something that started earlier: cpuspeed.  When I disable the "Cool 'n
Quiet" option in my BIOS, the system boots with no problems.

Comment 9 Chris Adams 2006-03-27 02:17:52 UTC
I exchanged some more email with Jay Cliburn and compared systems.  We are
running the same BIOS version (and I also tried a new version just released last
week).  He's got a 3000+ CPU while I've got a 3200+.  Cool 'n Quiet works for
him (and cpuspeed works under Linux), while mine crashes.


Comment 10 Chris Adams 2006-03-28 01:42:25 UTC
I guess chalk this up to "user error" (although I'll blame the manufacturer).

When I built the system, I installed my 2 sticks of RAM in slots 3 and 4.  When
I move them to slots 1 and 2, the system appears to work.  I can't fully switch
to FC5 just yet (probably this weekend), but on FC3 I went ahead and loaded
powernow-k8 and started cpuspeed, and it is working there as well.
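
For anyone wanting to confirm that frequency scaling is actually active after
loading powernow-k8, a minimal sketch along these lines may help.  It assumes
the cpufreq attributes are exposed under
/sys/devices/system/cpu/cpu0/cpufreq, with attribute names as found on current
kernels (older kernels may lack some of them); the function name read_cpufreq
is hypothetical.

```python
import os

# Assumed sysfs location of the cpufreq attributes for the first CPU.
CPUFREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"


def read_cpufreq(cpufreq_dir=CPUFREQ_DIR):
    """Return a few cpufreq attributes as a dict, or None if the directory is absent."""
    if not os.path.isdir(cpufreq_dir):
        # No cpufreq directory: no scaling driver appears to be loaded.
        return None
    info = {}
    for attr in ("scaling_driver", "scaling_governor", "scaling_cur_freq"):
        try:
            with open(os.path.join(cpufreq_dir, attr)) as f:
                info[attr] = f.read().strip()
        except OSError:
            # Attribute missing or unreadable on this kernel.
            info[attr] = None
    return info
```

A return value of None suggests no scaling driver is loaded; otherwise the
dict shows which driver and governor are in use and the current frequency in
kHz.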

I blame Abit because they put a sticker over the RAM slot labels on the
motherboard.  I can see DIM on each slot but no numbers.

Bugzilla needs a PEBKAM resolution state.


