On a new FC5 system, I cannot boot into anything other than single user mode. During a boot to runlevel 3 (with "rhgb quiet" removed from the kernel command line), I get oopses during or right after network setup. This is an Abit AV8 socket 939 motherboard. I am using the built-in gigabit NIC (via_velocity, listed as eth2) as well as a Compaq dual 10/100 PCI card (e100, creates eth0 and eth1). I have only configured eth0 and eth2 at this point. When I boot in single user mode, I can ifup eth0 and ifup eth2, but after that, pretty much anything causes an oops. I captured one with the full boot messages (will attach); it oopsed when I ran "ps". I tried loading the updates/testing/5 kernel and got the same behavior.
Created attachment 126639 [details] boot log and oops
Created attachment 126655 [details] log from 64 bit kernel
Here's a boot log from a minimal 64 bit install. The common thing I see is the "last sysfs file: /class/net/eth2/address" (eth2 is the via_velocity interface).
Can you give that box a going over with memtest86? The value it oopsed on is kinda strange, and could be a random bit flip that would force us to bypass a null ptr check.
I'll give it a try after I post this. However, this box has been running FC3 (32 bit) fine for a while (I did the FC5 installs to alternate partitions). I also figured that if it was a RAM thing, I wouldn't get oopses at essentially the same stage of boot on both the 32 bit and 64 bit installs (since RAM use would be significantly different).
I have the same motherboard, and hence the same onboard nic, and it seems to work fine under my FC5 installation.
memtest86+ ran for 13 hours with no errors. I'll try pulling the e100 NIC and see what happens.
Created attachment 126741 [details] dmesg output
Okay, I pulled the e100 NIC. The only card plugged in is the video (OEM ATI Radeon 9200 AGP). I moved the ifcfg-eth[01] files out of the way, renamed ifcfg-eth2 to ifcfg-eth0 (and modified the file), and modified modprobe.conf to only reference eth0 (as via_velocity). It still crashed during boot. To get a log, I booted in single user mode and started running the S* scripts from rc3.d manually. After starting S10network, I got a line in the dmesg output about sed segfaulting. I continued to S12syslog and got a general protection fault. The system was still running at that point (I was able to copy off the dmesg output), so I went on with no further problems until I started bluetooth. At that point, the kernel panicked (and the output scrolled off the screen). I'm attaching the log up through the syslog GPF. If it is really needed, I can try to get the later panic, but it'll take some time (the system has a serial port but nothing else here has one, so I'll have to find a USB adapter for another system). Let me know if you need it or if you can make some sense of the attached dmesg output. If there's anything else I can try, let me know.
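For anyone who wants to repeat the manual walk through rc3.d described above, here is a minimal sketch in Python 3 (an assumption; it is not what was actually run on this box). It starts each S* script in name order, which is roughly the order init uses, and saves the dmesg output after every step so the last log written points at the script that triggered the fault. The function names, the ".dmesg" file suffix, and the log directory are all made up for illustration.

```python
import os
import re
import subprocess

# Start scripts in an rcN.d directory are named S<two digits><service>.
START_SCRIPT = re.compile(r"^S\d\d")


def start_scripts(rc_dir):
    """Return the S* entries of rc_dir sorted by name (K* and others are skipped)."""
    return sorted(n for n in os.listdir(rc_dir) if START_SCRIPT.match(n))


def save_dmesg(path):
    """Write the current kernel ring buffer to path and force it to disk."""
    with open(path, "w") as out:
        subprocess.call(["dmesg"], stdout=out)
        out.flush()
        os.fsync(out.fileno())  # the box may panic on the next step


def run_one_at_a_time(rc_dir, log_dir, runner=subprocess.call, snapshot=save_dmesg):
    """Start each script in turn and snapshot dmesg after every one.

    runner and snapshot are parameters only so the loop can be tried
    out without touching a real system.
    """
    for name in start_scripts(rc_dir):
        print("starting", name)
        status = runner([os.path.join(rc_dir, name), "start"])
        snapshot(os.path.join(log_dir, name + ".dmesg"))
        print(name, "exited with status", status)
```

It would be called as root from single user mode, for example run_one_at_a_time("/etc/rc3.d", "/root/rc3-logs"), with the (hypothetical) log directory created beforehand.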
Okay, I found the culprit. The network bit was a red herring; the real problem was something that started earlier: cpuspeed. When I disable the "Cool 'n Quiet" option in my BIOS, the system boots with no problems.
I exchanged some more email with Jay Cliburn and compared systems. We are running the same BIOS version (and I also tried a new version just released last week). He's got a 3000+ CPU while I've got a 3200+. Cool 'n Quiet works for him (and cpuspeed works under Linux), while mine crashes.
I guess chalk this up to "user error" (although I'll blame the manufacturer). When I built the system, I installed my 2 sticks of RAM in slots 3 and 4. When I move them to slots 1 and 2, the system appears to work. I can't fully switch to FC5 just yet (probably this weekend), but on FC3 I went ahead and loaded powernow-k8 and started cpuspeed, and it is working there as well. I blame Abit because they put a sticker over the RAM slot labels on the motherboard. I can see DIM on each slot but no numbers. Bugzilla needs a PEBKAM resolution state.