Bug 179714 - kernel 2.6.15-1.1884 Oops on Dell Inspiron 8600.
Summary: kernel 2.6.15-1.1884 Oops on Dell Inspiron 8600.
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 183064
Depends On:
Blocks: 181761
Reported: 2006-02-02 11:05 UTC by Steven Haigh
Modified: 2007-11-30 22:11 UTC
CC List: 5 users

Fixed In Version: kernel-2.6.15-1.2032_FC5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-09 19:11:11 UTC
Type: ---
Embargoed:


Attachments
First crash screen with 120 second pause (184.58 KB, image/jpeg)
2006-02-04 02:59 UTC, Steven Haigh
Final death screen. Nothing happens after this point. (176.47 KB, image/jpeg)
2006-02-04 03:00 UTC, Steven Haigh
Initial kernel oops. (116.69 KB, image/jpeg)
2006-02-08 14:12 UTC, Steven Haigh
After the 120 second pause, system stops here (after 3-4 minutes) (104.99 KB, image/jpeg)
2006-02-08 14:13 UTC, Steven Haigh
FC5 Test 3 on Dell Latitude D800 (565.98 KB, image/jpeg)
2006-02-22 01:50 UTC, Ioannis
Latest kernel oops on kernel-2.6.15-1.2009.4.2_FC5. (149.68 KB, image/jpeg)
2006-03-06 19:00 UTC, Steven Haigh

Description Steven Haigh 2006-02-02 11:05:16 UTC
When booting off kernel-2.6.15-1.1884_FC5 the system sets the clock, tries to
start udev, then dies.

The crash output states:
Unable to handle kernel paging request at virtual address f8000022 printing eip:
*pde = 00000000
Oops: 0000 [#1]
last sysfs file: /devices/pci0000:00/0000:00:1f.1/ide0/0.0/modalias
-- snipped a heap of modules to do with sound, network and wifi --
CPU: 0
EIP: 0060:[<f8000022>] Not tainted VLI
EFLAGS: 00010282 (2.6.15-1.1884_FC5)
EIP is at 0xf8000022
eax: 00000000 ebx: 00000020 ecx: 0000000 edx: f88875fa
esi: 0000142d edi: 12000021 ebp: 00001432 esp: f74ebd30
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 688, threadinfo=f75b1000 task=f750a000)
Stack: <0>00001437 12000023 0000143c 00000024 00001441 F7000025 00001446 00000026
0000144b 00000027 00001450 00000028 00001455 f7000029 0000145a f700002a
0000145f ff00002b 00001464 0000002c 00001469 0000002d 0000146e 0000002e
Call Trace:
Code:  00 00 00 00 00 00 00 00 Bad EIP value.

The system then attempts to boot after 120 seconds, but doesn't get very far
before falling in a heap.

I've booted successfully into a previous kernel version (2.6.14-1.1656_FC4)
however I can't find any information that seems to be dumped to any files to get
more details. Are there any other details you require, and how can I gather them?

Comment 1 Steven Haigh 2006-02-03 11:21:20 UTC
This error also occurs on the latest 2.6.15-1.1895_FC5 kernel for i686.

Comment 2 Dave Jones 2006-02-03 20:27:43 UTC
Was there no text between the Call Trace: and Code: lines?
(If there was, and you don't want to transcribe it all, a digital camera photo
of the screen is adequate if you have one, though boot with vga=791 or vga=1
to get more lines of text onscreen.)


Comment 3 Steven Haigh 2006-02-04 02:59:18 UTC
Created attachment 124146 [details]
First crash screen with 120 second pause

Comment 4 Steven Haigh 2006-02-04 03:00:15 UTC
Created attachment 124147 [details]
Final death screen. Nothing happens after this point.

Comment 5 Steven Haigh 2006-02-04 03:03:59 UTC
The two pics above show the screen at the time of the crash. I've
updated today to kernel 2.6.15-1.1898_FC5. The crash has changed from the
original one (which didn't have anything between the call trace and code lines);
however, the machine is still unbootable with this kernel.

Comment 6 Steven Haigh 2006-02-05 07:52:31 UTC
This also happens with kernel-2.6.15-1.1907_FC5

Comment 7 Steven Haigh 2006-02-06 14:59:44 UTC
Just updated & confirmed same issue in kernel-2.6.15-1.1909_FC5

Comment 8 Steven Haigh 2006-02-08 14:12:04 UTC
Created attachment 124381 [details]
Initial kernel oops.

Comment 9 Steven Haigh 2006-02-08 14:13:11 UTC
Created attachment 124382 [details]
After the 120 second pause, system stops here (after 3-4 minutes)

Comment 10 Steven Haigh 2006-02-08 14:15:36 UTC
Today's update to kernel-2.6.15-1.1914_FC5 saw a change in the Oops message.
I've attached updated images (attachments 124381 and 124382) to reflect the
changes with this build of the kernel.

Comment 11 Steven Haigh 2006-02-10 05:40:42 UTC
Updated today to current rawhide kernel 2.6.15-1.1917_FC5. Booting with vga=791
no longer resets the font to the one shown in the pics, so a transcript of the
crash follows:

Unable to handle kernel paging request at virtual address 00001621
 printing eip:
*pde = 3f6e0067
Oops: 0000 [#1]
last sysfs file: /devices/pci0000:00/0000:00:1d.2/modalias
Modules linked in: bcm43xx ieee80211softmac ieee80211 joydev ieee80211_crypt b44
mii snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer snd soundcore snd_page_alloc ext3 jbd
CPU: 0
EIP: 0060:[<00001621>] Not tainted VLI
EFLAGS: 00010282 (2.6.15-1.1917_FC5)
EIP is at 0x1621
eax: 00000000 ebx: 00001617 ecx: 00000000 edx: 00000000
esi: c18f2c83 edi: 0000161c ebp: 12060184 esp: c1951d68
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 657, threadinfo=c1951000 task=f7f6c550)
Stack: <0>00001085 00001626 c1947986 0000162b 00000087 00001630 c18f2d88 00001635
00000289 0000163a 0000008a 0000163f 0000008b 00001644 c18f208c 00001649
c18f208d 0000164e 0000008e 00001653 0000008f 00001658 c18f2c90 0000165d
Call Trace:
 [<c01cca95>] pci_device_probe+0x34/0x57    [<c0228097>] device_attach+0x10/0x65
 [<c0228199>] device_release_driver+0x22/0x38   [<c0227b9b>] bus_unregister+0xf/0x45
 [<c0227f9f>] device_bind_driver+0x0/0x58    [<c02278a0>] bus_add_driver+0xae/0xfd
 [<c0130da7>] sys_init_module+0x1342/0x152d    [<c0129abf>]
wake_bit_function+0x7/0x3c
 [<c0153bc7>] do_sync_write+0xdb/0xf3    [<c0154565>] vfs_read+0x9f/0x13e
 [<c01548cc>] sys_read+0x3c/0x63    [<c0102ba9>] syscall_call+0x7/0xb
Code: Bad EIP value.

Hope this helps a bit more now :)
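
For anyone comparing this transcript against later ones, the call trace can be
split into frames mechanically. The following is only a sketch: the regular
expression and the function name parse_call_trace are assumptions made for
illustration, and the sample text is a subset of the trace above.

```python
import re

# One frame looks like: [<c01cca95>] pci_device_probe+0x34/0x57
# i.e. return address, symbol, offset into the symbol, and symbol size.
FRAME = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_call_trace(text):
    """Return (address, symbol, offset, size) tuples for each frame in text."""
    return [
        (int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME.findall(text)
    ]


# A few lines copied from the transcript, including the frame that was
# wrapped onto the next line when it was typed in.
trace = """\
 [<c01cca95>] pci_device_probe+0x34/0x57    [<c0228097>] device_attach+0x10/0x65
 [<c0130da7>] sys_init_module+0x1342/0x152d    [<c0129abf>]
wake_bit_function+0x7/0x3c
"""

frames = parse_call_trace(trace)
for addr, sym, off, size in frames:
    # addr - off is where the symbol itself starts.
    print(f"{addr:08x}  {sym} (starts at {addr - off:08x}, {size} bytes)")
```

Because the pattern allows any whitespace between the address and the symbol,
the frame that was split across two lines is still picked up.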

Comment 12 Steven Haigh 2006-02-15 12:34:52 UTC
Same crash for kernel-2.6.15-1.1939_FC5.

Comment 13 Dave Jones 2006-02-19 03:21:02 UTC
Can you attach the output of lspci, please? I'm curious what is at 0000:00:1d.2.

Comment 14 Steven Haigh 2006-02-19 04:01:04 UTC
# lspci
00:00.0 Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev 03)
00:01.0 PCI bridge: Intel Corporation 82855PM Processor to AGP Controller (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI
Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge
(rev 01)
00:1f.1 IDE interface: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
00:1f.6 Modem: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97
Modem Controller (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 [Mobility Radeon
9600 M10]
02:00.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)
02:01.0 CardBus bridge: Texas Instruments PCI4510 PC card Cardbus Controller
(rev 02)
02:01.1 FireWire (IEEE 1394): Texas Instruments PCI4510 IEEE-1394 Controller
02:03.0 Network controller: Broadcom Corporation BCM4309 802.11a/b/g (rev 02)
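
The lookup asked for in comment 13 (which device sits behind the "last sysfs
file" path in the oops) can be sketched as below. This is a rough illustration
rather than an established tool: the helper names pci_address_from_sysfs and
lookup_device are made up here, and the sample listing is a few entries from
the output above with the wrapped lines joined.

```python
import re


def pci_address_from_sysfs(path):
    """Return the last bus:device.function address in a sysfs path, or None."""
    # Matches e.g. 0000:00:1d.2 and keeps the part lspci prints (00:1d.2).
    matches = re.findall(r"[0-9a-f]{4}:([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])", path)
    return matches[-1] if matches else None


def lookup_device(lspci_text, short_addr):
    """Return the lspci description for short_addr, or None if not listed."""
    for line in lspci_text.splitlines():
        if line.startswith(short_addr + " "):
            return line[len(short_addr):].strip()
    return None


# Path taken from the oops transcribed in comment 11.
sysfs_path = "/devices/pci0000:00/0000:00:1d.2/modalias"

# Entries copied from the lspci output above, one device per line.
lspci_sample = """\
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
02:03.0 Network controller: Broadcom Corporation BCM4309 802.11a/b/g (rev 02)
"""

addr = pci_address_from_sysfs(sysfs_path)
print(addr, "->", lookup_device(lspci_sample, addr))
```

Note that the file named in the oops is only the last sysfs file that was
read before the crash, so the device it points at is not necessarily the one
whose driver crashed.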


Comment 15 Ioannis 2006-02-22 01:50:52 UTC
Created attachment 124998 [details]
FC5 Test 3 on Dell Latitude D800

Comment 16 Ioannis 2006-02-22 01:53:51 UTC
Comment on attachment 124998 [details]
FC5 Test 3 on Dell Latitude D800

first error message

Comment 17 Ioannis 2006-02-22 01:57:42 UTC
As the new attachment shows, this error appears in Fedora Core 5 Test 3 on a Dell
Latitude D800 laptop as well. I've tested FC5 Test2 on two identical D800
laptops and came up with the same results.

Comment 18 Richard Körber 2006-02-22 08:36:22 UTC
I can confirm this bug on a Dell Inspiron 510m and FC5test3.

Comment 19 Steven Haigh 2006-02-24 04:42:21 UTC
I can confirm this still happens with kernel-2.6.15-1.1975_FC5

The latest bootable kernel I have found is kernel-2.6.15-1.1824_FC4

Comment 20 Ioannis 2006-02-24 23:18:02 UTC
kernel-2.6.15-1.1824_FC4 (i686) is bootable on my D800 as well under FC5 test3.
Thanks for the tip.

Comment 21 Steven Haigh 2006-02-27 11:10:30 UTC
I believe this may have something to do with the 802.11a/b/g wireless mini PCI
card. I have ndiswrapper installed, and I'm wondering whether ndiswrapper
and the experimental Broadcom wifi drivers could together cause this issue (if in
fact the drivers are present!). I might be way off base here, as I'm only thinking out loud.

Do any of the other people having this issue have the same 802.11a/b/g card?

Comment 22 Stephen John Smoogen 2006-02-28 19:28:44 UTC
The problem seems to be with the broadcom minipci card and the current unstable
driver. Not sure if the problem is with the miniPCI controller unit or the card
itself. I think the driver needs a refresh with the upstream?

Comment 23 Ioannis 2006-03-01 00:11:30 UTC
I was leaning toward the same conclusion. I don't have any hard evidence though; it
is more like a hunch. My Latitude D800 has the exact same wifi card as Steven's
laptop (Inspiron 8600). This is my (D800) PCI setup:

00:00.0 Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev 03)
00:01.0 PCI bridge: Intel Corporation 82855PM Processor to AGP Controller (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI
Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge
(rev 01)
00:1f.1 IDE interface: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
00:1f.6 Modem: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97
Modem Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV28 [GeForce4 Ti 4200 Go
AGP 8x] (rev a1)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705M Gigabit
Ethernet (rev 01)
02:01.0 CardBus bridge: Texas Instruments PCI7510 PC card Cardbus Controller
(rev 01)
02:01.1 CardBus bridge: Texas Instruments PCI7510,7610 PC card Cardbus
Controller (rev 01)
02:01.2 FireWire (IEEE 1394): Texas Instruments PCI7410,7510,7610 OHCI-Lynx
Controller
02:01.3 System peripheral: Texas Instruments PCI7410,7510,7610 PCI Firmware
Loading Function
02:03.0 Network controller: Broadcom Corporation BCM4309 802.11a/b/g (rev 02)

Bear in mind that the vanilla Linux kernel 2.6.15-4 does boot on my D800 (though
it produces some error messages related to dbus). I also installed
ndiswrapper, and the Broadcom card seems to be working fine (though I
have some issues with non-broadcast SSIDs, and wpa_supplicant doesn't seem to
work well, but that's another story).




Comment 24 Steven Haigh 2006-03-02 14:17:21 UTC
OK, this is definitely the bcm43xx kernel module. If you boot a different
kernel version, rename bcm43xx.ko to bcm43xx.ko.disabled, and reboot into
the latest installed kernel, all works perfectly.

Rename the file back, and the oops occurs every single time.
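
The rename can be scripted so that it is easy to repeat after each kernel
update. The sketch below is an illustration only: set_module_enabled is a
made-up helper, and the directory layout in the demo is an assumption (the
real module lives somewhere under /lib/modules/<kernel version>/, and the
exact subdirectory is not given in this report). The demo runs against a
temporary directory so it does not touch the system.

```python
import tempfile
from pathlib import Path


def set_module_enabled(modules_root, module_name, enabled):
    """Rename <name>.ko <-> <name>.ko.disabled under modules_root.

    Returns the list of paths after renaming.
    """
    if enabled:
        src_suffix, dst_suffix = ".ko.disabled", ".ko"
    else:
        src_suffix, dst_suffix = ".ko", ".ko.disabled"
    renamed = []
    # sorted() collects the matches first so renaming does not disturb the walk.
    for src in sorted(Path(modules_root).rglob(module_name + src_suffix)):
        dst = src.with_name(module_name + dst_suffix)
        src.rename(dst)
        renamed.append(dst)
    return renamed


# Demo in a scratch directory; the subdirectory names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    ko = Path(tmp) / "kernel" / "drivers" / "net" / "wireless" / "bcm43xx.ko"
    ko.parent.mkdir(parents=True)
    ko.touch()

    disabled_names = [p.name for p in set_module_enabled(tmp, "bcm43xx", False)]
    restored_names = [p.name for p in set_module_enabled(tmp, "bcm43xx", True)]

print(disabled_names)
print(restored_names)
```

On a real system this would be pointed at the module directory of the
affected kernel and run as root from a kernel that still boots (or from the
rescue system).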

I leave it now to the kernel hackers among us to decide what to do on this one,
as I'm out of ideas and don't have enough knowledge to fix it.

Comment 25 Richard Körber 2006-03-02 14:28:15 UTC
I have renamed the file using the rescue system, and now I can boot the Dell
Inspiron 510m with FC5test3. Thanks!

Comment 26 Ioannis 2006-03-02 18:43:14 UTC
Indeed! That works for me as well (Latitude D800 and FC5 Test3 with
2.6.15-1.1996 kernel).

Thanks for the tip!

Comment 27 Stephen John Smoogen 2006-03-03 23:39:50 UTC
*** Bug 183064 has been marked as a duplicate of this bug. ***

Comment 28 Stephen John Smoogen 2006-03-04 22:19:11 UTC
On several Dell systems you can get around this bug by turning off the MiniPCI
in the BIOS. 

Comment 29 Steven Haigh 2006-03-04 22:24:49 UTC
Although then you lose functionality of what's in the miniPCI slot - usually a
wifi card. I wouldn't call it a workaround - more a 'disabling the affected
hardware'

Comment 30 Stephen John Smoogen 2006-03-05 01:33:39 UTC
Well, the wifi card is the one that is crashing the system. It is as much a
work-around as removing the bcm43xx.ko module.

Comment 31 Steven Haigh 2006-03-05 01:41:16 UTC
The difference is that with only the module removed, you can still use
ndiswrapper or similar to use the functionality of the wifi card. Disabling the
hardware makes the card unusable.

Comment 32 Stephen John Smoogen 2006-03-05 02:40:55 UTC
That is correct, but if this problem comes up for someone who can't get a
rescue disk going for some reason, turning off the MiniPCI temporarily is a way
for them to get into the system and remove the offending module if they need to,
or just get the laptop to work.

Comment 33 Dave Jones 2006-03-06 16:45:50 UTC
Does this happen even if you haven't installed the firmware?

If so, we'll probably have to chop out the pci module table that makes it
autoload for FC5, and make people who want to test with it modprobe it by hand.
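
For background on why removing the PCI module table stops the autoload: each
device exports a modalias string in sysfs, and modprobe matches it against the
wildcard aliases that depmod generates from every module's device table. A
rough sketch of that matching follows. The function name is made up, and the
modalias and alias values are illustrative assumptions (in particular the
vendor, device and subsystem IDs), not values taken from this machine.

```python
from fnmatch import fnmatchcase


def modules_for_modalias(modalias, alias_lines):
    """Return module names whose alias pattern matches the device modalias."""
    hits = []
    for line in alias_lines:
        parts = line.split()
        # Expected shape: alias <wildcard pattern> <module name>
        if len(parts) == 3 and parts[0] == "alias" and fnmatchcase(modalias, parts[1]):
            hits.append(parts[2])
    return hits


# Hypothetical modalias for the wireless card (IDs are assumptions).
device_modalias = "pci:v000014E4d00004324sv00001028sd00000003bc02sc80i00"

# Hypothetical modules.alias entries while the driver still has its PCI table.
with_table = [
    "alias pci:v000014E4d00004324sv*sd*bc*sc*i* bcm43xx",
    "alias pci:v000014E4d00004401sv*sd*bc*sc*i* b44",
]

# With the table removed, the driver contributes no aliases at all.
without_table = [line for line in with_table if not line.endswith(" bcm43xx")]

print(modules_for_modalias(device_modalias, with_table))
print(modules_for_modalias(device_modalias, without_table))
```

With no matching alias, nothing loads the driver at boot, and anyone who wants
to test it has to run modprobe bcm43xx by hand, as described above.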


Comment 34 Steven Haigh 2006-03-06 16:53:32 UTC
I don't understand what you mean here Dave. I have never installed any firmware
for the wifi card. The best I have done is to get the card working with
ndiswrapper by copying across the 2 files from the windows driver.

Are you able to clarify what you mean?

Comment 35 John W. Linville 2006-03-06 18:13:10 UTC
What is the latest kernel version which you have tried?  There was a bcm43xx  
update that went into rawhide on Friday.  
 
Regarding comment 34, the bcm43xx driver requires you to use the fwcutter 
routine to retrieve the firmware from a working windows (or other) driver. 
 
   http://bcm43xx.berlios.de/ 

Comment 36 Steven Haigh 2006-03-06 18:56:00 UTC
The latest kernel I have tried is kernel-2.6.15-1.2009.4.2_FC5.

I haven't done anything with fwcutter, and it doesn't look like there is an RPM
with it in there - so it looks like I'll have to get it from source and see what
I can do.

Before finishing this post, I've now extracted the firmware from the bcmwl5.sys
driver I was using with ndiswrapper. I've placed the files in /lib/firmware;
however, the Oops on booting any new kernel still occurs.

Latest Oops photo to follow (too late to copy out the text at 6am!)


Comment 37 Steven Haigh 2006-03-06 19:00:28 UTC
Created attachment 125715 [details]
Latest kernel oops on kernel-2.6.15-1.2009.4.2_FC5.

Comment 38 Stephen John Smoogen 2006-03-07 14:51:17 UTC
Dave,

The problem occurs with a clean install and updates to the latest modules. I
never got to the point of getting firmware for the system. I was wondering which
version of the kernel module you were using.. one seemed to require the firmware
and one seemed not to (or uses it in a different way).

Comment 39 Steven Haigh 2006-03-09 18:17:00 UTC
I can confirm that the latest kernel update (kernel-2.6.15-1.2032_FC5) boots
correctly with the bcm43xx driver removed.

I suggest that support for this continues to be disabled until related tools
such as the firmware cutter are available as a package and worked into the boot
sequence somehow. At this stage, it's probably not a good idea to re-enable this
on the default kernels until the bcm43xx driver is fully stable and can be fully
operational (or at least not crash the system) on a new RPM install of the kernel.

Comment 40 John W. Linville 2006-03-09 19:11:11 UTC
I don't see a good reason to leave this hanging around until then.  I'm going 
to close this now under the (hopeful) presumption that when the bcm43xx 
driver comes back, it won't be causing these problems. 

