Bug 1185518

Summary: oops ipw2100_down on every poweroff/reboot
Product: [Fedora] Fedora
Reporter: Chris Murphy <bugzilla>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: dcbw, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: i686
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-02-11 21:24:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                  Flags
photo1.jpg                   none
photo2.jpg                   none
no debug dmesg.txt           none
debug_dmesg.txt              none
lspci -vvnn                  none
lsmod                        none
dmesg oops rmmod ipw2100     none

Description Chris Murphy 2015-01-24 05:38:18 UTC
Description of problem: System panics on every reboot or poweroff. This is a regression in 3.19rc5; it doesn't happen in earlier 3.19 rcs or in 3.18.3.


Version-Release number of selected component (if applicable):
3.19rc5


How reproducible:
Always

Steps to Reproduce:
1.  restart or power off

Actual results:

oops


Expected results:

no oops

Additional info:

Comment 1 Chris Murphy 2015-01-24 05:39:00 UTC
Created attachment 983631 [details]
photo1.jpg

Comment 2 Chris Murphy 2015-01-24 05:39:14 UTC
Created attachment 983632 [details]
photo2.jpg

Comment 3 Chris Murphy 2015-01-24 20:56:30 UTC
Created attachment 983787 [details]
no debug dmesg.txt

Full dmesg attached. This is a snippet.

[  248.200777] BUG: unable to handle kernel paging request at 2faeb800
[  248.200876] IP: [<c046bce2>] cancel_delayed_work+0x72/0xb0
[  248.200954] *pde = 00000000 
[  248.200994] Oops: 0000 [#1] SMP 
[  248.201043] Modules linked in: ##snipped
[  248.201128] CPU: 0 PID: 1301 Comm: reboot Not tainted 3.19.0-0.rc5.git0.1.fc22.i686 #1
[  248.201128] Hardware name: Dell Computer Corporation Latitude D600                   /0G5152, BIOS A16 06/29/2005
[  248.201128] task: f23ed500 ti: e36d0000 task.ti: e36d0000
[  248.201128] EIP: 0060:[<c046bce2>] EFLAGS: 00010046 CPU: 0
[  248.201128] EIP is at cancel_delayed_work+0x72/0xb0
[  248.201128] EAX: 017d75c7 EBX: f4ed1504 ECX: 00000000 EDX: 2faeb800
[  248.201128] ESI: 00000000 EDI: 00000000 EBP: e36d1df0 ESP: e36d1ddc
[  248.201128]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  248.201128] CR0: 8005003b CR2: 2faeb800 CR3: 2319e000 CR4: 000007d0
[  248.201128] Stack:
[  248.201128]  00000282 13de818f f4ed1154 f4dc9000 00000000 e36d1e24 f81e3782 c0f38f20
[  248.201128]  e36d1e10 c08d4620 00000001 00000000 00000000 00000000 13de818f f4dc9000
[  248.201128]  f4dc9000 f81ef140 e36d1e30 f81e3806 f4dc9064 e36d1e44 c0724c09 f4dc9070
[  248.201128] Call Trace:
[  248.201128]  [<f81e3782>] ipw2100_down+0x192/0x200 [ipw2100]

Comment 4 Chris Murphy 2015-01-24 21:07:01 UTC
Created attachment 983799 [details]
debug_dmesg.txt

3.19.0-0.rc5.git2.1.fc22.i686

There's a lot more here, but I don't know if it's all related.

Comment 5 Stanislaw Gruszka 2015-01-27 14:54:48 UTC
Could you bisect between 3.18 and 3.19-rc5 (https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html)? If not, I'll provide you with a kernel containing a debug patch that prints more information.

Comment 6 Chris Murphy 2015-01-27 20:15:31 UTC
Interesting. This isn't reproducible with a cleanly installed Fedora 21 Server (completely up to date) and these kernels:
3.19.0-0.rc4.git4.1.fc22.i686
3.19.0-0.rc5.git0.1.fc22.i686
3.19.0-0.rc5.git1.1.fc22.i686
3.19.0-0.rc5.git2.1.fc22.i686

It must be something in Fedora 22 Server that's triggering the problem, maybe newer included wireless firmware, or systemd, or NetworkManager. I'll do a clean Fedora 22 Server installation and see if this reproduces and report back.

Comment 7 Chris Murphy 2015-01-27 20:41:13 UTC
The boot.iso in koji for 20150124 has a gtk-anaconda bug that prevents me from installing. However, that boot.iso has 3.19.0-0.rc5.git2.1.fc22.i686 and when I reboot from the USB stick I do not get an oops. So somehow the fedup upgrade got this into a rather particular state.

Comment 8 Stanislaw Gruszka 2015-01-29 15:34:57 UTC
If you remove the device using sysfs, as in the example below (this is for device 03:00.0; check lspci for the correct device address):

echo 1 > /sys/bus/pci/devices/0000\:03\:00.0/remove

Does the panic still occur on reboot?

Comment 9 Chris Murphy 2015-02-08 20:21:11 UTC
(In reply to Stanislaw Gruszka from comment #8)
Due to installer and fedup issues, I wasn't able to test the system in the state in which the problem occurs.

With a clean installation using Fedora-Live-LXDE-i686-rawhide-20150207.iso, which has kernel-3.19.0-0.rc7.git2.1.fc22, the problem isn't reproducible. I don't know if the regression got fixed in rc7, or if the fedup 21>Rawhide upgrade put the system in some bad state.

Comment 10 Chris Murphy 2015-05-09 04:02:48 UTC
(In reply to Stanislaw Gruszka from comment #8)
> If you remove device using sysfs, like in the below example (this is for
> 03:00.0 device, check lspci for proper device number):
> 
> echo 1 > /sys/bus/pci/devices/0000\:03\:00.0/remove

When I do that, I get an immediate kernel panic.

 
> Does the panic still occur on reboot?

No opportunity to reboot, already panicked.



dmesg during boot shows this:
[   13.712253] ipw2100: Detected Intel PRO/Wireless 2100 Network Connection
[   14.049121] ipw2100 0000:02:03.0: Direct firmware load for ipw2100-1.3.fw failed with error -2
[   14.049135] ipw2100: eth%d: Firmware 'ipw2100-1.3.fw' not available or load failed.
[   14.049137] ipw2100: eth%d: ipw2100_get_firmware failed: -2
[   14.049140] ipw2100: eth%d: Failed to power on the adapter.
[   14.049142] ipw2100: eth%d: Failed to start the firmware.
[   14.049581] ipw2100 0000:02:03.0: Driver probe function unexpectedly returned 1



This is a new installation today of Fedora 22 Server i386 (final TC3).
Linux version 4.0.1-300.fc22.i686 (mockbuild.fedoraproject.org) (gcc version 5.1.1 20150422 (Red Hat 5.1.1-1) (GCC) ) #1 SMP Wed Apr 29 16:21:41 UTC 2015

Comment 11 Chris Murphy 2015-05-09 04:05:35 UTC
Created attachment 1023679 [details]
lspci -vvnn

The command used in comment 10 to get the panic was
[root@f22s ~]# echo 1 > /sys/bus/pci/devices/0000\:02\:03.0/remove

Comment 12 Chris Murphy 2015-05-09 04:20:26 UTC
Created attachment 1023680 [details]
lsmod

Comment 13 Chris Murphy 2015-05-09 04:22:28 UTC
Created attachment 1023681 [details]
dmesg oops rmmod ipw2100

[root@f22s ~]# rmmod ipw2100

And I get a different panic.

Comment 14 Chris Murphy 2015-05-09 07:53:01 UTC
Booting with the parameter modprobe.blacklist=ipw2100 is a workaround; the problem doesn't happen. Of course, there's no wireless then either.

Comment 15 Dan Williams 2017-07-12 16:50:40 UTC
Honestly don't know why I'm doing this, but here goes...

When firmware load fails, the error path frees the created netdev in ipw2100_pci_init_one() via free_libipw().  Then when ipw2100_pci_remove_one() gets run, it calls unregister_netdev() on that pointer, which obviously is not going to work well.

This is all because ipw2100_up() returns positive values on error.  It gets called from ipw2100_pci_init_one() and that positive error code gets returned to the PCI bus functions.  But those expect negative numbers to indicate errors, and you get the warning "Driver probe function unexpectedly returned 1" because of this.

And since the PCI init didn't return a recognized error value, the PCI remove function still gets called, which then causes the panic.
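
The failure mode described above can be modeled in user space. The following is only a sketch, not driver code: every name in it (fake_dev, fake_up, buggy_probe, fixed_probe, model_bus_probe, model_bus_remove) is hypothetical, the bus model only imitates the "negative means failure, positive is warned about but treated as bound" behavior, and mapping the positive value to -EIO is just one possible fix, not necessarily what the linked patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct fake_dev {
    int *netdev; /* stands in for the allocated net device */
    int freed;   /* set once the probe error path has released netdev */
    int bound;   /* set when the model bus core treats probe as successful */
};

typedef int (*probe_fn)(struct fake_dev *, int);

/* Models a helper that signals failure with a positive value. */
int fake_up(int firmware_present)
{
    return firmware_present ? 0 : 1;
}

/* Probe that frees its state on error but leaks the positive code upward. */
int buggy_probe(struct fake_dev *d, int firmware_present)
{
    int err;

    assert(d != NULL);
    d->netdev = malloc(sizeof(*d->netdev));
    if (d->netdev == NULL)
        return -ENOMEM;
    d->freed = 0;

    err = fake_up(firmware_present);
    if (err) {
        free(d->netdev);
        d->netdev = NULL;
        d->freed = 1;
        return err; /* positive: the bus model will not see this as failure */
    }
    return 0;
}

/* Same probe, but a positive failure is mapped to a negative errno. */
int fixed_probe(struct fake_dev *d, int firmware_present)
{
    int err = buggy_probe(d, firmware_present);

    return err > 0 ? -EIO : err;
}

/* Model bus core: only negative values count as probe failure. */
int model_bus_probe(probe_fn probe, struct fake_dev *d, int firmware_present)
{
    int rc = probe(d, firmware_present);

    if (rc < 0) {
        d->bound = 0;
        return rc;
    }
    if (rc > 0)
        fprintf(stderr, "probe unexpectedly returned %d\n", rc);
    d->bound = 1;
    return 0;
}

/* Returns 1 if remove ran against state the probe error path already freed. */
int model_bus_remove(struct fake_dev *d)
{
    if (!d->bound)
        return 0; /* never bound, so remove is not invoked */
    d->bound = 0;
    if (d->freed)
        return 1; /* a real driver would touch freed memory here */
    free(d->netdev);
    d->netdev = NULL;
    return 0;
}
```

With firmware missing, calling model_bus_probe() with buggy_probe leaves the device marked as bound, so a later model_bus_remove() reports that it ran against freed state. With fixed_probe the model bus sees a negative value, never binds the device, and remove becomes a no-op.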

Patch: http://marc.info/?l=linux-wireless&m=149987805219861&w=2

----

Also, do you have the ipw2100-firmware package installed?  Is /lib/firmware/ipw2100-1.3.fw present?

Comment 16 Chris Murphy 2017-07-12 17:02:06 UTC
No idea, I haven't had this hardware for quite some time.

Comment 17 Dan Williams 2017-07-12 17:04:40 UTC
Also I'd note this bug isn't i686 specific at all, it just happens that almost everyone with an ipw2100 is running older i686 stuff due to the age of the 2100 and the fact that it's miniPCI only.