Description of problem:
System panics on every reboot or poweroff. This is a regression in 3.19rc5; it doesn't happen in prior 3.19 rcs or in 3.18.3.

Version-Release number of selected component (if applicable):
3.19rc5

How reproducible:
Always

Steps to Reproduce:
1. restart or power off

Actual results:
oops

Expected results:
no oops

Additional info:
Created attachment 983631 [details] photo1.jpg
Created attachment 983632 [details] photo2.jpg
Created attachment 983787 [details] no debug dmesg.txt
Full dmesg attached. This is a snippet.
[ 248.200777] BUG: unable to handle kernel paging request at 2faeb800
[ 248.200876] IP: [<c046bce2>] cancel_delayed_work+0x72/0xb0
[ 248.200954] *pde = 00000000
[ 248.200994] Oops: 0000 [#1] SMP
[ 248.201043] Modules linked in: ##snipped
[ 248.201128] CPU: 0 PID: 1301 Comm: reboot Not tainted 3.19.0-0.rc5.git0.1.fc22.i686 #1
[ 248.201128] Hardware name: Dell Computer Corporation Latitude D600 /0G5152, BIOS A16 06/29/2005
[ 248.201128] task: f23ed500 ti: e36d0000 task.ti: e36d0000
[ 248.201128] EIP: 0060:[<c046bce2>] EFLAGS: 00010046 CPU: 0
[ 248.201128] EIP is at cancel_delayed_work+0x72/0xb0
[ 248.201128] EAX: 017d75c7 EBX: f4ed1504 ECX: 00000000 EDX: 2faeb800
[ 248.201128] ESI: 00000000 EDI: 00000000 EBP: e36d1df0 ESP: e36d1ddc
[ 248.201128] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 248.201128] CR0: 8005003b CR2: 2faeb800 CR3: 2319e000 CR4: 000007d0
[ 248.201128] Stack:
[ 248.201128] 00000282 13de818f f4ed1154 f4dc9000 00000000 e36d1e24 f81e3782 c0f38f20
[ 248.201128] e36d1e10 c08d4620 00000001 00000000 00000000 00000000 13de818f f4dc9000
[ 248.201128] f4dc9000 f81ef140 e36d1e30 f81e3806 f4dc9064 e36d1e44 c0724c09 f4dc9070
[ 248.201128] Call Trace:
[ 248.201128] [<f81e3782>] ipw2100_down+0x192/0x200 [ipw2100]
Created attachment 983799 [details] debug_dmesg.txt 3.19.0-0.rc5.git2.1.fc22.i686 There's a lot more here, but I don't know if it's all related.
Could you bisect between 3.18 and 3.19-rc5 (https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html)? If not, I'll provide you with a kernel containing a debug patch that prints more information.
Interesting. This isn't reproducible with a cleanly installed Fedora 21 Server (completely up to date) and these kernels: 3.19.0-0.rc4.git4.1.fc22.i686 3.19.0-0.rc5.git0.1.fc22.i686 3.19.0-0.rc5.git1.1.fc22.i686 3.19.0-0.rc5.git2.1.fc22.i686 It must be something in Fedora 22 Server that's triggering the problem, maybe newer included wireless firmware, or systemd, or NetworkManager. I'll do a clean Fedora 22 Server installation and see if this reproduces and report back.
The boot.iso in koji for 20150124 has a gtk-anaconda bug that prevents me from installing. However, that boot.iso has 3.19.0-0.rc5.git2.1.fc22.i686 and when I reboot from the USB stick I do not get an oops. So somehow the fedup upgrade got this into a rather particular state.
If you remove the device using sysfs, as in the example below (this is for the 03:00.0 device; check lspci for the proper device number):
echo 1 > /sys/bus/pci/devices/0000\:03\:00.0/remove
Does the panic occur on reboot?
(In reply to Stanislaw Gruszka from comment #8) Due to installer and fedup issues, I wasn't able to test the system in the state in which the problem occurs. With a clean installation using Fedora-Live-LXDE-i686-rawhide-20150207.iso, which has kernel-3.19.0-0.rc7.git2.1.fc22, the problem isn't reproducible. I don't know if the regression got fixed in rc7, or if the fedup 21>Rawhide upgrade put the system in some bad state.
(In reply to Stanislaw Gruszka from comment #8)
> If you remove device using sysfs, like in the below example (this is for
> 03:00.0 device, check lspci for proper device number):
>
> echo 1 > /sys/bus/pci/devices/0000\:03\:00.0/remove

When I do that, I get an immediate kernel panic.

> Does the panic occur on reboot?

No opportunity to reboot, already panicked. dmesg during boot shows this:
[ 13.712253] ipw2100: Detected Intel PRO/Wireless 2100 Network Connection
[ 14.049121] ipw2100 0000:02:03.0: Direct firmware load for ipw2100-1.3.fw failed with error -2
[ 14.049135] ipw2100: eth%d: Firmware 'ipw2100-1.3.fw' not available or load failed.
[ 14.049137] ipw2100: eth%d: ipw2100_get_firmware failed: -2
[ 14.049140] ipw2100: eth%d: Failed to power on the adapter.
[ 14.049142] ipw2100: eth%d: Failed to start the firmware.
[ 14.049581] ipw2100 0000:02:03.0: Driver probe function unexpectedly returned 1

This is a new installation today of Fedora 22 Server i386 (final TC3).
Linux version 4.0.1-300.fc22.i686 (mockbuild.fedoraproject.org) (gcc version 5.1.1 20150422 (Red Hat 5.1.1-1) (GCC) ) #1 SMP Wed Apr 29 16:21:41 UTC 2015
Created attachment 1023679 [details] lspci -vvnn
The command used in comment 11 to get the panic was:
[root@f22s ~]# echo 1 > /sys/bus/pci/devices/0000\:02\:03.0/remove
Created attachment 1023680 [details] lsmod
Created attachment 1023681 [details] dmesg oops rmmod ipw2100
[root@f22s ~]# rmmod ipw2100
And I get a different panic.
Booting with the parameter modprobe.blacklist=ipw2100 is a workaround; the problem doesn't happen. Of course, there's no wireless then either.
Honestly don't know why I'm doing this, but here goes...

When firmware load fails, the error path in ipw2100_pci_init_one() frees the netdev it created, via free_libipw(). Then when ipw2100_pci_remove_one() gets run, it calls unregister_netdev() on that already-freed pointer, which obviously is not going to work well.

This is all because ipw2100_up() returns positive values on error. It gets called from ipw2100_pci_init_one(), and that positive error code gets returned to the PCI bus functions. But those expect negative numbers to indicate errors, and you get the warning "Driver probe function unexpectedly returned 1" because of this. And since the PCI init didn't return a recognized error value, the PCI remove functions still get called, which then causes the panic.

Patch: http://marc.info/?l=linux-wireless&m=149987805219861&w=2

----

Also, do you have the ipw2100-firmware package installed? Is /lib/firmware/ipw2100-1.3.fw present?
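
For anyone trying to follow the failure sequence in the analysis above, here is a rough userspace model of it in C. It is only a sketch and not driver or kernel code: every name in it (model_dev, model_probe, model_remove, model_bus_bind, model_bus_unbind, simulate) is hypothetical, and the "netdev" is reduced to a flag so that nothing is really freed. It assumes, as described above, that the bus code treats only negative probe return values as failure and treats a positive value as success with a warning.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the device state; not the real driver structs. */
struct model_dev {
    int netdev_freed;   /* 1 once the probe error path has released the netdev */
    int bound;          /* 1 if the model bus code considers the driver bound */
    int use_after_free; /* 1 if remove() touched a released netdev */
};

/* Model of a probe whose firmware load fails: the error path releases the
 * netdev, then returns whatever error value the caller asks us to model. */
static int model_probe(struct model_dev *d, int fw_error)
{
    d->netdev_freed = 1;
    return fw_error;
}

/* Model of remove(): it assumes the netdev is still valid. */
static void model_remove(struct model_dev *d)
{
    if (d->netdev_freed)
        d->use_after_free = 1; /* in the real driver this is the panic */
}

/* Model of the bus code: only a negative return counts as failure. */
static int model_bus_bind(struct model_dev *d, int fw_error)
{
    int ret = model_probe(d, fw_error);

    if (ret < 0) {
        d->bound = 0;
        return ret;
    }
    if (ret > 0)
        fprintf(stderr, "Driver probe function unexpectedly returned %d\n", ret);
    d->bound = 1; /* a positive value is treated as success */
    return 0;
}

/* remove() only runs for devices the bus code believes are bound. */
static void model_bus_unbind(struct model_dev *d)
{
    if (d->bound)
        model_remove(d);
}

/* Returns 1 if device removal would touch the released netdev, else 0. */
int simulate(int fw_error)
{
    struct model_dev d = {0};

    model_bus_bind(&d, fw_error);
    model_bus_unbind(&d); /* reboot, rmmod, or the sysfs remove */
    assert(!d.use_after_free || d.bound); /* remove() needs a bound device */
    return d.use_after_free;
}
```

In this model, simulate(1) reports the use-after-free, while simulate(-2) does not, because a negative return leaves the device unbound and remove() never runs. That only illustrates why returning a negative error code from probe avoids the panic; it is not a description of what the linked patch actually changes.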
No idea; I haven't had this hardware for quite some time.
Also I'd note this bug isn't i686 specific at all, it just happens that almost everyone with an ipw2100 is running older i686 stuff due to the age of the 2100 and the fact that it's miniPCI only.