Bug 601171 - iwl3945 "wrong command queue" on wake-up
Summary: iwl3945 "wrong command queue" on wake-up
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 13
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 574146
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-06-07 11:58 UTC by Stanislaw Gruszka
Modified: 2011-01-21 08:19 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 574146
Environment:
Last Closed: 2011-01-21 08:19:15 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Stanislaw Gruszka 2010-06-07 11:58:15 UTC
+++ This bug was initially created as a clone of Bug #574146 +++

Description of problem:

Every time, after suspending my laptop. Network manager doesn't work. If I uncheck and check the box repeatedly, sometimes it comes back, but it takes awhile. Just this latest time I did a straight ifconfig up and everything came back instantly

This problem persists in 2.6.32.14-127.fc12.x86_64:

WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1174 iwl_tx_cmd_complete+0x65/
0x2b4 [iwlcore]()
Hardware name: 4057W7N
wrong command queue 0, sequence 0x0 readp=0 writep=0

Comment 1 Stanislaw Gruszka 2010-06-07 12:16:15 UTC
James, 

I would like to see if I can reproduce problem locally, what is your hardware (laptop type, output of lspci)? 

Please also attach some more verbose debug messages when bug happens. Here is  description how to do this:
https://bugzilla.redhat.com/show_bug.cgi?id=574380#c8
But you have to change module to iwl3945 and debug to 0x47833fff. So last three steps will be:

[root@dhcp-lab-161 ~]# rmmod iwl3945 iwlcore
[root@dhcp-lab-161 ~]# echo > /var/log/messages
[root@dhcp-lab-161 ~]# modprobe iwl3945 debug=0x47833fff

Comment 2 James M. Leddy 2010-06-07 13:53:08 UTC
I have a thinkpad x301:

00:00.0 Host bridge: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub (rev 07)
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
00:03.0 Communication controller: Intel Corporation Mobile 4 Series Chipset MEI Controller (rev 07)
00:03.3 Serial controller: Intel Corporation Mobile 4 Series Chipset AMT SOL Redirection (rev 07)
00:19.0 Ethernet controller: Intel Corporation 82567LM Gigabit Network Connection (rev 03)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 03)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)

I'll increase debugging verbosity and let you know what happens. It would appear that this doesn't happen "every time" as I had mentioned earlier.

Comment 3 James M. Leddy 2010-06-07 13:59:34 UTC
On second look, perhaps it is not related to the driver at all, but I get some weird memory allocation errors:

Jun  4 12:30:44 localhost NetworkManager: <info> waking up...
Jun  4 12:30:44 localhost NetworkManager: <info> (eth0): now managed
Jun  4 12:30:44 localhost NetworkManager: <info> (eth0): device state change: 1 -> 2 (reason 2)
Jun  4 12:30:44 localhost NetworkManager: <info> (eth0): bringing up device.
Jun  4 12:30:44 localhost NetworkManager: <info> (eth0): preparing device.
Jun  4 12:30:44 localhost NetworkManager: <info> (eth0): deactivating device (reason: 2).
Jun  4 12:30:44 localhost kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Jun  4 12:30:44 localhost NetworkManager: <info> (wlan0): now managed
Jun  4 12:30:44 localhost NetworkManager: <info> (wlan0): device state change: 1 -> 2 (reason 2)
Jun  4 12:30:44 localhost NetworkManager: <info> (wlan0): bringing up device.
Jun  4 12:30:44 localhost kernel: usb 1-1: New USB device found, idVendor=05e3, idProduct=0606
Jun  4 12:30:44 localhost kernel: usb 1-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0
Jun  4 12:30:44 localhost kernel: usb 1-1: Product: USB2.0 Hub
Jun  4 12:30:44 localhost kernel: usb 1-1: configuration #1 chosen from 1 choice
Jun  4 12:30:44 localhost kernel: hub 1-1:1.0: USB hub found
Jun  4 12:30:44 localhost kernel: hub 1-1:1.0: 4 ports detected
Jun  4 12:30:45 localhost kernel: NetworkManager: page allocation failure. order:4, mode:0x40d0
Jun  4 12:30:45 localhost kernel: Pid: 1405, comm: NetworkManager Not tainted 2.6.32.14-127.fc12.x86_64 #1
Jun  4 12:30:45 localhost kernel: Call Trace:
Jun  4 12:30:45 localhost kernel: [<ffffffff810dcbb4>] __alloc_pages_nodemask+0x5ad/0x631
Jun  4 12:30:45 localhost kernel: [<ffffffff811051c7>] alloc_pages_current+0x95/0x9e
Jun  4 12:30:45 localhost kernel: [<ffffffff810db44e>] __get_free_pages+0xe/0x4b
Jun  4 12:30:45 localhost kernel: [<ffffffffa0273d57>] iwl_tx_queue_init+0x102/0x2bc [iwlcore]
Jun  4 12:30:45 localhost kernel: [<ffffffffa0275da3>] iwl_txq_ctx_reset+0x17e/0x207 [iwlcore]
Jun  4 12:30:45 localhost kernel: [<ffffffffa026a4a4>] iwl_hw_nic_init+0x133/0x148 [iwlcore]
Jun  4 12:30:45 localhost kernel: [<ffffffffa02ca642>] __iwl_up+0x16b/0x3ab [iwlagn]
Jun  4 12:30:45 localhost kernel: [<ffffffffa02cb8c1>] iwl_mac_start+0x103f/0x12bb [iwlagn]
Jun  4 12:30:45 localhost kernel: [<ffffffff813cf71f>] ? nlmsg_notify+0x48/0x8e
Jun  4 12:30:45 localhost kernel: [<ffffffffa01d609a>] ieee80211_open+0x270/0x5c4 [mac80211]
Jun  4 12:30:45 localhost kernel: [<ffffffff814592d6>] ? notifier_call_chain+0x37/0x63
Jun  4 12:30:45 localhost kernel: [<ffffffff813b5bb3>] dev_open+0x9d/0xd8
Jun  4 12:30:45 localhost kernel: [<ffffffff813b530d>] dev_change_flags+0xad/0x16d
Jun  4 12:30:45 localhost kernel: [<ffffffff813be8ce>] do_setlink+0x26c/0x33d
Jun  4 12:30:45 localhost kernel: [<ffffffff811eb3d6>] ? avc_has_perm+0x5c/0x6e
Jun  4 12:30:45 localhost kernel: [<ffffffff813beab2>] rtnl_setlink+0x113/0x126
Jun  4 12:30:45 localhost kernel: [<ffffffff813be319>] rtnetlink_rcv_msg+0x1c6/0x1e3
Jun  4 12:30:45 localhost kernel: [<ffffffff813cfd43>] ? netlink_sendmsg+0x174/0x26b
Jun  4 12:30:45 localhost kernel: [<ffffffff813be153>] ? rtnetlink_rcv_msg+0x0/0x1e3
Jun  4 12:30:45 localhost kernel: [<ffffffff813cf89f>] netlink_rcv_skb+0x43/0x94
Jun  4 12:30:45 localhost kernel: [<ffffffff813be14c>] rtnetlink_rcv+0x26/0x2d
Jun  4 12:30:45 localhost kernel: [<ffffffff813cf66e>] netlink_unicast+0x125/0x18e
Jun  4 12:30:45 localhost kernel: [<ffffffff813cfe2b>] netlink_sendmsg+0x25c/0x26b
Jun  4 12:30:45 localhost kernel: [<ffffffff813a4366>] __sock_sendmsg+0x61/0x6c
Jun  4 12:30:45 localhost kernel: [<ffffffff813a4ac9>] sock_sendmsg+0xcc/0xe5
Jun  4 12:30:45 localhost kernel: [<ffffffff813a499e>] ? sock_recvmsg+0xcf/0xe8
Jun  4 12:30:45 localhost kernel: [<ffffffff810748eb>] ? autoremove_wake_function+0x0/0x39
Jun  4 12:30:45 localhost kernel: [<ffffffff810748eb>] ? autoremove_wake_function+0x0/0x39
Jun  4 12:30:45 localhost kernel: [<ffffffff813a5665>] ? move_addr_to_kernel+0x48/0x4d
Jun  4 12:30:45 localhost kernel: [<ffffffff813adfd7>] ? verify_iovec+0x51/0x8e
Jun  4 12:30:45 localhost kernel: [<ffffffff813a4d03>] sys_sendmsg+0x221/0x2a5
Jun  4 12:30:45 localhost kernel: [<ffffffff8110ba1f>] ? virt_to_head_page+0xe/0x2f
Jun  4 12:30:45 localhost kernel: [<ffffffff813a4423>] ? sockfd_lookup_light+0x20/0x58
Jun  4 12:30:45 localhost kernel: [<ffffffff813a4401>] ? fput_light+0x12/0x14
Jun  4 12:30:45 localhost kernel: [<ffffffff813a578f>] ? sys_sendto+0x125/0x152
Jun  4 12:30:45 localhost kernel: [<ffffffff8111cb18>] ? __fput+0x1cd/0x1dc
Jun  4 12:30:45 localhost kernel: [<ffffffff8111cb41>] ? fput+0x1a/0x1c
Jun  4 12:30:45 localhost kernel: [<ffffffff81118fd5>] ? filp_close+0x68/0x72
Jun  4 12:30:45 localhost kernel: [<ffffffff81011d32>] system_call_fastpath+0x16/0x1b

Followed by:
Jun  4 12:30:45 localhost kernel: Mem-Info:
Jun  4 12:30:45 localhost kernel: Node 0 DMA per-cpu:
Jun  4 12:30:45 localhost kernel: CPU    0: hi:    0, btch:   1 usd:   0
Jun  4 12:30:45 localhost kernel: CPU    1: hi:    0, btch:   1 usd:   0
Jun  4 12:30:45 localhost kernel: Node 0 DMA32 per-cpu:
Jun  4 12:30:45 localhost kernel: CPU    0: hi:  186, btch:  31 usd:   0
Jun  4 12:30:45 localhost kernel: CPU    1: hi:  186, btch:  31 usd:   0
Jun  4 12:30:45 localhost kernel: active_anon:146746 inactive_anon:61877 isolated_anon:0
Jun  4 12:30:45 localhost kernel: active_file:96881 inactive_file:99277 isolated_file:0
Jun  4 12:30:45 localhost kernel: unevictable:4 dirty:3085 writeback:0 unstable:0
Jun  4 12:30:45 localhost kernel: free:25255 slab_reclaimable:16316 slab_unreclaimable:14534
Jun  4 12:30:45 localhost kernel: mapped:13561 shmem:18761 pagetables:6577 bounce:0
Jun  4 12:30:45 localhost kernel: Node 0 DMA free:7652kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:36kB active_file:4908kB inactive_file:2524kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15364kB mlocked:0kB dirty:0kB writeback:0kB mapped:164kB shmem:0kB slab_reclaimable:532kB slab_unreclaimable:116kB kernel_stack:8kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun  4 12:30:45 localhost kernel: lowmem_reserve[]: 0 1902 1902 1902
Jun  4 12:30:45 localhost kernel: Node 0 DMA32 free:93368kB min:5560kB low:6948kB high:8340kB active_anon:586984kB inactive_anon:247472kB active_file:382616kB inactive_file:394584kB unevictable:16kB isolated(anon):0kB isolated(file):0kB present:1947648kB mlocked:16kB dirty:12340kB writeback:0kB mapped:54080kB shmem:75044kB slab_reclaimable:64732kB slab_unreclaimable:58020kB kernel_stack:2416kB pagetables:26308kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun  4 12:30:45 localhost kernel: lowmem_reserve[]: 0 0 0 0
Jun  4 12:30:45 localhost kernel: Node 0 DMA: 13*4kB 44*8kB 47*16kB 39*32kB 22*64kB 6*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 7652kB
Jun  4 12:30:45 localhost kernel: Node 0 DMA32: 17444*4kB 2783*8kB 3*16kB 2*32kB 1*64kB 1*128kB 2*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 93368kB
Jun  4 12:30:45 localhost kernel: 227203 total pagecache pages
Jun  4 12:30:45 localhost kernel: 12288 pages in swap cache
Jun  4 12:30:45 localhost kernel: Swap cache stats: add 241419, delete 229131, find 590014/608642
Jun  4 12:30:45 localhost kernel: Free swap  = 1852548kB
Jun  4 12:30:45 localhost kernel: Total swap = 2097144kB
Jun  4 12:30:45 localhost kernel: 498688 pages RAM
Jun  4 12:30:45 localhost kernel: 10814 pages reserved
Jun  4 12:30:45 localhost kernel: 249196 pages shared
Jun  4 12:30:45 localhost kernel: 260430 pages non-shared


And then the "wrong commmand queue" panic string from the driver:

Jun  4 12:30:45 localhost kernel: iwlagn 0000:03:00.0: kmalloc for auxiliary BD structures failed
Jun  4 12:30:45 localhost kernel: iwlagn 0000:03:00.0: Tx 11 queue init failed
Jun  4 12:30:45 localhost kernel: iwlagn 0000:03:00.0: Unable to init nic
Jun  4 12:30:45 localhost NetworkManager: <info> (wlan0): deactivating device (reason: 2).
Jun  4 12:30:45 localhost kernel: ------------[ cut here ]------------
Jun  4 12:30:45 localhost kernel: WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1174 iwl_tx_cmd_complete+0x65/0x2b4 [iwlcore]()
Jun  4 12:30:45 localhost kernel: Hardware name: 4057W7N
Jun  4 12:30:45 localhost kernel: wrong command queue 0, sequence 0x0 readp=0 writep=0
Jun  4 12:30:45 localhost kernel: Modules linked in: ppp_deflate zlib_deflate ppp_async crc_ccitt ppp_generic slhc rndis_wlan cdc_phonet cdc_acm rndis_host cdc_ether phonet usbnet mii tun cryptd aes_x86_64 aes_generic fuse hidp rfcomm sco bridge stp llc bnep l2cap sunrpc cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput snd_hda_codec_conexant arc4 snd_hda_intel ecb iwlagn snd_hda_codec iwlcore snd_hwdep snd_seq snd_seq_device snd_pcm uvcvideo mac80211 videodev btusb snd_timer v4l1_compat v4l2_compat_ioctl32 bluetooth joydev thinkpad_acpi i2c_i801 cfg80211 iTCO_wdt snd e1000e iTCO_vendor_support soundcore wmi rfkill snd_page_alloc nilfs2 i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: microcode]
Jun  4 12:30:45 localhost kernel: Pid: 886, comm: nilfs_cleanerd Not tainted 2.6.32.14-127.fc12.x86_64 #1
Jun  4 12:30:45 localhost kernel: Call Trace:
Jun  4 12:30:45 localhost kernel: <IRQ>  [<ffffffff81056364>] warn_slowpath_common+0x7c/0x94
Jun  4 12:30:45 localhost kernel: [<ffffffff810563d3>] warn_slowpath_fmt+0x41/0x43
Jun  4 12:30:45 localhost kernel: [<ffffffff81216c48>] ? blk_run_queue+0x35/0x3a
Jun  4 12:30:45 localhost kernel: [<ffffffffa0272c5e>] iwl_tx_cmd_complete+0x65/0x2b4 [iwlcore]
Jun  4 12:30:45 localhost kernel: [<ffffffff812e9a9f>] ? __scsi_put_command+0x70/0x7d
Jun  4 12:30:45 localhost kernel: [<ffffffffa02c6770>] iwl_rx_handle+0x2e4/0x3c5 [iwlagn]
Jun  4 12:30:45 localhost kernel: [<ffffffff812efd7c>] ? scsi_io_completion+0x203/0x42f
Jun  4 12:30:45 localhost kernel: [<ffffffff81304da9>] ? ata_qc_complete_multiple+0xc7/0xcc
Jun  4 12:30:45 localhost kernel: [<ffffffffa02c9091>] iwl_irq_tasklet+0x592/0x729 [iwlagn]
Jun  4 12:30:45 localhost kernel: [<ffffffff8105bfa3>] tasklet_action+0x85/0xe4
Jun  4 12:30:45 localhost kernel: [<ffffffff8105d98c>] __do_softirq+0xe5/0x1a9
Jun  4 12:30:45 localhost kernel: [<ffffffff810acfd5>] ? handle_IRQ_event+0x60/0x121
Jun  4 12:30:45 localhost kernel: [<ffffffff81012e6c>] call_softirq+0x1c/0x30
Jun  4 12:30:45 localhost kernel: [<ffffffff810143ea>] do_softirq+0x46/0x86
Jun  4 12:30:45 localhost kernel: [<ffffffff8105d7ca>] irq_exit+0x3b/0x7d
Jun  4 12:30:45 localhost kernel: [<ffffffff8145adbd>] do_IRQ+0xa5/0xbc
Jun  4 12:30:45 localhost kernel: [<ffffffff81012693>] ret_from_intr+0x0/0x11
Jun  4 12:30:45 localhost kernel: <EOI>  [<ffffffffa00dd74b>] ? nilfs_palloc_bitmap_blkoff+0xf/0x3f [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00dd901>] ? nilfs_palloc_get_entry_block+0x2e/0x61 [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00dd81d>] ? nilfs_palloc_block_get_entry+0x24/0x58 [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00d3644>] ? nilfs_dat_move+0x28/0xfb [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00cf8e8>] ? nilfs_bmap_data_get_key+0x10/0x53 [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00cffb8>] ? nilfs_btree_assign_gc+0x3c/0x8b [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00cf778>] ? nilfs_bmap_assign+0x42/0x59 [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00d907e>] ? nilfs_segctor_do_construct+0xe39/0x191b [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00d9d3d>] ? nilfs_segctor_construct+0x41/0xbb [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00da8ec>] ? nilfs_clean_segments+0x105/0x253 [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffff810ddf36>] ? __set_page_dirty_nobuffers+0xb0/0xd7
Jun  4 12:30:45 localhost kernel: [<ffffffffa00df0d4>] ? nilfs_ioctl_clean_segments+0x483/0x4cc [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffffa00df371>] ? nilfs_ioctl+0x254/0x2ba [nilfs2]
Jun  4 12:30:45 localhost kernel: [<ffffffff8112a5d8>] ? vfs_ioctl+0x22/0x87
Jun  4 12:30:45 localhost kernel: [<ffffffff8112ab34>] ? do_vfs_ioctl+0x47b/0x4c1
Jun  4 12:30:45 localhost kernel: [<ffffffff8112abd0>] ? sys_ioctl+0x56/0x79
Jun  4 12:30:45 localhost kernel: [<ffffffff81011d32>] ? system_call_fastpath+0x16/0x1b
Jun  4 12:30:45 localhost kernel: ---[ end trace b59518df5019d83c ]---
Jun  4 12:30:45 localhost kernel: iwl data: 00000000: 58 d0 04 32 7b 7f 00 00 00 75 a6 3f 7b 7f 00 00  X..2{....u.?{...
Jun  4 12:30:45 localhost kernel: iwl data: 00000010: d8 4a b1 4c 7b 7f 00 00 04 00 00 00 00 00 00 00  .J.L{...........

Comment 4 Stanislaw Gruszka 2010-06-07 14:32:54 UTC
(In reply to comment #3)
> On second look, perhaps it is not related to the driver at all, but I get some
> weird memory allocation errors:
[snip]
> Jun  4 12:30:45 localhost kernel: [<ffffffffa0273d57>]
> iwl_tx_queue_init+0x102/0x2bc [iwlcore]
> Jun  4 12:30:45 localhost kernel: [<ffffffffa0275da3>]
> iwl_txq_ctx_reset+0x17e/0x207 [iwlcore]
> Jun  4 12:30:45 localhost kernel: [<ffffffffa026a4a4>]
> iwl_hw_nic_init+0x133/0x148 [iwlcore]
> Jun  4 12:30:45 localhost kernel: [<ffffffffa02ca642>] __iwl_up+0x16b/0x3ab
> [iwlagn]

Grrr.

I saw upstream patch with fix for this exact allocation failure. I will apply it and give you kernel to test. And yes, it is driver problem, however it is also allocator problem (I don't know why we are still using SLUB in fedora whereas we switch to SLAB in RHEL6).

Previous  "WARNING: wrong command queue 0" problem is related with allocation failures or not?

> I'll increase debugging verbosity and let you know what happens. It would
> appear that this doesn't happen "every time" as I had mentioned earlier.    

Ahh, I though it is 100% reproducible, running all the time with verbose debug enabled is problematical ...

Comment 5 James M. Leddy 2010-06-07 15:07:24 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > On second look, perhaps it is not related to the driver at all, but I get some
> > weird memory allocation errors:
> [snip]
> > Jun  4 12:30:45 localhost kernel: [<ffffffffa0273d57>]
> > iwl_tx_queue_init+0x102/0x2bc [iwlcore]
> > Jun  4 12:30:45 localhost kernel: [<ffffffffa0275da3>]
> > iwl_txq_ctx_reset+0x17e/0x207 [iwlcore]
> > Jun  4 12:30:45 localhost kernel: [<ffffffffa026a4a4>]
> > iwl_hw_nic_init+0x133/0x148 [iwlcore]
> > Jun  4 12:30:45 localhost kernel: [<ffffffffa02ca642>] __iwl_up+0x16b/0x3ab
> > [iwlagn]
> 
> Grrr.
> 
> I saw upstream patch with fix for this exact allocation failure. I will apply
> it and give you kernel to test. 

Hmm, didn't we do that for bug 574146? It appears to still be occurring, though perhaps less frequently.

> Previous  "WARNING: wrong command queue 0" problem is related with allocation
> failures or not?

Yes, as far as I've seen, every time this has happened has been two stack traces, one from NM with a page, then the kmalloc failure, then the wrong command queue.

Comment 6 Stanislaw Gruszka 2010-06-08 09:41:44 UTC
> > I saw upstream patch with fix for this exact allocation failure. I will apply
> > it and give you kernel to test. 
> 
> Hmm, didn't we do that for bug 574146? It appears to still be occurring, though perhaps less frequently.

Yes, I thought patch from 574146 it fix issue completely, but it just make problem less probable.

> > Previous  "WARNING: wrong command queue 0" problem is related with allocation
> > failures or not?
> 
> Yes, as far as I've seen, every time this has happened has been two stack
> traces, one from NM with a page, then the kmalloc failure, then the wrong
> command queue.    

Ok, that mean we have to cope with allocation failures. Hence verbose debugging i requested in comment 1 is not needed.

I want to test commit:

commit 470058e0ad82fcfaaffd57307d8bf8c094e8e9d7
Author: Zhu Yi <yi.zhu>
Date:   Fri Apr 2 13:38:54 2010 -0700

    iwlwifi: avoid Tx queue memory allocation in interface down

I'm going to prepare kernel with that patch.

Comment 7 Stanislaw Gruszka 2010-06-09 13:14:37 UTC
Please check out build:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2239190

Comment 8 James M. Leddy 2010-06-16 14:10:56 UTC
Stanislaw,

Thanks for the kernel build. This seems to be happening relatively frequently. I'll let you know if the problem subsides.

Comment 9 James M. Leddy 2010-06-16 14:33:45 UTC
That commit is in Linus? I got a npe:


Jun 16 10:14:01 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 16 10:14:01 localhost kernel: IP: [<ffffffffa01f4e72>] iwl_tx_queue_reset+0x2f/0x58 [iwlcore]
Jun 16 10:14:01 localhost kernel: PGD f5bc067 PUD f5bd067 PMD 0 
Jun 16 10:14:01 localhost kernel: Oops: 0002 [#1] SMP 
Jun 16 10:14:01 localhost kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/firmware/0000:03:00.0/loading
Jun 16 10:14:01 localhost kernel: CPU 1 
Jun 16 10:14:01 localhost kernel: Modules linked in: sco bridge stp llc bnep l2cap sunrpc cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput snd_hda_codec_conexant arc4 ecb snd_hda_intel snd_hda_codec uvcvideo iwlagn snd_hwdep snd_seq snd_seq_device videodev v4l1_compat btusb snd_pcm v4l2_compat_ioctl32 iwlcore bluetooth joydev mac80211 snd_timer iTCO_wdt i2c_i801 iTCO_vendor_support cfg80211 e1000e thinkpad_acpi snd soundcore snd_page_alloc rfkill wmi nilfs2 i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: microcode]
Jun 16 10:14:01 localhost kernel: Pid: 1457, comm: NetworkManager Not tainted 2.6.32.14-130.bz601171.fc12.x86_64 #1 4057W7N
Jun 16 10:14:01 localhost kernel: RIP: 0010:[<ffffffffa01f4e72>]  [<ffffffffa01f4e72>] iwl_tx_queue_reset+0x2f/0x58 [iwlcore]
Jun 16 10:14:01 localhost kernel: RSP: 0018:ffff88000f5db668  EFLAGS: 00010246
Jun 16 10:14:01 localhost kernel: RAX: 0000000000000000 RBX: ffff880069d25e78 RCX: 0000000000002800
Jun 16 10:14:01 localhost kernel: RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000000
Jun 16 10:14:01 localhost kernel: RBP: ffff88000f5db678 R08: 0000000000000000 R09: ffff880069d212c0
Jun 16 10:14:01 localhost kernel: R10: ffff880069d21d1c R11: ffff880069d212c0 R12: ffff880069d212c0
Jun 16 10:14:01 localhost kernel: R13: 0000000000000100 R14: 0000000000000020 R15: 00000000000563d8
Jun 16 10:14:01 localhost kernel: FS:  00007fcd0723a800(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
Jun 16 10:14:01 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 16 10:14:01 localhost kernel: CR2: 0000000000000000 CR3: 000000000f4d9000 CR4: 00000000000406e0
Jun 16 10:14:01 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 16 10:14:01 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 16 10:14:01 localhost kernel: Process NetworkManager (pid: 1457, threadinfo ffff88000f5da000, task ffff88000f352ec0)
Jun 16 10:14:01 localhost kernel: Stack:
Jun 16 10:14:01 localhost kernel: ffff880069d212c0 0000000000000001 ffff88000f5db6a8 ffffffffa01f4f30
Jun 16 10:14:01 localhost kernel: <0> ffff880069d212c0 ffff880069d21d14 ffff880069d22418 0000000000000246
Jun 16 10:14:01 localhost kernel: <0> ffff88000f5db6d8 ffffffffa01e94a4 ffff880069d212c0 000000000000c000
Jun 16 10:14:01 localhost kernel: Call Trace:
Jun 16 10:14:01 localhost kernel: [<ffffffffa01f4f30>] iwlagn_txq_ctx_reset+0x95/0xad [iwlcore]
Jun 16 10:14:01 localhost kernel: [<ffffffffa01e94a4>] iwl_hw_nic_init+0x133/0x146 [iwlcore]
Jun 16 10:14:01 localhost kernel: [<ffffffffa0297642>] __iwl_up+0x16b/0x3ab [iwlagn]
Jun 16 10:14:01 localhost kernel: [<ffffffffa02988c1>] iwl_mac_start+0x103f/0x12bb [iwlagn]
Jun 16 10:14:01 localhost kernel: [<ffffffff813cf71f>] ? nlmsg_notify+0x48/0x8e
Jun 16 10:14:01 localhost kernel: [<ffffffffa018609a>] ieee80211_open+0x270/0x5c4 [mac80211]
Jun 16 10:14:01 localhost kernel: [<ffffffff814592d6>] ? notifier_call_chain+0x37/0x63
Jun 16 10:14:01 localhost kernel: [<ffffffff813b5bb3>] dev_open+0x9d/0xd8
Jun 16 10:14:01 localhost kernel: [<ffffffff813b530d>] dev_change_flags+0xad/0x16d
Jun 16 10:14:01 localhost kernel: [<ffffffff813be8ce>] do_setlink+0x26c/0x33d
Jun 16 10:14:01 localhost kernel: [<ffffffff811eb3d6>] ? avc_has_perm+0x5c/0x6e
Jun 16 10:14:01 localhost kernel: [<ffffffff813beab2>] rtnl_setlink+0x113/0x126
Jun 16 10:14:01 localhost kernel: [<ffffffff813be319>] rtnetlink_rcv_msg+0x1c6/0x1e3
Jun 16 10:14:01 localhost kernel: [<ffffffff813cfd43>] ? netlink_sendmsg+0x174/0x26b
Jun 16 10:14:01 localhost kernel: [<ffffffff813be153>] ? rtnetlink_rcv_msg+0x0/0x1e3
Jun 16 10:14:01 localhost kernel: [<ffffffff813cf89f>] netlink_rcv_skb+0x43/0x94
Jun 16 10:14:01 localhost kernel: [<ffffffff813be14c>] rtnetlink_rcv+0x26/0x2d
Jun 16 10:14:01 localhost kernel: [<ffffffff813cf66e>] netlink_unicast+0x125/0x18e
Jun 16 10:14:01 localhost kernel: [<ffffffff813cfe2b>] netlink_sendmsg+0x25c/0x26b
Jun 16 10:14:01 localhost kernel: [<ffffffff813a4366>] __sock_sendmsg+0x61/0x6c
Jun 16 10:14:01 localhost kernel: [<ffffffff813a4ac9>] sock_sendmsg+0xcc/0xe5
Jun 16 10:14:01 localhost kernel: [<ffffffff813a499e>] ? sock_recvmsg+0xcf/0xe8
Jun 16 10:14:01 localhost kernel: [<ffffffff810748eb>] ? autoremove_wake_function+0x0/0x39
Jun 16 10:14:01 localhost kernel: [<ffffffff810748eb>] ? autoremove_wake_function+0x0/0x39
Jun 16 10:14:01 localhost kernel: [<ffffffff813a5665>] ? move_addr_to_kernel+0x48/0x4d
Jun 16 10:14:01 localhost kernel: [<ffffffff813adfd7>] ? verify_iovec+0x51/0x8e
Jun 16 10:14:01 localhost kernel: [<ffffffff813a4d03>] sys_sendmsg+0x221/0x2a5
Jun 16 10:14:01 localhost kernel: [<ffffffff8110ba1f>] ? virt_to_head_page+0xe/0x2f
Jun 16 10:14:01 localhost kernel: [<ffffffff813a4423>] ? sockfd_lookup_light+0x20/0x58
Jun 16 10:14:01 localhost kernel: [<ffffffff813a578f>] ? sys_sendto+0x125/0x152
Jun 16 10:14:01 localhost kernel: [<ffffffff8111cb18>] ? __fput+0x1cd/0x1dc
Jun 16 10:14:01 localhost kernel: [<ffffffff8111cb41>] ? fput+0x1a/0x1c
Jun 16 10:14:01 localhost kernel: [<ffffffff81118fd5>] ? filp_close+0x68/0x72
Jun 16 10:14:01 localhost kernel: [<ffffffff81011d32>] system_call_fastpath+0x16/0x1b
Jun 16 10:14:01 localhost kernel: Code: 41 54 53 0f 1f 44 00 00 48 89 f3 89 d6 89 ca 8d 4e 01 83 fa 04 4c 8b 43 40 49 89 fc 0f 45 ce 31 c0 48 63 c9 4c 89 c7 48 6b c9 28 <f3> aa 48 89 df c6 43 50 00 e8 e9 fa ff ff 49 8b 44 24 18 48 89 
Jun 16 10:14:01 localhost kernel: RIP  [<ffffffffa01f4e72>] iwl_tx_queue_reset+0x2f/0x58 [iwlcore]
Jun 16 10:14:01 localhost kernel: RSP <ffff88000f5db668>
Jun 16 10:14:01 localhost kernel: CR2: 0000000000000000
Jun 16 10:14:01 localhost kernel: ---[ end trace ed5feef38d4a5817 ]---

Comment 10 Stanislaw Gruszka 2010-06-16 14:57:18 UTC
(In reply to comment #9)
> That commit is in Linus? I got a npe:
> 
> 
> Jun 16 10:14:01 localhost kernel: BUG: unable to handle kernel NULL pointer
> dereference at (null)
> Jun 16 10:14:01 localhost kernel: IP: [<ffffffffa01f4e72>]
> iwl_tx_queue_reset+0x2f/0x58 [iwlcore]

It is in Linus tree, but code changed between 2.6.32 and 2.6.34. Commit may relay on some other changes, so I screw up during backport. I will look at oops and give you new a kernel shortly.

Comment 11 James M. Leddy 2010-06-16 15:06:35 UTC
Stanislaw,

I might just upgrade to F13 to avoid the issue. I'll let you know if the upgrade fixes the problem.

Comment 12 Bug Zapper 2010-11-03 13:29:43 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 13 Stanislaw Gruszka 2010-11-08 10:30:26 UTC
Issue that James has is still in F-13, it should be fixed in F-14.

Comment 14 Stanislaw Gruszka 2011-01-21 08:19:15 UTC
This problem should be fixed in F-14. No plans to backport.


Note You need to log in before you can comment on or make changes to this bug.