Created attachment 594262 [details]
picture taken with mobile phone, quality so-so

Description of problem:
The system freezes and switches back to the console. The Alt-PrtScr keys no longer work. Power cycling is the only way out.

Version-Release number of selected component (if applicable):
Linux tatooine.example.org 3.4.3-1.fc17.x86_64 #1 SMP Mon Jun 18 19:53:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Network connectivity is through WiFi only; no Ethernet cable is present.

How reproducible:
Once every few days, but highly problematic.

Steps to Reproduce:
1. Have moderate network activity: open a web site, launch a chat client, do "yum update", ...
2.
3.

Actual results:
Total freeze

Expected results:
Normal operation

Additional info:
Googled a bit on the topic; set the BIOS boot order to "network first", and created /etc/modprobe.d/ath9k.conf with this content:

# see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/951709
options ath9k nohwcrypt=1

But this does not solve the problem.

The WiFi adapter appears in lspci as:

07:00.0 Network controller: Atheros Communications Inc. AR9485 Wireless Network Adapter (rev 01)
        Subsystem: Lite-On Communications Inc Device 6617
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at f0100000 (64-bit, non-prefetchable) [size=512K]
        Expansion ROM at f0500000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 2
        Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
        Kernel driver in use: ath9k

A picture of the screen during a hang is attached. The interesting function is ath9k_ioread32 (the machine uses an x86_64 kernel).
Created attachment 594598 [details] another picture
Created attachment 594599 [details] same occurrence
The message I've seen today is:

kernel bug at drivers/net/wireless/ath/ath9k/recv.c: 671
invalid opcode: 000 [#1] SMP
Created attachment 594865 [details] page fault in ath_rx_tasklet
The hang of the day: the system worked flawlessly for around two hours, then I got a page fault. The kernel trace can be seen in attachment https://bugzilla.redhat.com/attachment.cgi?id=594865 Interesting point: ath_rx_tasklet +0x165/0x1b00 followed by page_fault
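For readers unfamiliar with the notation: a call-trace entry of the form symbol+0xOFFSET/0xSIZE gives the offset of the instruction inside the function and the total size of that function, both in hexadecimal. The following is only a minimal sketch of decoding such an entry in user space; the struct and function names (trace_entry, parse_trace_entry) are hypothetical and are not part of the kernel or of any existing tool.

```c
#include <assert.h>
#include <stdio.h>

/* One decoded call-trace entry: symbol name, offset into it, symbol size. */
struct trace_entry {
    char symbol[64];
    unsigned long offset;
    unsigned long size;
};

/*
 * Parse "symbol+0xOFF/0xSIZE" (a space before the '+' is tolerated).
 * Returns 0 on success, -1 if the text does not have that shape.
 * Hypothetical helper, for illustration only.
 */
static int parse_trace_entry(const char *text, struct trace_entry *out)
{
    assert(text != NULL && out != NULL);

    /* %63[^+ ] reads the symbol up to '+' or a space; %lx accepts "0x". */
    int fields = sscanf(text, "%63[^+ ] +%lx/%lx",
                        out->symbol, &out->offset, &out->size);
    return fields == 3 ? 0 : -1;
}
```

Applied to the entry quoted above, this yields the symbol ath_rx_tasklet, an offset of 0x165 bytes and a function size of 0x1b00 bytes, i.e. the fault was reached fairly early in the function body.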
A new thought: I'm using the laptop in a residential area in France. Running "iwlist scan" reveals between 45 and 65 cells. Most of them come from "boxes", i.e. Internet access points connected through telephone cable or optical fiber; yet the link to the user's computer/laptop/smartphone is through WiFi. TV channels are also available through those boxes; you can imagine the bandwidth. Is there some issue with the number of beacons or the link quality which is not handled properly?
Patch to fix this: http://article.gmane.org/gmane.linux.kernel.wireless.general/93723/match=commit+3a2923e83c
Pascal, none of the pictures you are posting is useful. Please pan out enough to actually see the entire screen. Alex, on what basis do you believe that fix to apply to this problem?
In any case...test kernels w/ the above mentioned patch are building here: http://koji.fedoraproject.org/koji/taskinfo?taskID=4206016 When they finish building, please give them a try and post the results here...thanks!
uname -r
3.4.4-3.bz832927.1.fc17.x86_64

Testing the build: copying a large file via scp.
So far so good. No panic. Screenshot: http://storage6.static.itmages.ru/i/12/0628/h_1340910789_9343523_b77e49aa36.png
It works fine! The kernel no longer panics.
Installed the new kernel from koji. Removed all the workarounds, rebooted and ... not a single problem in two hours. Meanwhile I played music from YouTube, stress-tested the machine by processing a 10 GB compressed archive, and so on. Not a single problem. Congratulations on killing this bug.
Wait a minute. The behaviour I observed with the previous kernel was a leak. If you look around recv.c, line 685, in the latest kernel:

685         if (ret == -EINVAL) {
686                 /* corrupt descriptor, skip this one and the following one */
687                 list_add_tail(&bf->list, &sc->rx.rxbuf);
688                 ath_rx_edma_buf_link(sc, qtype);
689
690                 skb = skb_peek(&rx_edma->rx_fifo);
691                 if (skb) {
692                         bf = SKB_CB_ATHBUF(skb);
693                         BUG_ON(!bf);
694
695                         __skb_unlink(skb, &rx_edma->rx_fifo);
696                         list_add_tail(&bf->list, &sc->rx.rxbuf);
697                         ath_rx_edma_buf_link(sc, qtype);
698                 } else {
699                         bf = NULL;
700                 }
701         }

The idea is to remove the "else {" and the following "}", so that bf is set to NULL unconditionally. According to the code, two descriptors are skipped. Is there dynamic memory allocated through those descriptors? Is this memory freed before bf is set to NULL?

Regards
Pascal
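To make the two variants easier to compare, here is a rough user-space model of the branch quoted above. It is only a sketch under loud assumptions: buffers are plain integer ids, the two arrays stand in for rx_edma->rx_fifo and sc->rx.rxbuf, ath_rx_edma_buf_link is not modelled at all, and every name in it (rx_model, skip_corrupt, and so on) is hypothetical. It says nothing about how the real driver allocates or frees memory.

```c
#include <assert.h>

#define MAX_BUFS 8
#define NO_BUF (-1)   /* the model's equivalent of bf = NULL */

/* Hypothetical model: buffers are integer ids held in two arrays. */
struct rx_model {
    int fifo[MAX_BUFS];  int fifo_len;   /* stands in for rx_edma->rx_fifo */
    int rxbuf[MAX_BUFS]; int rxbuf_len;  /* stands in for sc->rx.rxbuf */
};

/* Append a buffer id to the reuse list (models list_add_tail). */
static void recycle(struct rx_model *m, int bf)
{
    assert(m->rxbuf_len < MAX_BUFS);
    m->rxbuf[m->rxbuf_len++] = bf;
}

/* Remove and return the head of the FIFO, or NO_BUF if it is empty. */
static int fifo_pop(struct rx_model *m)
{
    if (m->fifo_len == 0)
        return NO_BUF;
    int head = m->fifo[0];
    for (int i = 1; i < m->fifo_len; i++)
        m->fifo[i - 1] = m->fifo[i];
    m->fifo_len--;
    return head;
}

/*
 * Models the "ret == -EINVAL" branch for a corrupt descriptor held in bf.
 * always_null = 0 follows the quoted code; always_null = 1 follows the
 * proposal of dropping the "else". Returns what the caller would see in bf.
 */
static int skip_corrupt(struct rx_model *m, int bf, int always_null)
{
    recycle(m, bf);       /* skip this one ... */
    bf = fifo_pop(m);     /* ... and the following one, if any */
    if (bf != NO_BUF)
        recycle(m, bf);
    return always_null ? NO_BUF : bf;
}

/* Returns 1 if the id is sitting on the reuse list, else 0. */
static int is_recycled(const struct rx_model *m, int bf)
{
    for (int i = 0; i < m->rxbuf_len; i++)
        if (m->rxbuf[i] == bf)
            return 1;
    return 0;
}
```

In this model no buffer is lost in either variant: both skipped ids end up on the reuse list. The difference is what the caller sees. With the quoted code and a non-empty FIFO, skip_corrupt returns an id that is_recycled also reports as queued for reuse; with the "else" dropped, the caller always gets NO_BUF.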
*** This bug has been marked as a duplicate of bug 832927 ***