Bug 835213 - kernel panic in ath9k driver [Acer Aspire One D722]
Status: CLOSED DUPLICATE of bug 832927
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: x86_64 Linux
Priority: unspecified, Severity: high
Assigned To: John W. Linville
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-06-25 15:31 EDT by Pascal Dupuis
Modified: 2012-06-29 10:40 EDT
Doc Type: Bug Fix
Last Closed: 2012-06-29 10:40:20 EDT
Type: Bug

Attachments
picture taken with mobile phone, quality so-so (93.69 KB, image/jpeg)
2012-06-25 15:31 EDT, Pascal Dupuis
another picture (109.20 KB, image/jpeg)
2012-06-26 17:08 EDT, Pascal Dupuis
same occurrence (73.79 KB, image/jpeg)
2012-06-26 17:10 EDT, Pascal Dupuis
page fault in ath_rx_tasklet (89.95 KB, image/jpeg)
2012-06-27 17:23 EDT, Pascal Dupuis

Description Pascal Dupuis 2012-06-25 15:31:39 EDT
Created attachment 594262 [details]
picture taken with mobile phone, quality so-so

Description of problem: the system freezes and switches back to the console. The Alt-PrtScr keys no longer work. Power cycling is the only way to recover.

Version-Release number of selected component (if applicable):
Linux tatooine.example.org 3.4.3-1.fc17.x86_64 #1 SMP Mon Jun 18 19:53:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Network connectivity is through Wi-Fi only; no Ethernet cable is connected.

How reproducible: once every few days, but highly problematic

Steps to Reproduce:
1. Generate moderate network activity: open a web site, launch a chat client, run "yum update", ...
Actual results: 
Total freeze

Expected results:
Normal working

Additional info:
I googled a bit on the topic, set the BIOS boot order to "network first", and created /etc/modprobe.d/ath9k.conf with the following content:
# see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/951709

options ath9k nohwcrypt=1

But this does not solve the problem. The Wi-Fi adapter appears in the lspci output as
07:00.0 Network controller: Atheros Communications Inc. AR9485 Wireless Network Adapter (rev 01)
	Subsystem: Lite-On Communications Inc Device 6617
	Flags: bus master, fast devsel, latency 0, IRQ 19
	Memory at f0100000 (64-bit, non-prefetchable) [size=512K]
	Expansion ROM at f0500000 [disabled] [size=64K]
	Capabilities: [40] Power Management version 2
	Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
	Kernel driver in use: ath9k

A picture of the screen during a hang is attached. The interesting function is ath9k_ioread32 (the machine uses an x86_64 kernel).
Comment 1 Pascal Dupuis 2012-06-26 17:08:37 EDT
Created attachment 594598 [details]
another picture
Comment 2 Pascal Dupuis 2012-06-26 17:10:41 EDT
Created attachment 594599 [details]
same occurrence
Comment 3 Pascal Dupuis 2012-06-26 17:12:25 EDT
The message I've seen today is:

kernel bug at drivers/net/wireless/ath/ath9k/recv.c: 671
invalid opcode: 000 [#1] SMP
Comment 4 Pascal Dupuis 2012-06-27 17:23:20 EDT
Created attachment 594865 [details]
page fault in ath_rx_tasklet
Comment 5 Pascal Dupuis 2012-06-27 17:25:35 EDT
The hang of the day: the system worked flawlessly for around two hours, then I got a page fault. The kernel trace can be seen in attachment https://bugzilla.redhat.com/attachment.cgi?id=594865

Interesting point:

ath_rx_tasklet +0x165/0x1b00
followed by page_fault
Comment 6 Pascal Dupuis 2012-06-28 03:38:55 EDT
A new thought: I'm using the laptop in a residential area in France. Running "iwlist scan" reveals between 45 and 65 cells. Most of them come from "boxes", i.e. Internet access points connected through telephone cable or optical fiber; yet the link to the user's computer, laptop or smartphone is through Wi-Fi. TV channels are also available through those boxes; you can imagine the bandwidth.

Is there some issue with the number of beacons or the link quality that is not handled properly?
Comment 8 John W. Linville 2012-06-28 11:53:23 EDT
Pascal, none of the pictures you are posting are useful.  Please zoom out enough to actually show the entire screen.

Alex, on what basis do you believe that fix to apply to this problem?
Comment 9 John W. Linville 2012-06-28 13:25:56 EDT
In any case...test kernels w/ the above mentioned patch are building here:


When they finish building, please give them a try and post the results here...thanks!
Comment 10 Alex Andilevko 2012-06-28 15:02:41 EDT
 uname -r

Testing the build by copying a large file via scp.
Comment 11 Alex Andilevko 2012-06-28 15:15:00 EDT
So far so good. No panic. Screenshot: http://storage6.static.itmages.ru/i/12/0628/h_1340910789_9343523_b77e49aa36.png
Comment 12 Alex Andilevko 2012-06-28 16:21:29 EDT
It works fine! The kernel no longer panics.
Comment 13 Pascal Dupuis 2012-06-28 17:45:44 EDT
Installed the new kernel from koji. Removed all the workarounds, rebooted and ... not a single problem in two hours. Meanwhile I played music from YouTube, stress-tested the machine by processing a 10 GB compressed archive, and so on.

Not a single problem. Congratulations on killing this bug.
Comment 14 Pascal Dupuis 2012-06-29 03:42:50 EDT
Wait a minute. The behaviour I observed with the previous kernel was a leak. If you look around recv.c, line 685, in the latest kernel:
 685       if (ret == -EINVAL) {
 686                /* corrupt descriptor, skip this one and the following one */
 687                list_add_tail(&bf->list, &sc->rx.rxbuf);
 688                ath_rx_edma_buf_link(sc, qtype);
 690                skb = skb_peek(&rx_edma->rx_fifo);
 691                if (skb) {
 692                        bf = SKB_CB_ATHBUF(skb);
 693                        BUG_ON(!bf);
 695                        __skb_unlink(skb, &rx_edma->rx_fifo);
 696                        list_add_tail(&bf->list, &sc->rx.rxbuf);
 697                        ath_rx_edma_buf_link(sc, qtype);
 698                } else {
 699                        bf = NULL;
 700                }
 701        }

The idea is to remove the "else {" and the matching "}". According to the code, two descriptors are skipped. Is there dynamic memory allocated through those descriptors? Is this memory freed before bf is set to NULL?
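
To reason about the question above, here is a toy model of the quoted branch. It is not driver code. It assumes two plain singly linked queues standing in for rx_edma->rx_fifo and sc->rx.rxbuf; the names toy_buf, toy_queue, queue_push, queue_pop and skip_corrupt are made up for illustration, and ath_rx_edma_buf_link() is left out entirely. It only illustrates that, in the fragment as quoted, each skipped buffer is appended to the free list before bf is cleared, so clearing the pointer by itself does not drop the last reference. Whether memory hangs off those descriptors elsewhere cannot be told from this fragment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a receive buffer (not the real ath_buf). */
struct toy_buf {
    int id;
    struct toy_buf *next;
};

/* Hypothetical FIFO used for both the hardware FIFO and the free list. */
struct toy_queue {
    struct toy_buf *head;
    struct toy_buf *tail;
    int len;
};

/* Append a buffer at the tail, like list_add_tail() in the quoted code. */
static void queue_push(struct toy_queue *q, struct toy_buf *b)
{
    assert(b != NULL);
    b->next = NULL;
    if (q->tail)
        q->tail->next = b;
    else
        q->head = b;
    q->tail = b;
    q->len++;
}

/* Remove and return the head buffer, or NULL if the queue is empty. */
static struct toy_buf *queue_pop(struct toy_queue *q)
{
    struct toy_buf *b = q->head;

    if (!b)
        return NULL;
    q->head = b->next;
    if (!q->head)
        q->tail = NULL;
    b->next = NULL;
    q->len--;
    return b;
}

/*
 * Model of the quoted -EINVAL branch (lines 685-701).
 * bf is the corrupt buffer the caller already took off the FIFO.
 * Returns the value bf would hold after the branch.
 */
static struct toy_buf *skip_corrupt(struct toy_queue *fifo,
                                    struct toy_queue *freelist,
                                    struct toy_buf *bf)
{
    queue_push(freelist, bf);       /* line 687: recycle the corrupt buffer */

    bf = queue_pop(fifo);           /* lines 690-695: take the next entry */
    if (bf)
        queue_push(freelist, bf);   /* line 696: recycle it as well */
    /* lines 698-700: with an empty FIFO, bf is simply NULL here */

    return bf;
}
```

Calling skip_corrupt() once with one entry left in the FIFO and once with an empty FIFO moves every buffer to the free list in both cases; the only difference is whether the returned pointer is NULL.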


Comment 15 John W. Linville 2012-06-29 10:40:20 EDT

*** This bug has been marked as a duplicate of bug 832927 ***
