Bug 749276 - AR9285 wireless card make system freezed
Summary: AR9285 wireless card make system freezed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-10-26 15:46 UTC by Flos Lonicerae
Modified: 2012-05-26 08:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-05-22 02:21:54 UTC
Type: ---
Embargoed:


Attachments
cmdline (191 bytes, text/plain), 2011-10-26 15:54 UTC, Flos Lonicerae
dmesg (66.75 KB, text/plain), 2011-10-26 15:54 UTC, Flos Lonicerae
iomem (2.12 KB, text/plain), 2011-10-26 15:55 UTC, Flos Lonicerae
installed kernel debugging packages (570 bytes, text/plain), 2011-10-26 15:55 UTC, Flos Lonicerae
kexec-tools version (905 bytes, text/plain), 2011-10-26 15:56 UTC, Flos Lonicerae
lsmod result (2.25 KB, text/plain), 2011-10-26 15:56 UTC, Flos Lonicerae
'uname -a' result (113 bytes, text/plain), 2011-10-26 15:58 UTC, Flos Lonicerae
first crash test on Fedora 16 (1.40 MB, image/jpeg), 2011-10-26 15:59 UTC, Flos Lonicerae
last test on Fedora 16, photo 1 (1.31 MB, image/jpeg), 2011-10-26 16:00 UTC, Flos Lonicerae
last test on Fedora 16, photo 2 (1.39 MB, image/jpeg), 2011-10-26 16:01 UTC, Flos Lonicerae
kdump.conf in /etc (5.13 KB, text/plain), 2011-10-26 16:16 UTC, Flos Lonicerae
atl1c_net_next_update-3.3.patch (123.02 KB, text/plain), 2012-05-15 11:09 UTC, Stanislaw Gruszka
atl1c_net_next_update-3.4.patch (123.77 KB, text/plain), 2012-05-15 11:11 UTC, Stanislaw Gruszka

Description Flos Lonicerae 2011-10-26 15:46:35 UTC
An older report describes a bug in which the AR9285 wireless card freezes the system as soon as the card is enabled. Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=697157

I was able to install Fedora 14 on my Lenovo G475 after Stanislaw Gruszka fixed that bug. Recently, however, I tried to install Fedora 16 on the same notebook and found that the system freeze happens again. I tried all the old methods from the rhbz697157 thread, but none of them helped.

Stanislaw Gruszka kindly told me how to get a kernel core dump for debugging, but it was a bit difficult for me to work out how to do it, so I spent a few days studying kdump. Here are all the problems I found during that time:

I. Using the system-config-kdump package.
This tool runs correctly on RHEL6 some of the time, but it does NOT run on Fedora 16 at all! When I try to run it on Fedora 16, it pops up a lot of error windows and finally exits WITHOUT triggering the debug logging daemon, 'abrtd', so I cannot file a bug report for this tool. I also have RHEL6 on my notebook, which I use for my RHCE6 lessons. Since system-config-kdump can run on RHEL6, I came up with an ugly workaround. First, I compiled the latest Fedora 16 kernel and its dependencies on RHEL6; everything finished successfully. Second, I booted kernel-3.1.0-rc10 on RHEL6 and ran system-config-kdump to configure kdump; as far as I can tell, the initrd-kdump ramfs was then present in my /boot. Then I appended "crashkernel=512M" to the kernel's boot parameters, rebooted, and checked whether there was a "Crash kernel" entry in /proc/iomem; there was. Then I tested with:
echo "1"> /proc/sys/kernel/sysrq  # this step ok
echo c > /proc/sysrq-trigger
The kernel crashed, but it said there was a bug that prevented it from getting a bash shell, and then there was a kernel panic without any core file being dumped...

I also tried the 2.6 kernel that ships with RHEL6 itself. Following the same steps, it did produce a vmcore in my /var/crash/<crash time> directory.
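The two checks described above (the "crashkernel=" boot parameter and the "Crash kernel" entry in /proc/iomem) could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration and are not part of kexec-tools, and the optional file argument is an assumption so that the checks can also be pointed at saved copies such as the attached cmdline and iomem files.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Succeeds if the given kernel command line file contains a crashkernel= parameter.
has_crashkernel_param() {
    grep -q 'crashkernel=' "${1:-/proc/cmdline}"
}

# Succeeds if the given iomem listing shows a reserved "Crash kernel" region.
crash_kernel_reserved() {
    grep -qi 'crash kernel' "${1:-/proc/iomem}"
}

# Example use on a live system:
#   has_crashkernel_param && crash_kernel_reserved && echo "crash kernel reserved"
```

If the first check passes but the second fails, the parameter was given but no memory was actually reserved, which would be worth reporting separately.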

II. Using kexec-tools directly
After reading some tutorials on the web, I decided to configure kdump manually on my Fedora 16. First I disabled my wireless card, updated Fedora 16 to the latest kernel, installed all the kernel debuginfo packages, and installed kexec-tools. Here I found that kexec-tools cannot be installed correctly! This is because kdump.service is broken: its [Install] section is missing, so systemd cannot enable it. All the binaries in kexec-tools were fine, though. So, after I modified kdump.conf and appended the boot parameter 'crashkernel=512M' in /boot/grub2/grub.cfg, I rebooted and ran 'kdumpctl restart' manually. It created an 'initramfs-3.1.0-1.fc16.i686.debugkdump.img' in my /boot, and 'service kdump status' then reported 'Kdump is operational'. I checked /proc/iomem and the crash kernel region was there. I tried to test it:
echo "1"> /proc/sys/kernel/sysrq  # this step ok
echo c > /proc/sysrq-trigger
The kernel crashed, but it did *NOT* produce a vmcore either, and it left my keyboard lights blinking.
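For the kdump.service problem mentioned above, a quick way to confirm that a unit file really lacks an [Install] section might look like this. It is a minimal sketch: the function name is hypothetical, and the unit path in the example is an assumption that may differ between releases.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a systemd unit file contains an [Install]
# section, which "systemctl enable" needs in order to know where to link the unit.
unit_has_install_section() {
    grep -qxF '[Install]' "$1"
}

# Example use (the path is an assumption):
#   unit_has_install_section /lib/systemd/system/kdump.service \
#       || echo "kdump.service has no [Install] section"
```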

The attachments show the environment and configuration of the last test on my Fedora 16. The pictures are the results of my last two tests.

thanks!

Flos

Comment 1 Flos Lonicerae 2011-10-26 15:54:05 UTC
Created attachment 530310 [details]
cmdline

Comment 2 Flos Lonicerae 2011-10-26 15:54:29 UTC
Created attachment 530311 [details]
dmesg

Comment 3 Flos Lonicerae 2011-10-26 15:55:01 UTC
Created attachment 530312 [details]
iomem

Comment 4 Flos Lonicerae 2011-10-26 15:55:52 UTC
Created attachment 530313 [details]
installed kernel debugging packages

Comment 5 Flos Lonicerae 2011-10-26 15:56:24 UTC
Created attachment 530314 [details]
kexec-tools version

Comment 6 Flos Lonicerae 2011-10-26 15:56:50 UTC
Created attachment 530316 [details]
lsmod result

Comment 7 Flos Lonicerae 2011-10-26 15:58:01 UTC
Created attachment 530317 [details]
'uname -a' result

Comment 8 Flos Lonicerae 2011-10-26 15:59:20 UTC
Created attachment 530318 [details]
first crash test on Fedora 16

Comment 9 Flos Lonicerae 2011-10-26 16:00:29 UTC
Created attachment 530319 [details]
last test on Fedora 16, photo 1

Comment 10 Flos Lonicerae 2011-10-26 16:01:28 UTC
Created attachment 530320 [details]
last test on Fedora 16, photo 2

Comment 11 Flos Lonicerae 2011-10-26 16:09:10 UTC
By the way, before each crash test I first disabled the network service and rebooted, then enabled my wireless card in the BIOS, saved the setting, and rebooted again. This way the wireless driver was loaded but the network service was stopped, so I could run the crash test. Otherwise, the system freezes soon after the wireless card starts connecting.

Comment 12 Flos Lonicerae 2011-10-26 16:16:31 UTC
Created attachment 530322 [details]
kdump.conf in /etc

Comment 13 Stanislaw Gruszka 2011-10-29 12:51:25 UTC
I'm sad to hear that kdump does not work.

> the kernel crashed, but it said that there is a bug so that it can not regain a
> bash shell. then a kernel panic without any core files dumped...

So that setup (RHEL6 + 3.1-rc kernel) was almost successful: the secondary (kdump) kernel booted, but did not make a dump? Then let's try to fix that, so that we can catch the ath9k bug. If you modify /etc/kdump.conf to have "default shell" instead of "default reboot", you will be dropped into a small shell that lets you find out why the dump fails (i.e. whether a mount fails, there is a problem reading /proc/vmcore, etc.). BTW: does your custom 3.1-rc kernel have CONFIG_PROC_KCORE and CONFIG_PROC_VMCORE compiled in?
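Both suggestions above can be scripted. The following is a rough sketch only: the function names are hypothetical, the first helper assumes the kdump.conf action is written as a line starting with "default", and the config file path in the example assumes the custom kernel installed its config under /boot.

```shell
#!/bin/sh
# Hypothetical helpers for the two checks suggested above.

# Make a kdump.conf-style file say "default shell": rewrite an existing
# "default ..." line, or append one if there is none. A temporary file is
# used instead of relying on "sed -i".
set_kdump_default_shell() {
    kd_conf=$1
    if grep -q '^default[[:space:]]' "$kd_conf"; then
        kd_tmp=$(mktemp) || return 1
        sed 's/^default[[:space:]].*/default shell/' "$kd_conf" > "$kd_tmp" \
            && cat "$kd_tmp" > "$kd_conf"
        kd_status=$?
        rm -f "$kd_tmp"
        return "$kd_status"
    fi
    printf 'default shell\n' >> "$kd_conf"
}

# Print the CONFIG_PROC_KCORE / CONFIG_PROC_VMCORE lines that are enabled
# in a kernel config file; prints nothing if neither option is set.
show_dump_config() {
    grep -E '^CONFIG_PROC_(KCORE|VMCORE)=' "$1"
}

# Example use as root (paths are assumptions):
#   set_kdump_default_shell /etc/kdump.conf
#   show_dump_config "/boot/config-$(uname -r)"
```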

If there is no way to get kdump working, perhaps it would be possible to take a photo of a crash:

- first blacklist the ath9k module in /etc/modprobe.d/blacklist.conf
- enable wireless in the BIOS
- boot the system and switch to a virtual terminal (Alt+Ctrl+F2)
- log in as root
- run modprobe ath9k
- this should trigger a crash, which should show a call trace
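The blacklisting step in the list above could be done with a small helper like the one below. It is a sketch with a hypothetical function name; it only edits the file it is given, and the remaining steps are shown as comments because they involve a reboot.

```shell
#!/bin/sh
# Hypothetical helper: add "blacklist MODULE" to a modprobe.d-style file
# unless an identical line is already there.
ensure_blacklisted() {
    bl_module=$1
    bl_file=$2
    grep -qxF "blacklist $bl_module" "$bl_file" 2>/dev/null \
        || printf 'blacklist %s\n' "$bl_module" >> "$bl_file"
}

# Example use as root, following the steps above:
#   ensure_blacklisted ath9k /etc/modprobe.d/blacklist.conf
#   (reboot with wireless enabled in the BIOS, switch to a text console, log in)
#   modprobe ath9k   # loading by module name is not blocked by a "blacklist" line
```

The blacklist line only stops the module from being loaded automatically at boot, which is why the explicit modprobe on the text console can still trigger the crash at a moment when the screen is ready to be photographed.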

Comment 14 Stanislaw Gruszka 2011-10-29 13:07:06 UTC
One more thing. Kdump needs its own, separate module blacklist. To keep the kdump kernel from crashing in the ath9k driver, you have to add a

blacklist ath9k

line to /etc/kdump.conf (and restart the kdump service).

You can also blacklist other modules that are not needed, e.g. uvcvideo, snd_*, and so on. The more modules are blacklisted, the better the chance that the kdump kernel boots properly, and the more memory is left for a successful dump. Note that the radeon module is needed for proper display initialization: if it is blacklisted, in the best case you just will not see the kdump kernel booting, but in the worst case the kdump kernel could fail to start.
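As an illustration of the advice above, the blacklist lines for kdump.conf could be generated from lsmod output. This is a sketch under assumptions: the function name is made up, and the selection (ath9k, uvcvideo and every snd_* module that is actually loaded) is just the set of examples named in this comment, not a recommended list.

```shell
#!/bin/sh
# Hypothetical helper: read "lsmod"-style output (from the given file, or from
# standard input if no file is given) and print one kdump.conf "blacklist" line
# for ath9k, uvcvideo and every snd_* module found in it.
# radeon is deliberately never listed, since it is needed for the display.
kdump_blacklist_lines() {
    awk '$1 == "ath9k" || $1 == "uvcvideo" || $1 ~ /^snd_/ {
             print "blacklist " $1
         }' "$@"
}

# Example use as root:
#   lsmod | kdump_blacklist_lines >> /etc/kdump.conf
#   (then restart the kdump service)
```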

Comment 15 Stanislaw Gruszka 2012-02-26 12:14:28 UTC
Flos, various ath9k fixes have been committed since 10/2011. Does the problem still occur with the latest Fedora 16 kernel, 3.2?

Comment 16 Flos Lonicerae 2012-02-27 16:12:30 UTC
Hi Stanislaw,

This problem still occurs with the latest Fedora 16 kernel 3.2, but I think I have found the cause. When both my wireless network adapter and my wired network adapter are enabled, the system ALWAYS freezes! But if I blacklist the atl1c module in /etc/modprobe.d/blacklist.conf and regenerate the initramfs, the problem disappears completely, although my wired network adapter then cannot be used because the atl1c module is not loaded.

That is all I can do. Without the wired network adapter module, the wireless card works without any problem. Likewise, if I disable my wireless card in the BIOS settings or blacklist the ath and ath9k modules, the wired network adapter works smoothly! What a strange problem.

Since I have set up a WiFi AP at home, I am now working with the atl1c module blacklisted.
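To tell whether a machine is in the problematic state described above (both drivers loaded at the same time), a check along these lines could be used. It is only a sketch with a hypothetical function name; the use of "dracut -f" to regenerate the initramfs is an assumption, since the tool is not named here.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every named module appears in the
# given "lsmod"-style listing (first column, exact match).
all_loaded() {
    al_listing=$1
    shift
    for al_mod in "$@"; do
        awk -v m="$al_mod" '$1 == m { found = 1 } END { exit !found }' "$al_listing" \
            || return 1
    done
}

# Example use:
#   lsmod > /tmp/lsmod.txt
#   all_loaded /tmp/lsmod.txt ath9k atl1c && echo "both ath9k and atl1c are loaded"
# Workaround as root (dracut is an assumption):
#   echo "blacklist atl1c" >> /etc/modprobe.d/blacklist.conf
#   dracut -f
```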

i always wonder if the problem can be finally resolved. ;)

Flos

Comment 17 Stanislaw Gruszka 2012-02-29 12:46:48 UTC
Good catch. atl1c does some ASPM quirks on PCI bridges, the same as ath9k. Apparently neither of them should do this; they should either rely on the system settings, or make their changes in a way that keeps the PCI core aware of them.
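For anyone who wants to look at the ASPM state on an affected machine, a minimal sketch follows. It does not show what either driver does; it only reads the kernel-wide ASPM policy, assuming the usual format of /sys/module/pcie_aspm/parameters/policy in which the active policy is the one in square brackets. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the active ASPM policy from a file in the format
# "[default] performance powersave", where the bracketed word is the active one.
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "${1:-/sys/module/pcie_aspm/parameters/policy}"
}

# Example use:
#   active_aspm_policy
#   lspci -vv | grep -i aspm   # per-device link settings; may need root
```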

> i always wonder if the problem can be finally resolved. ;)

I plan to work on that, but not sure when.

Comment 18 Flos Lonicerae 2012-02-29 13:15:34 UTC
Thanks! Waiting for your good news!

Comment 19 Dave Jones 2012-03-22 16:55:51 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 20 Dave Jones 2012-03-22 16:59:32 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 21 Dave Jones 2012-03-22 17:10:54 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 22 Stanislaw Gruszka 2012-03-22 20:02:35 UTC
This is not fixed...

Comment 23 Stanislaw Gruszka 2012-05-03 08:19:43 UTC
I have not worked on it yet, but I saw that an Atheros developer posted atl1c patches to net-next that change various register programming code in atl1c, including ASPM. The kernel build below includes the atl1c driver update from net-next. Please check whether it solves the problem:

http://koji.fedoraproject.org/koji/taskinfo?taskID=4045082

Comment 24 Flos Lonicerae 2012-05-03 15:55:16 UTC
(In reply to comment #23)
> I did not worked on it yet, but saw that atheros developer post atl1c patches
> to net-next that changes various register programming code on alt1c, including
> ASPM. Below kernel build include atl1c driver update from net-next. Please
> check if that solve the problem:
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=4045082

Hi Stanislaw,

I downloaded the kernel you just built and installed it. It seems to have resolved the problem!!! :D
I have to go to bed now; I'll post the detailed steps I followed later. Have a nice day!

Flos

Comment 25 Stanislaw Gruszka 2012-05-15 11:09:42 UTC
Created attachment 584639 [details]
atl1c_net_next_update-3.3.patch

atl1c update from net-next for 3.3 kernel

Comment 26 Stanislaw Gruszka 2012-05-15 11:11:54 UTC
Created attachment 584640 [details]
atl1c_net_next_update-3.4.patch

atl1c update from net-next for 3.4 kernel

Comment 27 Stanislaw Gruszka 2012-05-15 11:16:38 UTC
Josh, please apply the above 3.3 patch as the fix for this bug. I have also attached a patch for the 3.4 kernel in case Fedora updates to that kernel (it applies cleanly on 3.4-rc7).

This update includes mostly register programming fixes from Atheros.

Comment 28 Josh Boyer 2012-05-15 12:05:15 UTC
Applied on all branches.  Thanks Stanislaw!

Comment 29 Fedora Update System 2012-05-17 13:45:30 UTC
kernel-3.3.6-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.3.6-3.fc16

Comment 30 Fedora Update System 2012-05-17 13:47:03 UTC
kernel-3.3.6-3.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.3.6-3.fc17

Comment 31 Fedora Update System 2012-05-17 22:56:13 UTC
Package kernel-3.3.6-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.6-3.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-7974/kernel-3.3.6-3.fc17
then log in and leave karma (feedback).

Comment 32 Fedora Update System 2012-05-22 02:21:54 UTC
kernel-3.3.6-3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 33 Fedora Update System 2012-05-26 08:07:39 UTC
kernel-3.3.6-3.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

