Bug 754305 - Kernel deadlocks when both brcm80211 and atl1c modules installed
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-11-16 02:34 UTC by Darryl Bond
Modified: 2012-07-13 23:27 UTC (History)
Doc Type: Bug Fix
Last Closed: 2012-07-13 23:27:52 UTC

Description Darryl Bond 2011-11-16 02:34:23 UTC
Description of problem: Acer AO522 Netbook
The kernel deadlocks, requiring a hard reset, when both the atl1c ethernet module and the brcm80211 wireless module are enabled after boot.


Version-Release number of selected component (if applicable):
00:00.0 Host bridge: Advanced Micro Devices [AMD] Family 14h Processor Root Complex
00:01.0 VGA compatible controller: ATI Technologies Inc Device 9804
00:01.1 Audio device: ATI Technologies Inc Wrestler HDMI Audio [Radeon HD 6250/6310]
00:11.0 SATA controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
00:12.0 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.2 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.2 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 42)
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia (Intel HDA) (rev 40)
00:14.3 ISA bridge: ATI Technologies Inc SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge (rev 40)
00:15.0 PCI bridge: ATI Technologies Inc SB700/SB800 PCI to PCI bridge (PCIE port 0)
00:15.2 PCI bridge: ATI Technologies Inc Device 43a2
00:15.3 PCI bridge: ATI Technologies Inc Device 43a3
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 0 (rev 43)
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 6
00:18.6 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 5
00:18.7 Host bridge: Advanced Micro Devices [AMD] Family 12h/14h Processor Function 7
06:00.0 Ethernet controller: Atheros Communications AR8152 v2.0 Fast Ethernet (rev c1)
07:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)

Kernel Version 3.1.1-1.fc16.x86_64


Module                  Size  Used by
tcp_lp                  2111  0 
ppdev                   7360  0 
parport_pc             19600  0 
lp                      9581  0 
parport                32310  3 ppdev,parport_pc,lp
fuse                   61671  3 
bnep                   14195  2 
bluetooth             202902  7 bnep
ip6t_REJECT             4008  2 
nf_conntrack_ipv6       7714  1 
nf_defrag_ipv6          9115  1 nf_conntrack_ipv6
xt_state                1306  1 
nf_conntrack           67597  2 nf_conntrack_ipv6,xt_state
ip6table_filter         1655  1 
ip6_tables             16776  1 ip6table_filter
arc4                    1417  2 
brcmsmac              497715  0 
mac80211              251806  1 brcmsmac
brcmutil                4513  1 brcmsmac
uvcvideo               56989  0 
videodev               78689  1 uvcvideo
snd_hda_codec_conexant    55283  1 
cfg80211              151125  2 brcmsmac,mac80211
media                  11511  2 uvcvideo,videodev
snd_hda_codec_hdmi     23548  1 
sparse_keymap           3358  0 
rfkill                 16336  4 bluetooth,cfg80211
microcode              18539  0 
joydev                  9567  0 
snd_hda_intel          24072  1 
v4l2_compat_ioctl32     7665  1 videodev
crc8                    1356  1 brcmsmac
k10temp                 3295  0 
cordic                  1150  1 brcmsmac
snd_hda_codec          85181  3 snd_hda_codec_conexant,snd_hda_codec_hdmi,snd_hda_intel
snd_hwdep               6264  1 snd_hda_codec
snd_seq                52186  0 
snd_seq_device          5941  1 snd_seq
snd_pcm                78498  3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
sp5100_tco              5261  0 
i2c_piix4              10502  0 
snd_timer              19372  2 snd_seq,snd_pcm
snd                    63124  11 snd_hda_codec_conexant,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_seq,snd_seq_device,snd_pcm,snd_timer
soundcore               6267  1 snd
snd_page_alloc          7311  2 snd_hda_intel,snd_pcm
uinput                  7230  0 
video                  12388  0 
wmi                     9049  0 
radeon                691713  3 
ttm                    55029  1 radeon
drm_kms_helper         26490  1 radeon
drm                   194476  5 radeon,ttm,drm_kms_helper
i2c_algo_bit            4958  1 radeon
i2c_core               25728  6 videodev,i2c_piix4,radeon,drm_kms_helper,drm,i2c_algo_bit



How reproducible:
The problem has been present on this hardware under Fedora 15 with the proprietary STA wl driver, and now under F16 with kernel 3.1.1-1 using brcm80211.

I logged the problem on the RPM Fusion site but the lockup occurs on the open Broadcom driver as well, so I suspect the problem to be the atl1c driver.

Steps to Reproduce:
1. Default install
2. Boot the computer
3. Log in and attempt to use wireless
  
Actual results:
Netbook deadlocks and does not log anything to syslog. I can only recover with a power cycle. I cannot see any meaningful messages in syslog or on the console.

Expected results:
No lockup.

Additional info:
Lockups do not occur if I blacklist atl1c.
I can also modprobe atl1c after a successful boot and use of the wireless and use the ethernet port.

This workaround also applied to the wl driver!!

Comment 1 John W. Linville 2011-11-16 21:51:23 UTC
Does blacklisting the brcmsmac driver also allow for a successful boot?

Comment 2 Darryl Bond 2011-11-17 00:49:16 UTC
yes, it works fine with just the atl1c.

I did some more testing yesterday. If I boot with the atl1c blacklisted and probe it in after I have logged in and connected over wireless, it runs fine off the wireless for several minutes, but eventually locks solid.

Comment 3 John W. Linville 2011-11-17 15:01:24 UTC
Could you boot as you describe in comment 2, then capture the contents of /proc/interrupts (before it locks up!) and post that here?

Comment 4 Darryl Bond 2011-11-17 21:10:14 UTC
Wireless operating and logged in

[root@Netbook ~]# cat /proc/interrupts 
           CPU0       CPU1       
  0:        141          1   IO-APIC-edge      timer
  1:         18        352   IO-APIC-edge      i8042
  7:          1          0   IO-APIC-edge    
  8:          1          0   IO-APIC-edge      rtc0
  9:          1        227   IO-APIC-fasteoi   acpi
 12:         90       3544   IO-APIC-edge      i8042
 16:          1        330   IO-APIC-fasteoi   snd_hda_intel
 17:          1         36   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2
 18:          0          0   IO-APIC-fasteoi   ohci_hcd:usb3, ohci_hcd:usb4
 19:        364      19104   IO-APIC-fasteoi   ahci, brcmsmac
 40:        153      11592   PCI-MSI-edge      radeon
 41:          1        106   PCI-MSI-edge      snd_hda_intel
NMI:          4          4   Non-maskable interrupts
LOC:      50459      59470   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          4          4   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:      17456      13181   Rescheduling interrupts
CAL:        251        147   Function call interrupts
TLB:        782        700   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          1
MIS:          0
[root@Netbook ~]# modprobe atl1c
[root@Netbook ~]# cat /proc/interrupts 
           CPU0       CPU1       
  0:        141          1   IO-APIC-edge      timer
  1:         18        352   IO-APIC-edge      i8042
  7:          1          0   IO-APIC-edge    
  8:          1          0   IO-APIC-edge      rtc0
  9:          1        227   IO-APIC-fasteoi   acpi
 12:         90       3544   IO-APIC-edge      i8042
 16:          1        330   IO-APIC-fasteoi   snd_hda_intel
 17:          1         36   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2
 18:          0          0   IO-APIC-fasteoi   ohci_hcd:usb3, ohci_hcd:usb4
 19:        417      20185   IO-APIC-fasteoi   ahci, brcmsmac
 40:        167      13021   PCI-MSI-edge      radeon
 41:          1        106   PCI-MSI-edge      snd_hda_intel
 42:          0          0   PCI-MSI-edge      eth1
NMI:          4          4   Non-maskable interrupts
LOC:      52039      62372   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          4          4   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:      17858      13560   Rescheduling interrupts
CAL:        259        159   Function call interrupts
TLB:        788        707   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          1
MIS:          0
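
The two snapshots above can be compared mechanically. What follows is a minimal Python sketch, not part of the original report, that assumes the usual /proc/interrupts layout (a CPU header row followed by "label: counts... description" rows). The function names and the abbreviated BEFORE/AFTER samples are illustrative; the sample rows are copied from the output above.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                      # skip anything that is not a row
        fields = rest.split()
        counts = []
        # Leading numeric fields are per-CPU counters; rows such as ERR/MIS
        # carry fewer counters than there are CPUs.
        while fields and len(counts) < ncpu and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        table[label.strip()] = (sum(counts), " ".join(fields))
    return table


def interrupt_deltas(before, after):
    """Return {label: increase in total count}; new lines count from zero."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    return {label: total - old.get(label, (0, ""))[0]
            for label, (total, _) in new.items()}


def shared_lines(text):
    """Return {label: [devices]} for numbered IRQs with more than one device."""
    shared = {}
    for label, (_, desc) in parse_interrupts(text).items():
        if not label.isdigit():
            continue
        # First word is the interrupt controller type, the rest are devices.
        devices = [d.strip() for d in desc.partition(" ")[2].split(",") if d.strip()]
        if len(devices) > 1:
            shared[label] = devices
    return shared


# Abbreviated samples taken from the two captures in this comment.
BEFORE = """
           CPU0       CPU1
 19:        364      19104   IO-APIC-fasteoi   ahci, brcmsmac
 40:        153      11592   PCI-MSI-edge      radeon
ERR:          1
"""

AFTER = """
           CPU0       CPU1
 19:        417      20185   IO-APIC-fasteoi   ahci, brcmsmac
 40:        167      13021   PCI-MSI-edge      radeon
 42:          0          0   PCI-MSI-edge      eth1
ERR:          1
"""

if __name__ == "__main__":
    print(interrupt_deltas(BEFORE, AFTER))
    print(shared_lines(AFTER))
```

In the abbreviated sample, line 19 is reported as shared between ahci and brcmsmac, and line 42 (eth1) appears only in the second capture with no interrupts counted yet.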

Comment 5 Darryl Bond 2011-11-17 21:16:21 UTC
Tail end of dmesg with both modules inserted:

[   30.782124] wlan0: authenticate with 00:13:10:92:16:60 (try 1)
[   30.785431] wlan0: authenticated
[   30.789455] wlan0: associate with 00:13:10:92:16:60 (try 1)
[   30.796398] wlan0: RX AssocResp from 00:13:10:92:16:60 (capab=0x411 status=0 aid=8)
[   30.796412] wlan0: associated
[   30.798105] ieee80211 phy0: brcms_ops_bss_info_changed: qos enabled: true (implement)
[   30.798121] ieee80211 phy0: brcmsmac: brcms_ops_bss_info_changed: associated
[   30.798135] ieee80211 phy0: changing basic rates failed: -22
[   30.798147] ieee80211 phy0: brcms_ops_bss_info_changed: arp filtering: enabled true, count 0 (implement)
[   30.800092] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   34.708703] hda-intel: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[   38.857025] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
[   38.857056] SELinux: initialized (dev dm-2, type ext4), uses xattr
[   41.618310] wlan0: no IPv6 routers present
[   51.005793] ieee80211 phy0: brcms_ops_bss_info_changed: arp filtering: enabled true, count 1 (implement)
[   86.864226] fuse init (API version 7.17)
[   86.900567] SELinux: initialized (dev fusectl, type fusectl), uses genfs_contexts
[   86.919754] SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
[   95.615216] lp: driver loaded but no devices found
[   95.706620] ppdev: user-space parallel port driver
[  137.512375] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[  137.697883] ip6_tables: (C) 2000-2006 Netfilter Core Team
[  275.206922] atl1c 0000:06:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[  275.207057] atl1c 0000:06:00.0: setting latency timer to 64
[  275.309420] atl1c 0000:06:00.0: version 1.0.1.0-NAPI
[  275.347928] udevd[1535]: renamed network interface eth0 to eth1
[  275.384415] atl1c 0000:06:00.0: irq 42 for MSI/MSI-X
[  275.473859] ADDRCONF(NETDEV_UP): eth1: link is not ready
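
The log above shows atl1c first being routed to legacy IRQ 18 and then switching to MSI vector 42. A small Python sketch for pulling such assignments out of a dmesg capture is given here for convenience. It is not from the original report, the function name irq_assignments is hypothetical, and the regular expressions only cover the two message shapes seen in this log; other kernel versions may word these lines differently.

```python
import re

# Patterns for the two message shapes in the log above (assumption: other
# kernels may format these messages differently).
LEGACY_RE = re.compile(r"\] (\S+) (\S+): PCI INT (\w) -> GSI (\d+) .*-> IRQ (\d+)")
MSI_RE = re.compile(r"\] (\S+) (\S+): irq (\d+) for MSI/MSI-X")


def irq_assignments(dmesg_text):
    """Return a list of (driver, pci_address, kind, irq) tuples in log order."""
    found = []
    for line in dmesg_text.splitlines():
        m = LEGACY_RE.search(line)
        if m:
            found.append((m.group(1), m.group(2), "legacy", int(m.group(5))))
            continue
        m = MSI_RE.search(line)
        if m:
            found.append((m.group(1), m.group(2), "msi", int(m.group(3))))
    return found


# Lines copied from the dmesg tail in this comment.
SAMPLE = """\
[  275.206922] atl1c 0000:06:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[  275.207057] atl1c 0000:06:00.0: setting latency timer to 64
[  275.384415] atl1c 0000:06:00.0: irq 42 for MSI/MSI-X
"""

if __name__ == "__main__":
    for entry in irq_assignments(SAMPLE):
        print(entry)
```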

Comment 6 John W. Linville 2011-12-02 15:15:18 UTC
From "Arend van Spriel" <arend>:

"I had a look in our driver and it is pretty straight-forward. Checks if
the interrupt is ours look at the chip interrupt status and return
accordingly.

In atl1c a do-while loop is used in the isr with a number of possible
code paths changing the return value. So I don't dare to predict that
behaviour. Not seeing an obvious mistake there though."

Comment 7 Sean Sheedy 2011-12-08 23:21:30 UTC
I see the same thing on an Acer Aspire One using the same drivers in Fedora 15.  Additional data:

wifi on, ether plugged in:  works fine
wifi off, ether unplugged:  works fine

wifi on, ether unplugged, boot order HDD first:      hangs
wifi on, ether unplugged, boot order network first:  works fine

Note that with the network first in the boot order, the BIOS accesses the ethernet controller to search for a PXE boot image.  This implies to me that the likely culprit is the atl1c driver not initializing something.

So setting the network first in the boot order will work around the problem, although it seems potentially dangerous if you're plugged into a public ethernet.

Comment 8 Darryl Bond 2011-12-13 23:01:59 UTC
Works for me too.
* Updated to 3.1.5 kernel
* Remove blacklist from atl1c
* reboot
* Deadlocks at logon screen
* Restart
* Change BIOS to PXE boot first
* Cable unplugged so PXE fails immediately
* Continues to boot from HD
* Boot to login works fine
* Both atl1c and wireless modules installed together.

Comment 9 Darryl Bond 2011-12-28 02:06:53 UTC
Updated to 3.1.6-1.fc16.x86_64.
The fault seems to have gone away. 
I no longer need to blacklist one or the other, nor do I need to enable PXE boot.

I notice that the brcmsmac module versions changed while the atl1c version has not.

Comment 10 Darryl Bond 2011-12-29 07:19:27 UTC
Oops, the fault does still occur, just not every time.
It locked up again, although I had 6 successful boots beforehand.

Comment 11 Darryl Bond 2012-07-13 23:27:52 UTC
Somewhere around 3.4.3 this bug was fixed. It no longer requires PXE boot enabled to prevent the deadlocks.

