436538 – NetworkManager causes disconnect: "Unexpected scan data length..."

Summary: NetworkManager causes disconnect: "Unexpected scan data length..."

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	8
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	John W. Linville
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2008-03-07 19:32 UTC by Evan McNabb
Modified:	2008-03-11 02:53 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-03-11 02:53:46 UTC
Type:	---
Embargoed:
Dependent Products:

Description Evan McNabb 2008-03-07 19:32:19 UTC

Description of problem:

I am able to connect to my AP with NetworkManager just fine, but eventually it
gets disconnected. This happens almost immediately if large amounts of data are
being transferred. NetworkManager cannot reconnect, and the only solution is to
reload the wireless modules and restart NM.

This could definitely be an orinoco driver issue, but it only happens when
NetworkManager is running. If the ESSID, WEP key, etc. are statically configured,
it works as expected. If this is a driver issue, I can change the Component to
kernel.
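
For anyone needing the same workaround, the reload described above might look roughly like the sketch below. The module names come from the lsmod output in this report; the `run` helper, the `DRY_RUN` variable, and the `reload_wireless` function name are illustrative only, and the service name assumes a stock Fedora init script.

```shell
#!/bin/sh
# Sketch: reload the orinoco driver stack and restart NetworkManager.
# DRY_RUN and run() are hypothetical helpers for this example:
# with DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_wireless() {
    # Unload in dependency order: orinoco_pci uses orinoco and hermes.
    run modprobe -r orinoco_pci orinoco hermes
    # Loading orinoco_pci pulls its dependencies back in.
    run modprobe orinoco_pci
    # Assumes the service is named NetworkManager, as on Fedora.
    run service NetworkManager restart
}
```

Running `DRY_RUN=1 reload_wireless` first shows what would be executed; the real run needs root.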

Version-Release number of selected component (if applicable):

NetworkManager-gnome-0.7.0-0.6.7.svn3370.fc8
NetworkManager-openvpn-0.7.0-8.svn3302.fc8
NetworkManager-0.7.0-0.6.7.svn3370.fc8
NetworkManager-glib-0.7.0-0.6.7.svn3370.fc8
libnl-1.1-1.fc8

This occurs on both the 2.6.24.3-12.fc8 (i686) and 2.6.23.15-137.fc8 (i686) kernels.

# lsmod |grep orinoco
orinoco_pci             8897  0
orinoco                39125  1 orinoco_pci
hermes                  9921  2 orinoco_pci,orinoco

# lspci |grep Network
02:02.0 Network controller: Intersil Corporation Prism 2.5 Wavelan chipset (rev 01)

How reproducible:
On this particular laptop, it can be easily triggered by normal usage.

Steps to Reproduce:
1. Connect to AP
2. Copy data to other host on network
  
Actual results:
Network disconnects and cannot be rejoined.

Expected results:
It should not get disconnected.

Additional info:

Is this possibly related to these?
https://bugzilla.redhat.com/show_bug.cgi?id=317691
http://marc.info/?l=linux-wireless&m=119194675024816&w=2

I tried setting "options orinoco ignore_disconnect=1" which didn't help.

For reference, the following messages appear on a statically defined,
NetworkManager-less connection (even though there are status changes, the
network continues to function from a user's perspective):
============
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Tx timeout! ALLOCFID=ffff, TXCOMPLFID=ffff, EVSTAT=8000
eth1: New link status: UNKNOWN (0008)
eth1: New link status: Connected (0001)
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Tx timeout! ALLOCFID=ffff, TXCOMPLFID=ffff, EVSTAT=8000
eth1: New link status: UNKNOWN (0008)
eth1: New link status: Connected (0001)
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Tx timeout! ALLOCFID=ffff, TXCOMPLFID=ffff, EVSTAT=8000
eth1: New link status: UNKNOWN (0008)
eth1: New link status: Connected (0001)
============

Here are messages from two sample disconnects:

============
eth1: Information frame lost.
printk: 932 messages suppressed.
eth1: Information frame lost.
printk: 789 messages suppressed.
eth1: Information frame lost.
eth1: Information frame lost.
eth1: Information frame lost.
eth1: IRQ handler is looping too much! Resetting.
eth1: IRQ handler is looping too much! Resetting.
eth1: New link status: UNKNOWN (0008)
eth1: Invalid atom_len in scan data: 5
eth1: Unexpected scan data length 66, atom_len 55029, offset 4
eth1: New link status: Disconnected (0002)
eth1: Unexpected scan data length 190, atom_len 42714, offset 4
eth1: Unexpected scan data length 190, atom_len 23666, offset 4
eth1: Unexpected scan data length 190, atom_len 304, offset 4
eth1: Unexpected scan data length 128, atom_len 40252, offset 4
eth1: Unexpected scan data length 128, atom_len 22889, offset 4
eth1: Unexpected scan data length 190, atom_len 703, offset 4
eth1: Unexpected scan data length 252, atom_len 62032, offset 4
eth1: Unexpected scan data length 128, atom_len 953, offset 4
eth1: Unexpected scan data length 252, atom_len 2867, offset 4
eth1: Unexpected scan data length 128, atom_len 62032, offset 4
eth1: Unexpected scan data length 190, atom_len 5094, offset 4
eth1: Unexpected scan data length 128, atom_len 5095, offset 4
============

============
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Tx timeout! ALLOCFID=ffff, TXCOMPLFID=ffff, EVSTAT=8000
eth1: New link status: UNKNOWN (0008)
eth1: Unexpected scan data length 66, atom_len 49160, offset 4
eth1: Unexpected scan data length 66, atom_len 24632, offset 4
eth1: New link status: Disconnected (0002)
eth1: Unexpected scan data length 190, atom_len 39518, offset 4
eth1: Unexpected scan data length 252, atom_len 215, offset 4
eth1: Unexpected scan data length 190, atom_len 214, offset 4
eth1: Unexpected scan data length 190, atom_len 65019, offset 4
eth1: Unexpected scan data length 314, atom_len 19206, offset 4
eth1: Invalid atom_len in scan data: 0
============
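
To compare disconnects, the scan errors in logs like the two above can be tallied with a small filter. This is only a sketch: the function name `summarize_scan_errors` is made up for this example, and it matches the message wording exactly as it appears in this report. It does not reproduce the driver's own validation, it just counts messages and reports the largest atom_len seen.

```shell
# Sketch: summarize orinoco scan errors from kernel log text on stdin.
summarize_scan_errors() {
    awk '
        /Unexpected scan data length/ {
            total++
            # The value follows the "atom_len" word, e.g. "atom_len 55029,"
            for (i = 1; i < NF; i++) {
                if ($i == "atom_len") {
                    atom = $(i + 1) + 0   # numeric prefix, drops the comma
                    if (atom > max) max = atom
                }
            }
        }
        /Invalid atom_len in scan data/ { invalid++ }
        END {
            printf "unexpected=%d invalid=%d max_atom_len=%d\n", total, invalid, max
        }
    '
}
```

Typical use would be `dmesg | summarize_scan_errors`.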

Comment 1 Dan Williams 2008-03-07 22:27:27 UTC

Most definitely a driver issue; moving to kernel.

That said, since your card is a prism 2.5, you can use the hostap driver which
might work better.  Could you try that?  Add "orinoco" and "orinoco_pci" to the
modprobe blacklist (/etc/modprobe.d/blacklist) and let hostap drive the card.

If you ever have to rmmod the module to get stuff working again, it's usually
some driver issue.  It looks here like the card's firmware is getting hung up. 
Those atom lengths also look really, really wrong.  Something isn't right in the
firmware here.

Can you report the firmware versions that your card is using by including the
relevant parts of the 'dmesg' ?  It should report the firmware there when you
start up.
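
A rough sketch of both suggestions follows. The blacklist path is the one named above; the function names and the grep pattern for dmesg are assumptions, since the exact wording of the firmware line depends on the driver.

```shell
# Sketch: blacklist the orinoco modules so hostap can claim the card.
# The optional argument lets you try it on a scratch file first.
blacklist_orinoco() {
    file="${1:-/etc/modprobe.d/blacklist}"
    for mod in orinoco orinoco_pci; do
        # Append only if the exact entry is not already present.
        grep -qx "blacklist $mod" "$file" 2>/dev/null ||
            echo "blacklist $mod" >> "$file"
    done
}

# Sketch: pick out likely firmware lines from kernel log text on stdin.
# The pattern is a guess, not the driver's documented output.
show_firmware_lines() {
    grep -i -E 'firmware|orinoco|hermes'
}
```

Run `blacklist_orinoco` as root, reboot or reload the modules, then check `dmesg | show_firmware_lines`.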

Comment 2 Evan McNabb 2008-03-11 02:53:46 UTC

Thanks for the good advice! It looks like a firmware update fixed it. I'll
record what I did in case others stumble across this in the future. :-)

I initially tried using the hostap driver but had similar issues with lots of
disconnects and invalid packet messages. NM could scan and see the AP, but was
never able to connect. Statically configured connections seemed to work ok.

I then went the firmware update route by installing hostap-utils from ATrpms and
downloading the firmware files from http://linux.junsun.net/intersil-prism/. The
steps to view and flash the firmware using hostap-utils were:

============
# hostap_diag wifi0
Host AP driver diagnostics information for 'wifi0'

NICID: id=0x8022 v1.0.0 (PRISM III Mini-PCI (SST parallel flash))
PRIID: id=0x0015 v1.1.1
STAID: id=0x001f v1.5.6 (station firmware)

# prism2_srec -v -O /proc/net/hostap/wlan0/pda -f wifi0 PK010101.HEX SF010802.HEX
. . .
Components after download:
  NICID: 0x8022 v1.0.0
  PRIID: 0x0015 v1.1.1
  STAID: 0x001f v1.8.2
============
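
To confirm the station firmware version before and after flashing, the STAID line can be pulled out of the hostap_diag output. This is a minimal sketch that assumes the output format shown above; `station_fw_version` is a made-up helper name.

```shell
# Sketch: print the station firmware version (e.g. "1.5.6") from
# `hostap_diag wifi0` output supplied on stdin.
station_fw_version() {
    sed -n 's/^STAID: id=0x[0-9a-fA-F]* v\([0-9.]*\).*/\1/p'
}
```

For example: `hostap_diag wifi0 | station_fw_version`.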

After updating, I rolled back to the orinoco driver since I knew that was
working previously. I have yet to see any disconnects or errors in dmesg.

Thanks again for the help.
