Bug 453124 - after resume, wireless connection fails to reconnect
Summary: after resume, wireless connection fails to reconnect
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 9
Hardware: i386
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Dan Williams
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 456039 478588
Depends On:
Blocks:
Reported: 2008-06-27 13:01 UTC by Jonathan Pritchard
Modified: 2009-07-14 17:53 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2009-07-14 17:53:10 UTC
Type: ---
Embargoed:


Attachments
selinux troubleshoot message output after making the /etc/pm/config.d file suspend_modules (2.56 KB, text/plain)
2008-07-02 14:52 UTC, Jonathan Pritchard
2nd setroubleshoot output, pm-suspend (2.57 KB, text/plain)
2008-07-05 21:01 UTC, Jonathan Pritchard

Description Jonathan Pritchard 2008-06-27 13:01:10 UTC
NetworkManager after resuming from a suspend fails to reconnect to previously
connected wireless network.

Instead, repeatedly requests the network's password. Even when entered correctly
multiple times, it persists in this behaviour.

After a period of failed network-password entries, despite being correct,
NetworkManager appears to disable the wlan0 device in system-config-network. 

It is reproducible all the time. 


Hardware: Thinkpad T41p. Atheros chipset. 

Software: Madwifi driver 0.9.4 from livna. Latest updates as of (27/06/08)
including latest kernel '2.6.25.6-55.fc9.i686'.

Wireless network: WPA2-PSK AES encryption.

Reproducible all the time.

Steps to Reproduce:
1. Connect to wireless network (or automatically done at startup)
2. Suspend.
3. Log back in.

Expected results:
After resuming and logging in, NetworkManager should reconnect quickly to the
previously connected wireless network, as nothing has changed, not the network,
the password - nothing.

I am more than happy to provide any further information, but am rather new to
Linux and Fedora. Overall a much better experience with Fedora than in Core 5
and 6 though, including NetworkManager working for me apart from this issue. Thanks.

Comment 1 Jonathan Pritchard 2008-07-02 14:43:17 UTC
I tested this on a WEP 128-bit encrypted network and had the same bad luck: the
connection did not resume after a suspend. This time, it didn't even ask for the
network secret.

However, I entered the WEP network's secret in hex, not as the ASCII passphrase
I used on my WPA2-PSK network, which leads me to believe this might be a factor.

Someone else on fedoraforum.org had a similar problem and suggested I add a file
titled 'suspend_modules' in /etc/pm/config.d. At first this didn't remedy
anything and threw up an SELinux troubleshoot message, which I will attach. This
might be a problem as well, because it was only after a combination of these
things that the network resumed properly.

My steps to remedy it so far are summarised below:

1. Use a hex key instead of an ascii key.
2. Add a file titled suspend_modules in /etc/pm/config.d containing the line:

SUSPEND_MODULES='ath_pci'
3. Put SELINUX into permissive mode to get around the setroubleshooter message.

I will try and whittle down which of these changes made it work.

Comment 2 Jonathan Pritchard 2008-07-02 14:52:59 UTC
Created attachment 310796
selinux troubleshoot message output after making the /etc/pm/config.d file suspend_modules

Comment 3 Jonathan Pritchard 2008-07-03 10:02:17 UTC
I just tested on my WPA2-PSK network which I use a passphrase with, as opposed
to the hex key I used on the WEP network.

The WPA2-PSK network is the one where NM repeatedly asked for my network secret,
despite it being correct. It worked fine after the changes documented in
previous comments. This rules out my hex vs. ascii argument for explaining why
NM repeatedly requested the network secret, but never connected.

So the workaround for this bug is as follows, and should be somehow incorporated
into the next release if possible. If someone could explain why this works,
someone who knows more, that'd be very helpful also.

1. Set SELinux to permissive mode, to work around the warning that it will
generate (the setroubleshoot output is attached to this bug; creating an
exception by default in SELinux would be good, so we can keep enforcing mode).

2. Create a file in '/etc/pm/config.d' called 'suspend_modules'.

3. Add the following line to the file:

SUSPEND_MODULES='ath_pci'
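
Steps 2 and 3 above can be sketched as a small shell function. This is only an illustration: the function name write_suspend_modules and its directory argument are assumptions made here so the file can be written to a scratch directory first, while the path /etc/pm/config.d and the module name ath_pci are the ones given in this report.

```shell
#!/bin/sh
# Sketch only: writes the pm-utils override from steps 2 and 3.
# write_suspend_modules is a hypothetical helper name, not part of pm-utils.

write_suspend_modules() {
    dir="$1"      # target directory, e.g. /etc/pm/config.d
    module="$2"   # kernel module to unload around suspend, e.g. ath_pci
    mkdir -p "$dir" || return 1
    printf "SUSPEND_MODULES='%s'\n" "$module" > "$dir/suspend_modules"
}

# Example (as root): write_suspend_modules /etc/pm/config.d ath_pci
```

Pointing the function at a temporary directory first shows the exact line that would be written before anything under /etc is touched.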

--

Note that so far I have only seen this bug with madwifi; I do not have any
evidence or facilities to test whether it occurs on other wireless chipsets.
However, another user, johannlo, with a T42 with an Atheros chipset but using
ath5k, also encountered
this: http://forums.fedoraforum.org/showthread.php?p=1040305#post1040305

Comment 4 Jonathan Pritchard 2008-07-05 20:58:52 UTC
setroubleshoot recommends that I should run "restorecon -v './suspend_modules'"
- inside the speech marks. You need to navigate to /etc/pm/config.d/ before
running it.

I also tested the fix by removing the suspend_modules file from
/etc/pm/config.d/ and suspending in both permissive and enforcing selinux modes.
Without this file, wlan0 just wouldn't come back up.

I notice now in setroubleshoot that there are two logs, the other log suggests
that the user does the following - "restorecon -v
'/etc/pm/config.d/suspend_modules'" .

I'll attach the log for this second problem too.

Collectively, the reports' sources refer to both pm-powersave and pm-suspend.
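
The two relabel suggestions quoted above amount to the same command run against the same file. A minimal sketch, assuming the file lives at /etc/pm/config.d/suspend_modules as in this report; the relabel helper and its DRY_RUN switch are hypothetical and exist only so the command can be previewed without root.

```shell
#!/bin/sh
# Sketch only: restore the default SELinux label on the new config file.
# relabel and DRY_RUN are illustrative names, not part of setroubleshoot.

relabel() {
    file="${1:-/etc/pm/config.d/suspend_modules}"
    if [ -n "${DRY_RUN:-}" ]; then
        # Preview the command instead of running it.
        echo "restorecon -v $file"
    else
        restorecon -v "$file"
    fi
}

# Example (as root): relabel
# Preview:           DRY_RUN=1 relabel
```

Using the absolute path avoids having to change into /etc/pm/config.d/ first.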

Comment 5 Jonathan Pritchard 2008-07-05 21:01:47 UTC
Created attachment 311082
2nd setroubleshoot output, pm-suspend

Comment 6 Jonathan Pritchard 2008-07-10 01:29:26 UTC
Strangely, the fix does not appear to work if the laptop has been suspended for
a long period of time. One can work around this by suspending and resuming it
again. Then, after a short period of time, NetworkManager will resume the
connection as well. So it is still not 100% fixed, and I would appreciate
comments from developers. I'm more than willing to try out possible fixes.

Comment 7 Arthur Pemberton 2008-07-27 05:23:16 UTC
*** Bug 456039 has been marked as a duplicate of this bug. ***

Comment 8 Arthur Pemberton 2008-07-27 05:33:02 UTC
I have marked my Bug 456039 as a duplicate of this.

Smolt profile for the machine in question:
http://www.smolts.org/client/show/pub_4e28d6bb-5fee-4152-a260-f08ff80399e5

rmmod ath_pci && modprobe ath_pci after waking up the machine also works for me,
however doing so with either of the following methods causes the machine to not
wake up properly (power light stops blinking, wifi light comes on, no hdd
activity, no display)
 * using /etc/pm/config.d/suspend_modules
 * using /etc/pm/sleep.d/ath_pci.sh with a script that runs rmmod and modprobe

So the best possible fix so far makes the fix useless. I have to hard restart
when the machine fails to wakeup.
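
For reference, a hook of the kind described in the second bullet might look like the sketch below. It assumes the pm-utils convention of calling each script in /etc/pm/sleep.d with suspend, hibernate, resume or thaw as its first argument; the script contents are not taken from this report, and the RUN variable is a hypothetical addition for previewing commands. As noted above, on this machine the approach prevented a proper wake-up, so it is shown for discussion only.

```shell
#!/bin/sh
# Sketch of /etc/pm/sleep.d/ath_pci.sh (contents assumed, not from the report).
# RUN is a hypothetical knob: set RUN=echo to print commands instead of running them.

handle_power_event() {
    case "${1:-}" in
        suspend|hibernate)
            # Unload the driver before the machine sleeps.
            ${RUN:-} rmmod ath_pci
            ;;
        resume|thaw)
            # Load it again once the machine is awake.
            ${RUN:-} modprobe ath_pci
            ;;
    esac
}

handle_power_event "${1:-}"
```

Any other argument falls through the case statement and the hook does nothing.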

SELinux is disabled.

This is on a compaq presario, so I would appreciate if the summary was made more
generic as this is not limited to Thinkpads

Comment 9 Jonathan Pritchard 2008-07-29 17:28:52 UTC
I am now experiencing the wlan0 interface not coming back up sometimes. I can
immediately tell if this is the case upon resume as I have the netspeed_applet
in my gnome-panel and instead of giving me current upstream/downstream bitrates
it shows a red stop sign icon, as if there is no wireless interface active.

The workaround for this is to just suspend and resume again, which fixes it
every time, although I am not sure what causes the fix to fail at times. The
only pattern I can see is that it often occurs when the machine has been in
suspend for a long time.

On a separate note, I've changed the summary to make it not Thinkpad specific.

Comment 10 Arthur Pemberton 2008-08-07 04:27:16 UTC
Any updates on this? The user for whom I set up wifi still cannot use Fedora 10 regularly.

Comment 11 Jonathan Pritchard 2008-08-08 00:59:29 UTC
Sorry was that directed at me? Did you say Fedora 10? I don't know what causes this or why the workaround works (or not as the case is sometimes). It certainly makes it usable but it's not an ideal situation for a new user.

Can anyone with more experience explain why the workaround works and what we'd need to do to get this solved in future releases?

What problems are you having Arthur?

Comment 12 Arthur Pemberton 2008-08-08 01:15:09 UTC
(In reply to comment #11)
> Sorry was that directed at me?

To whoever has useful information on this.

> Did you say Fedora 10?

Yes, please see comment 8 for the Smolt profile of the machine in question.

> I don't know what causes
> this or why the workaround works (or not as the case is sometimes). It
> certainly makes it usable but it's not an ideal situation for a new user.

In this case, the work around is worse than the problem.

> Can anyone with more experience explain why the workaround works and what we'd
> need to do to get this solved in future releases?

At this point I am unsure if it is a kernel module issue or a NetworkManager issue. Instructions on how to determine which is at fault would be helpful.

> What problems are you having Arthur?

In my case, reloading the module after resume works; however, all attempts to use automated module reloading cause the machine to freeze, requiring a hard reboot (and no logs are written).

Comment 13 Dan Williams 2008-08-08 15:52:08 UTC
Fedora 10 (rawhide) is pretty rocky right now while some wireless stuff gets hashed out upstream in the kernel.

For Fedora 9, we need a bit more investigation of what's happening at resume time.  First, the device itself needs to be "up" to scan, and you need to scan to figure if the network you want to reconnect to is still there.  NM should ensure that the device is up before firing off a scan, but I got a report yesterday that there's something odd here.  If anyone running Fedora 9 could try the NetworkManager in F9-updates-testing that would be useful, as there were fixes made in the suspend/resume areas that could help this problem out.

Next, we'd need logs from /var/log/messages to ensure that NM is doing the right thing on resume.  If it is doing the right thing, then we need to make sure that the driver is doing the right thing too...

One more test to do; after resume when NM has failed to reconnect to the AP, manually run (as root) "/sbin/iwlist wlan0" and see if you get any results.
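
The checks requested above could be gathered in one pass with something like the following. This is a rough sketch rather than a procedure from the NetworkManager developers: the scan subcommand of iwlist is an assumption about what the last test intends, the ip link check is an added guess at how to see whether the device is "up", the interface name wlan0 is taken from the original report, and collect_wifi_diagnostics and RUN are hypothetical names.

```shell
#!/bin/sh
# Sketch only: collect post-resume information for this bug.
# Set RUN=echo to print the commands instead of executing them.

collect_wifi_diagnostics() {
    iface="${1:-wlan0}"
    # Is the device administratively up?
    ${RUN:-} ip link show "$iface"
    # Can the driver still scan? ("scan" is assumed; the comment only names iwlist.)
    ${RUN:-} /sbin/iwlist "$iface" scan
    # What did NetworkManager log around the resume?
    ${RUN:-} grep NetworkManager /var/log/messages
}

# Example (as root, after a failed resume): collect_wifi_diagnostics wlan0
```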

Comment 14 Dan Williams 2008-10-20 15:43:53 UTC
Jonathan, can you try with latest kernel (2.6.26.5 or later), latest NM (svn4022.4 or later), and latest supplicant (0.6.4 or later)?  Also, please try the 'ath5k' driver in the 2.6.26 kernels, madwifi isn't upstream and thus doesn't use the same interfaces as ath5k does, which uses the standard kernel 802.11 stack.

If there are SELinux issues, please file a separate bug (since it's a different problem) and I'll get that fixed in the selinux policy.
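
A quick way to check whether a machine meets the minimum versions requested above is sketched below. The version_ge helper is hypothetical and relies on the -V (version sort) option of GNU sort, which is assumed to be available; it handles plain dotted versions such as 2.6.26.5 or 0.6.4 and is not meant to parse NetworkManager snapshot strings like svn4022.4.

```shell
#!/bin/sh
# Sketch only: compare an installed version against a requested minimum.
# version_ge is a hypothetical helper; it relies on GNU sort's -V option.

version_ge() {
    # True when $1 >= $2 in version order: $2 must sort first (or be equal).
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: is the running kernel at least 2.6.26.5?
#   version_ge "$(uname -r)" 2.6.26.5 && echo "kernel is new enough"
# Installed package versions can be listed with:
#   rpm -q kernel NetworkManager wpa_supplicant
```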

Comment 15 Aznar 2008-10-22 12:50:54 UTC
Hi,

I'm testing Fedora 10 on my eeePC 900A, my wifi card :
02:00.0 Ethernet controller: Atheros Communications Inc. AR242x 802.11abg Wireless PCI Express Adapter (rev 01)

On this kernel :
Linux Laia 2.6.27.3-27.rc1.fc10.i686 #1 SMP Sat Oct 18 20:35:56 EDT 2008 i686 i686 i386 GNU/Linux

NetworkManager :
0.7.0    0.11.svn4175.fc10             Build Date: Sun 12 Oct 2008 17:25:13 CEST
with the corresponding applet :
Applet NetworkManager 0.7.0

But in a KDE 4.1.2 environment (I have not tested with knetworkmanager yet).

The problem is the same as described in the first comment: after a stop (suspend or resume), the wifi network that was previously working stops and cannot be made to work again.
After a suspend, I've got a kernel error :

Kernel failure message 1:
BUG: sleeping function called from invalid context at include/linux/pagemap.h:294
in_atomic():0, irqs_disabled():1
Pid: 5521, comm: hwclock Not tainted 2.6.27.3-27.rc1.fc10.i686 #1
 [<c0425e3b>] __might_sleep+0xc6/0xcb
 [<c046d949>] lock_page+0x15/0x2f
 [<c046db81>] find_lock_page+0x1e/0x3a
 [<c046e055>] filemap_fault+0x96/0x32b
 [<c04799bf>] __do_fault+0x3b/0x2c7
 [<c04def65>] ? journal_dirty_metadata+0x32/0xd6
 [<c047b254>] handle_mm_fault+0x2e0/0x6d1
 [<c0440ef2>] ? sched_clock_cpu+0x12c/0x13b
 [<c051f6f6>] ? rb_insert_color+0x56/0xc0
 [<c043f158>] ? enqueue_hrtimer+0xc1/0xcc
 [<c06c7b4e>] do_page_fault+0x2e0/0x68e
 [<c0420a09>] ? hrtick_start_fair+0x11c/0x153
 [<c040271a>] ? __switch_to+0x114/0x139
 [<c0428f4d>] ? finish_task_switch+0x2f/0xb0
 [<c06c48ef>] ? schedule+0x6ee/0x70d
 [<c0405f33>] ? do_softirq+0xbe/0xdb
 [<c05200d8>] ? strspn+0x24/0x2e
 [<c0496a13>] ? path_put+0x15/0x18
 [<c045f9c4>] ? audit_syscall_exit+0xb2/0xc7
 [<c06c786e>] ? do_page_fault+0x0/0x68e
 [<c06c6142>] error_code+0x72/0x78

I tried stopping the network daemon and NetworkManager, removing the module and reloading it, then restarting the daemons; it didn't work.

Comment 16 Arthur Pemberton 2008-10-22 16:29:17 UTC
Switching to the 2.6.27 kernel in Rawhide mostly solved my issues. However, wake up was very slow. Either that, or pressing the (hardware) wifi button on the laptop triggered it to wake up. I wasn't able to play with it further as I wanted to return it to its owner.

Comment 17 Kamil Stawirej 2009-02-16 10:44:43 UTC
I have a similar problem. My wireless connection is not re-established after the machine is resumed from hibernation or suspension. I have a broadcom adapter and I use the kernel-supplied b43 driver. When I wake the computer up I have to do the following to reconnect to my wireless network:

rmmod b43
modprobe b43
service NetworkManager restart

and wait a few seconds before the connection is re-established. The funny thing is that sometimes I don't have to do this after resuming the machine from suspension.
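
The three manual steps above could be wrapped in one function, roughly as follows. This is a sketch under assumptions: reload_wifi and RUN are hypothetical names, and the commands are the ones listed in this comment, chained so that a failure stops the sequence.

```shell
#!/bin/sh
# Sketch only: the three manual steps from this comment as one function.
# reload_wifi and RUN are hypothetical names; set RUN=echo to preview.

reload_wifi() {
    module="${1:-b43}"
    # Stop at the first command that fails.
    ${RUN:-} rmmod "$module" &&
    ${RUN:-} modprobe "$module" &&
    ${RUN:-} service NetworkManager restart
}

# Example (as root): reload_wifi b43
```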

I run Fedora 10 with the latest kernel (as of Feb. 16th), with all available updates installed.

Any ideas? Can this be driver related?

Comment 18 Jessica Sterling 2009-03-08 19:40:16 UTC
This bug has been triaged

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 19 Jonathan Pritchard 2009-03-08 20:36:09 UTC
I'm sorry I'm unable to help diagnose this further, as the machine I had the problem on is no longer in my possession.

Comment 20 Emanuele Bellini 2009-04-10 17:57:15 UTC
I have a similar problem with a D-Link DWL-G630 Atheros-based card. If I try to suspend my computer, I get a kernel panic. If I try to hibernate it, I can't use the card afterwards because NetworkManager says it's not ready.

Comment 21 Emanuele Bellini 2009-04-10 20:07:33 UTC
After removing the DWL-G630 (which is a PCMCIA card), I saw that my bug depends on 230675, which is going to be fixed; until then I don't know if I can help you. Anyway, I haven't tried the patch above (comment #1) yet, but I'm going to.
I have a Presario 900 series.
If mine is a different bug, just tell me and I'll post a new report.

Comment 22 Emanuele Bellini 2009-04-10 20:12:44 UTC
*** Bug 478588 has been marked as a duplicate of this bug. ***

Comment 23 Bug Zapper 2009-06-10 01:48:35 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 24 Bug Zapper 2009-07-14 17:53:10 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

