Bug 627279 - NetworkManager after update to 0.8.1-4.git20100817.fc13 no longer restores connection on thaw from hibernate
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 13
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Dan Williams
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 552506 628877
Depends On:
Blocks:
 
Reported: 2010-08-25 14:34 UTC by Andrew Duggan
Modified: 2013-02-22 02:28 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2010-09-01 01:56:31 UTC
Type: ---
Embargoed:


Attachments
NetworkManager debug log (25.43 KB, text/plain)
2010-10-13 22:53 UTC, Ben Caradoc-Davies


Links
System ID Private Priority Status Summary Last Updated
FreeDesktop.org 30701 0 None None None Never

Description Andrew Duggan 2010-08-25 14:34:01 UTC
Description of problem:

After each hibernate and thaw, a manual 'service NetworkManager restart' is needed to get the network operational.

Version-Release number of selected component (if applicable):

0.8.1-4.git20100817.fc13

How reproducible:
Every time

Steps to Reproduce:
1. Hibernate
2. Thaw
  
Actual results:
nm-applet shows disconnected, both left and right click menus are blank or say network disabled.

Expected results:

NetworkManager wakes up as it did before yesterday's update

Additional info:

Nothing remarkable in /var/log/messages

Comment 1 Jeff Guerdat 2010-08-25 18:24:29 UTC
Same problem here.  I also note NetworkManager seems to have a hard time starting when booting/logging in.  Backdating to previous version (NetworkManager*-0.8.1-0.1.git20100510.fc13.i686) works fine.

Comment 2 Andrew Duggan 2010-08-26 12:07:59 UTC
Let me revise,

This is really only happening about 50% of the time.  I've had 3 out of 6 work OK since the update to this version of NM.  The three that did work happened when the machine had been hibernating for less than about an hour, the others longer (not that I have any way of knowing whether that really has an impact, which I doubt; it is just the case at the moment). The right-click menu still allows editing of connections and "About", but the left-click menu only has one gray item, "Networking Disabled".

Manually running, as root,

dbus-send --system                       \
  --dest=org.freedesktop.NetworkManager \
  /org/freedesktop/NetworkManager       \
  org.freedesktop.NetworkManager.wake

had no effect, so it doesn't seem to be a failure of the pm-utils, and since it sometimes works, it doesn't seem to be dbus either.

Also even after a restart of the NM service, although it works, the two right click nm-applet options of "Enable Networking" and "Enable Wireless" with their checkboxes are grayed out.  So that is a bug too.

Comment 3 Roman Kagan 2010-08-27 05:02:03 UTC
Same here on suspend/resume.

Can be reproduced with dbus-send above.  NM doesn't react to 50+% of sleep/wake messages.

Comment 4 Roman Kagan 2010-08-27 05:36:57 UTC
Looks like this is the same problem that was addressed in upstream commit d3b26a9c57e81b1021d30bbaf94757d337cfc4fa.

stracing NetworkManager shows that the sleep/wake messages are ignored when NM fails to obtain UID of the client from dbus.

Calling sleep/wake from a long-lived process (an interactive python session) is reliable.

Comment 5 Stephen Haffly 2010-08-28 18:57:50 UTC
Downgrading NetworkManager, NetworkManager-glib and NetworkManager-gnome to the 0.8.1-0.1.git20100510.fc13.i686 version restores normal operation.

Comment 6 Ben Caradoc-Davies 2010-08-29 07:45:24 UTC
This bug also affects Fedora 12 NetworkManager-0.8.1-3.git20100813.fc12.i686.

Comment 7 cam 2010-08-29 10:10:11 UTC
I am seeing this problem with NetworkManager-0.8.1-4.git20100817.fc13.i686: intermittent failure to resume networking.

I use 'service NetworkManager restart' and if that does not make it work, use the rf-kill switch to disable and re-enable wifi, this usually seems to bring it back to life. The rf-kill switch alone is not enough to wake networking.

I can never remember the nmcli command when I need it!

When reproduced, the state file shows:

cat /var/lib/NetworkManager/NetworkManager.state

[main]
NetworkingEnabled=true
WirelessEnabled=true
WWANEnabled=true

nmcli shows:

[root@Newt ~]# nmcli nm
RUNNING         STATE           WIFI-HARDWARE   WIFI       WWAN-HARDWARE   WWAN      
running         asleep          enabled         enabled    enabled         enabled   

At this point the GUI shows 'Networking Disabled' which is a bit misleading. A shame because the internal state shows something that makes more sense.

If there is a possibility that this condition can be reached by users (even through bugs in other components perhaps) it would be great if the GUI gave the information that NM was asleep and gave the option to wake it.
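
For anyone wanting to script the check above, here is a minimal sketch that pulls the STATE column out of the 'nmcli nm' table. It assumes the two-line layout pasted in this comment (one header row, one value row, whitespace-separated); the function name nm_state is made up for illustration and other NM versions may print a different layout.

```shell
#!/bin/bash
# Print the STATE column of `nmcli nm` output read on stdin.
# Assumption: NM 0.8.x layout, i.e. a header row followed by one value row.
# nm_state is a hypothetical helper name, not part of NetworkManager.
nm_state() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "STATE") col = i; next }
         col     { print $col; exit }'
}

# Intended use on an affected machine:
#   nmcli nm | nm_state
```

With the output shown above this reports "asleep", the internal state that the GUI presents as 'Networking Disabled'.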

Comment 8 cam 2010-08-29 12:55:25 UTC
Just to add that nmcli didn't help; this perhaps shows some confusion internally in NM?

[root@Newt ~]# nmcli nm wakeup

** (process:7100): WARNING **: Error enabling/disabling networking: Already enabled

Comment 9 Jirka Klimes 2010-08-30 13:29:00 UTC
On suspend/resume, functions from /usr/lib64/pm-utils/sleep.d/55NetworkManager are called to put NM to sleep or wake it up.
There is probably a race condition in dbus that causes the calls to fail.
Could you please add '--print-reply' to the dbus-send commands and report whether it helps, see https://bugzilla.redhat.com/show_bug.cgi?id=552506#c4

Also please grab /var/log/messages from suspend/resume cycle to see errors.

BTW: suspend/resume works for me even without adding '--print-reply' with:
NetworkManager-0.8.1-4.git20100817.fc13.x86_64
dbus-1.2.24-1.fc13.x86_64
pm-utils-1.2.6.1-1.fc13.x86_64

Comment 10 Roman Kagan 2010-08-30 21:05:28 UTC
As I wrote in comment #4, there is indeed a race condition between the exit of the client (dbus-send) and NM's query of the client's uid: the GetConnectionUnixUser() method returns a NameHasNoOwner error with the description "Could not get UID of name ':1.1197': no such name".

And you don't need to test with real suspend/resume; calling dbus-send ...sleep/wake reliably reproduces the problem in several iterations.

The reason it works for you is that most probably you test with a powerful SMP machine where this race is won by NM.  It works for me on a Core2Duo 2.6GHz machine too, but fails on an Atom N280 netbook quite often.

And yes, adding --print-reply to dbus-send, or calling those methods from an interactive python session (both making sure the client lives beyond the uid check) makes things work.

AFAICT the problem appeared on the 0.8.1-1 -> 0.8.1-4 transition because sleep/wake is now separate from enable/disable, and is now a privileged operation, requiring the client to be run by root.

My understanding is that the upstream commit d3b26a9c57e81b1021d30bbaf94757d337cfc4fa on master and c4ec5dd023b872aa865496aa4cb8b3d8858d458b on NM_0_8 must have fixed that.  Time to pull from upstream?
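
To make the two variants easy to compare, here is a rough sketch that only builds the dbus-send command line for the sleep/wake methods quoted in comment 2, with or without --print-reply. The helper name nm_pm_cmd and its arguments are made up for illustration; it prints the command rather than running it.

```shell
#!/bin/bash
# Print the dbus-send command line for NM's "sleep"/"wake" methods.
# nm_pm_cmd is a hypothetical helper; it never sends anything itself.
#   $1 = method name (sleep or wake)
#   $2 = "yes" (default) to add --print-reply, so that dbus-send waits for
#        NM's answer and is still alive when NM looks up the caller's uid;
#        "no" to get the fire-and-forget form that loses the race.
nm_pm_cmd() {
    local method=$1 wait=${2:-yes}
    local reply=""
    [ "$wait" = yes ] && reply="--print-reply"
    # $reply is left unquoted on purpose so that an empty value vanishes.
    echo dbus-send --system $reply \
        --dest=org.freedesktop.NetworkManager \
        /org/freedesktop/NetworkManager \
        "org.freedesktop.NetworkManager.$method"
}

# To actually send (as root): eval "$(nm_pm_cmd wake)"
```

Running the "no" form of sleep and wake alternately a handful of times is the reproduction described above; the "yes" form is the workaround.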

Comment 11 cam 2010-08-30 23:01:59 UTC
I've added --print-reply to both dbus-send lines and now I can't reproduce the problem.

Comment 12 Stephen Haffly 2010-08-31 02:04:53 UTC
Likewise, adding --print-reply to both lines looks to have fixed it here also.  This is on an Acer Aspire One D250.

Comment 13 Dan Williams 2010-09-01 01:56:31 UTC
I've also just pushed a fix to NM that makes NM just listen to UPower for events instead of relying on pm-utils.  This bug should definitely be fixed in updates-testing (no thanks to pm-utils) with:

https://admin.fedoraproject.org/updates/NetworkManager-0.8.1-6.git20100831.fc13
https://admin.fedoraproject.org/updates/NetworkManager-0.8.1-6.git20100831.fc14
https://admin.fedoraproject.org/updates/NetworkManager-0.8.1-6.git20100831.fc12

Comment 14 Dan Williams 2010-09-01 02:06:05 UTC
*** Bug 552506 has been marked as a duplicate of this bug. ***

Comment 15 Dan Williams 2010-09-01 02:06:22 UTC
*** Bug 628877 has been marked as a duplicate of this bug. ***

Comment 16 Andrew Duggan 2010-09-01 02:13:04 UTC
Thanks Dan - as usual, you rock.

Comment 17 Jeff Guerdat 2010-09-08 15:30:40 UTC
Just updated to the latest today (NetworkManager-0.8.1-6.git20100831.fc13.i686). Same problem here - won't connect automatically during boot/login, wants to create a new configuration instead of using existing one, have to manually get networking restored after suspend.  Backdating to 1:NetworkManager-0.8.1-0.1.git20100510.fc13.i686 works fine.

Comment 18 Kieran Clancy 2010-09-14 22:21:49 UTC
Based on reading bug 477964, I thought this dbus race was supposed to be fixed by the following commit:
http://cgit.freedesktop.org/dbus/dbus/commit/?id=87ddff6b24d9b9d4bba225c33890db25022d8cbe

Patching NetworkManager to use UPower is one thing, but doesn't it concern anyone that the underlying dbus race is still around? DBus is a core part of the system and needs to be dependable. Should we open a new bug for this dbus issue? (I really don't know the terminology, so I don't think I'm the best person to do it.)

Comment 19 Jason Farrell 2010-09-23 20:08:13 UTC
Seeing this bug with NetworkManager-0.8.1-6.git20100831.fc13.x86_64

Comment 20 Jirka Klimes 2010-09-24 07:45:24 UTC
(In reply to comment #18)
> Based on reading bug 477964, I thought this dbus race was supposed to be fixed
> by the following commit:
> http://cgit.freedesktop.org/dbus/dbus/commit/?id=87ddff6b24d9b9d4bba225c33890db25022d8cbe
> 
> Patching NetworkManager to use UPower is one thing, but doesn't it concern
> anyone that the underlying dbus race is still around? DBus is a core part of
> the system and needs to be dependable. Should we open a new bug for this dbus
> issue? (I really don't know the terminology, so I don't think I'm the best
> person to do it.)

I agree that the D-Bus race should be fixed. However, D-Bus developers would need steps to reproduce. So go ahead if you are able to reproduce/describe problem.

(In reply to comment #19)
> Seeing this bug with NetworkManager-0.8.1-6.git20100831.fc13.x86_64

Do you have upower package properly installed? Could you try to reproduce the problem with NetworkManager running with debugging log level from command line? 

# service NetworkManager stop
# NetworkManager --no-daemon --log-level=DEBUG

Comment 21 Jason Farrell 2010-09-25 14:25:07 UTC
FWIW, I can no longer reproduce after adding the '--print-reply' workaround to the dbus-send commands.

upower-0.9.5-1.fc13.x86_64 was already installed. When I get time, if there's no other testing being done(?), I'll revert the workaround, add debug, and try to reproduce again. For now, the laptop needs to resume from suspend reliably for my wife's sake, or she'll boot back to Win7.

Comment 22 Jon Senior 2010-09-27 06:02:12 UTC
I get the same issue on Fedora 12 (the machine is used for work, so I can't test Fedora 13); it occurs with the release NetworkManager-1:0.8.1-6.git20100831.fc12.i686. I can restart NetworkManager from the system-services program and the network comes straight back. I can't get either yum or PackageKit to find or install the upower package on Fedora 12, so I can't report on what effect it has. If I get time later I'll attempt to produce a NM log as described above.

Comment 23 Ben Caradoc-Davies 2010-10-13 22:13:18 UTC
Confirmed for NetworkManager-0.8.1-6.git20100831.fc12.i686. Also affects the latest x86_64 for F12 on another machine.

Comment 24 Ben Caradoc-Davies 2010-10-13 22:53:57 UTC
Created attachment 453340
NetworkManager debug log

Attached log shows NetworkManager debug log (slightly sanitised) during the failure. I have inserted obvious comments denoting the appearance of the bug. There are several suspend-resume cycles, the last of which causes the bug.

There is no upower for F12.

Comment 25 Ben Caradoc-Davies 2010-10-13 22:57:41 UTC
Why is this bug closed? F12 is still a supported release. This bug is an end-user showstopper. Few users would be able to work around this bug without rebooting. This bug affects Atom netbooks and I have even seen it on a Core laptop. It is also reported in F13. Is there any workaround for F12?

Comment 26 Ben Caradoc-Davies 2010-10-13 23:15:55 UTC
I can confirm that adding --print-reply to the dbus-send lines in /usr/lib/pm-utils/sleep.d/55NetworkManager fixes the failure in ten out of ten suspend-resume trials (on an Atom N270). Is it possible to roll this workaround out to F12 users?
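
For those who would rather not edit the hook by hand, a possible sketch of the same change as a filter. It assumes the hook's calls start with "dbus-send --system", as in the command in comment 2; the actual file contents are not quoted in this bug, and the function name add_print_reply is made up for illustration.

```shell
#!/bin/bash
# Read a hook script on stdin and write it to stdout with --print-reply
# inserted after "dbus-send --system". Lines that already carry the flag
# are left alone, so running the filter twice changes nothing.
# add_print_reply is a hypothetical helper name.
add_print_reply() {
    sed '/--print-reply/!s/dbus-send --system/& --print-reply/'
}

# Intended use: write to a scratch file and review it before installing.
#   add_print_reply < /usr/lib/pm-utils/sleep.d/55NetworkManager > /tmp/55NetworkManager.new
```

It writes to stdout rather than editing in place, so the result can be diffed against the original before replacing it.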

Comment 27 Jirka Klimes 2010-10-14 07:27:10 UTC
(In reply to comment #25)
> Why is this bug closed? F12 is still a supported release.
This bug is for F13, and the update in comment #13 fixes it by means of UPower.

> This bug affects Atom netbooks and I have even seen it on a Core
> laptop. It is also reported in F13. Is there any workaround for F12?
You've answered yourself in comment #26. Adding '--print-reply' works both for F12 and F13. There is an upstream bug to fix this in pm-utils:
https://bugs.freedesktop.org/show_bug.cgi?id=30701

When the bug appears, you can wake NM up with (as root):
dbus-send --system --print-reply --dest=org.freedesktop.NetworkManager /org/freedesktop/NetworkManager org.freedesktop.NetworkManager.Sleep boolean:false

Comment 28 Jeff Guerdat 2010-10-14 14:24:53 UTC
These don't work for me on F13 x86, neither the '--print-reply' nor the root wakeup.  I have to restart NM to make things work and still get a passphrase prompt which will create yet another configuration which isn't needed.

Comment 29 Jeff Guerdat 2010-11-04 19:33:39 UTC
Update - I installed F14 x86 fresh.  The kernel-supplied rt2800pci driver causes major problems when trying to suspend/hibernate/shutdown (the system would freeze shortly after starting one of these actions - a bug report has been opened).  After way too much mucking about in things I don't know about, I installed the RPMfusion rt2860 driver that I had been using in F13 and I got suspend/hibernate/shutdown to work properly.  However, the same problem exists - no network on resume without restarting NM.  I found that I had to configure the built-in ipw2200 interface even though I wasn't using it which gets me past the multiple passphrase prompts issue.  That didn't help resume, however.  I just created a file in /etc/pm/sleep.d and named it 02NetworkManager so it's about the last thing to run on resume.  The contents are:

#!/bin/bash
# pm-utils sleep hook; pm-utils passes the action as $1.
case "$1" in
    hibernate)
        echo "Hibernate - NM!"
        ;;
    suspend)
        echo "Suspend - NM!"
        ;;
    thaw|resume)
        # Restart NM when coming back from hibernate (thaw) or suspend (resume).
        /sbin/service NetworkManager restart
        ;;
    *)
        echo "somebody is calling me totally wrong."
        ;;
esac

This is based on the page found here:

https://wiki.archlinux.org/index.php/Pm-utils#Creating_your_own_hooks

Now resume gives me a network without having to resort to manually restarting NM.  So, the question is, why doesn't /usr/lib/pm-utils/sleep.d/55NetworkManager do this for me?  Wrong place?  Wrong strategy?  I ass-u-me that I'm just masking the problem with this "fix"...

