Bug 815091 - wpa_supplicant fails to dbus activate within 25 seconds
wpa_supplicant fails to dbus activate within 25 seconds
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: wpa_supplicant (Show other bugs)
17
All Linux
unspecified Severity urgent
: ---
: ---
Assigned To: Dan Williams
Fedora Extras Quality Assurance
: Regression
: 815856 815888 817605 817635 (view as bug list)
Depends On:
Blocks: F17Blocker/F17FinalBlocker 817857
  Show dependency treegraph
 
Reported: 2012-04-22 08:40 EDT by Heiko Adams
Modified: 2012-05-02 00:53 EDT (History)
31 users (show)

See Also:
Fixed In Version: wpa_supplicant-1.0-0.4.fc17
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-05-02 00:53:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
Screenshot of MN popup menu before restarting the service (69.45 KB, image/png)
2012-04-22 17:54 EDT, Heiko Adams
no flags Details
Screenshot of service manager before restarting the service (64.37 KB, image/png)
2012-04-22 17:55 EDT, Heiko Adams
no flags Details
NetworkManager log (4.77 KB, text/x-log)
2012-04-23 09:04 EDT, masami256
no flags Details

  None (edit)
Description Heiko Adams 2012-04-22 08:40:59 EDT
Description of problem:
After applying todays NetworkManager update autoconnecting to my wireless network doesn't work anymore after starting my system. NetworkManager says there are not wireless network present. After restarting the NM service everything works fine until next reboot

Version-Release number of selected component (if apple):
NetworkManager-0.9.4.0-6.git20120403.fc17.x86_64
NetworkManager-vpnc-0.9.3.997-1.fc17.x86_64
NetworkManager-openvpn-0.9.3.997-1.fc17.x86_64
NetworkManager-gtk-0.9.4.0-6.git20120403.fc17.x86_64
NetworkManager-pptp-0.9.3.997-1.fc17.x86_64
NetworkManager-glib-0.9.4.0-6.git20120403.fc17.x86_64
NetworkManager-gnome-0.9.4.0-6.git20120403.fc17.x86_64

How reproducible:
allways

Steps to Reproduce:
1. Configure a wireless network for automatic connecting
2. Restart the system
  
Actual results:
NM doesn't connect to your wireless network until you restart the NM service

Expected results:
NM should connect to the wireless network like in version 0.9.3 and older.

Additional info:
Comment 1 Pavel Šimerda (pavlix) 2012-04-22 16:50:08 EDT
I won't reproduce with:

NetworkManager-0.9.4.0-6.git20120403.fc17.x86_64
kernel-3.3.2-8.fc17.x86_64
Comment 2 Heiko Adams 2012-04-22 17:54:18 EDT
Created attachment 579365 [details]
Screenshot of MN popup menu before restarting the service

This is a screenshot of MN popup menu after booting my system and before restarting the service
Comment 3 Heiko Adams 2012-04-22 17:55:38 EDT
Created attachment 579366 [details]
Screenshot of service manager before restarting the service

This is a screenshot of the service-manager after booting my system and before restarting the NM service
Comment 4 Rex Dieter 2012-04-23 08:54:53 EDT
I'm seeing similar symptoms, no AP's found initially, at least until I manually restart the NM service.
Comment 5 Rex Dieter 2012-04-23 08:56:05 EDT
Forgot to mention in comment #4, but NetworkManager-0.9.4.0-1.git20120328.fc17 didn't have this issue.
Comment 6 masami256 2012-04-23 09:04:43 EDT
Created attachment 579532 [details]
NetworkManager log
Comment 7 masami256 2012-04-23 09:07:06 EDT
I attatched a NetworkManager's log[NetworkManager log] to here.

In the log, I can see this error log. 
9528:Apr 23 21:58:47 rune NetworkManager[781]: <error> [1335185927.573217] [nm-s
upplicant-interface.c:804] interface_add_cb(): (wlan0): error adding interface: 
Did not receive a reply. Possible causes include: the remote application did not
 send a reply, the message bus security policy blocked the reply, the reply time
out expired, or the network connection was broken.
9529:Apr 23 21:58:47 rune NetworkManager[781]: dbus_g_proxy_cancel_call: asserti
on `pending != NULL' failed
Comment 8 Jason Montleon 2012-04-23 12:45:07 EDT
I am seeing the same thing. I am able to work around it by right clicking the NM icon and unchecking "Enable Wireless" and then checking it off again.
Comment 9 Dan Winship 2012-04-23 16:58:53 EDT
(In reply to comment #7)
> Did not receive a reply. Possible causes include: the remote application did
> not
>  send a reply, the message bus security policy blocked the reply, the reply
> time
> out expired, or the network connection was broken.

are there any other messages nearby in the log? (like, dbus indicating that it blocked a reply)
Comment 10 Dan Williams 2012-04-23 17:15:57 EDT
For example, one user had these messages:

    Apr 23 21:22:21 eregion NetworkManager[578]: <error> [1335212541.505187] [nm-supplicant-interface.c:804] interface_add_cb(): (wlan0): error adding interface: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
    Apr 23 21:22:21 eregion NetworkManager[578]: dbus_g_proxy_cancel_call: assertion `pending != NULL' failed
    Apr 23 21:22:21 eregion NetworkManager[578]: <info> (wlan0): supplicant interface state: starting -> down
    Apr 23 21:22:21 eregion NetworkManager[578]: <warn> Trying to remove a non-existant call id.
    Apr 23 21:22:21 eregion dbus-daemon[655]: dbus[655]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out
    Apr 23 21:22:21 eregion dbus[655]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out
    Apr 23 21:22:27 eregion systemd[1]: NetworkManager-wait-online.service: main process exited, code=exited, status=1
    Apr 23 21:22:27 eregion systemd[1]: Unit NetworkManager-wait-online.service entered failed state.
    Apr 23 21:22:27 eregion NetworkManager[578]: <info> wpa_supplicant started
    Apr 23 21:22:27 eregion systemd[1]: iscsi.service: control process exited, code=exited status=3

which indicate that something is going wrong with dbus/systemd service activation and wpa_supplicant.  The code in NM which talks to wpa_supplicant hasn't change in quite a while.
Comment 11 masami256 2012-04-23 19:12:01 EDT
I have following log that seems same as #10's one.
Apr 24 07:56:01 rune dbus[617]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Apr 24 07:56:01 rune dbus-daemon[617]: dbus[617]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Apr 24 07:56:20 rune NetworkManager[711]: <error> [1335221780.242066] [nm-supplicant-interface.c:804] interface_add_cb(): (wlan0): error adding interface: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Apr 24 07:56:20 rune NetworkManager[711]: dbus_g_proxy_cancel_call: assertion `pending != NULL' failed
Apr 24 07:56:20 rune NetworkManager[711]: <info> (wlan0): supplicant interface state: starting -> down
Apr 24 07:56:20 rune NetworkManager[711]: <warn> Trying to remove a non-existant call id.
Apr 24 07:56:20 rune dbus-daemon[617]: dbus[617]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out
Apr 24 07:56:20 rune dbus[617]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out
Apr 24 07:56:26 rune systemd[1]: NetworkManager-wait-online.service: main process exited, code=exited, status=1
Apr 24 07:56:27 rune systemd[1]: Unit NetworkManager-wait-online.service entered failed state.
Apr 24 07:56:28 rune NetworkManager[711]: <info> wpa_supplicant started
Apr 24 07:56:28 rune kernel: [   52.479265] RPC: Registered named UNIX socket transport module.
Comment 12 Dan Winship 2012-04-24 13:08:18 EDT
*** Bug 815856 has been marked as a duplicate of this bug. ***
Comment 13 Maxim Burgerhout 2012-04-24 15:06:02 EDT
Seeing the same here. Booting seems to take an awful lot of time due to this, too. Systemd spends a significant amount of time waiting for the NetworkManager-wait-online.service before it finally fails. From what I understand from 'systemd-analyze chart' something in the order of 30 seconds.

Not fixed in NetworkManager-0.9.4.0-7.git20120403.fc17.x86_64
Comment 14 Kenny Armstrong 2012-04-25 10:15:17 EDT
Same exact problem here.  I am also using NetworkManager-0.9.4.0-7.git20120403.fc17.x86_64.  However, NetworkManager shut off my wireless card when I restarted the service, and it won't come on again until next reboot.
Comment 15 Dan Williams 2012-04-26 10:51:11 EDT
Bootup blocking is a different bug; but in that case your system probably has the 'remote-fs-pre' systemd target enabled and active, which is causing systemd to wait for NetworkManager to bring up a network connection (via NM-wait-online) before continuing, since remote filesystems obviously require network connectivity.

This bug is (at least for now) about the problem activating the supplicant and thus getting wireless devices working correctly.
Comment 16 Dan Williams 2012-04-26 10:53:17 EDT
Note that NM-wait-online was linked to remote-fs-pre.target as a fix for bug 787314
Comment 17 Bill Nottingham 2012-04-26 10:54:57 EDT
Can you attach your fstab?
Comment 18 Dan Williams 2012-04-26 11:09:58 EDT
*** Bug 815888 has been marked as a duplicate of this bug. ***
Comment 19 Dan Williams 2012-04-26 15:20:45 EDT
So for the moment I'm going to toss this to systemd.  Here's why...  dbus-daemon calls systemd to activate the supplicant, and apparently that's timing out.  But I have no idea where the systemd logs are about that process; they're clearly not getting dumped to /var/log/messages.

Apr 26 10:41:47 golem dbus-daemon[660]: dbus[660]: [system] Activating via systemd: service name='fi.w1.wpa_supplicant1' unit='wpa_supplicant.service'
Apr 26 10:41:47 golem dbus[660]: [system] Activating via systemd: service name='fi.w1.wpa_supplicant1' unit='wpa_supplicant.service'
Apr 26 10:42:12 golem dbus[660]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out
Apr 26 10:42:12 golem dbus-daemon[660]: dbus[660]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out

Here dbus is clearly using systemd for the service activation, but we need more log messages from systemd about why the service failed to activate within 25 seconds.  I assume systemd folks will know where to get those.

I'm fairly certain that nothing in wpa_supplicant itself has changed within the past few weeks.
Comment 20 Jiří Martínek 2012-04-28 11:40:45 EDT
Exactly the same here, I "repair" it with 
# systemctl restart NetworkManager.service
till next reboot
Comment 21 Rex Dieter 2012-04-28 13:10:39 EDT
nominating release blocker : networking fail out of the box (could've sworn I'd marked it previously, but perhaps I'd only dreamed it).

marking regression per comment #5, last (nm) build that didn't exhibit this was NetworkManager-0.9.4.0-1.git20120328.fc17
Comment 22 patrick korsnick 2012-04-28 13:25:26 EDT
I have 3 bare metal installs of f17 (Gnome, KDE, and Xfce) and only two, KDE & Xfce, exhibit this behavior. The Gnome version starts wifi and connects just like it always has.

All three are running the same version of NM:

NetworkManager-0.9.4.0-7.git20120403.fc17.x86_64
Comment 23 Dan Williams 2012-04-30 11:23:07 EDT
I couldn't get it to happen with an F17 GNOME Shell install either... :(  I suppose I'll try installing on a slightly faster machine and see if that helps reproduce it.
Comment 24 Mads Kiilerich 2012-04-30 11:59:25 EDT
FWIW: I saw symptoms that could seem similar when I experienced Bug 815331 (ipw2200 regression in reporting of properties cause problems when nm-applet started checking them in 0.9.4) - perhaps some of the comments here are influenced by that.
Comment 25 patrick korsnick 2012-04-30 12:33:57 EDT
(In reply to comment #23)
> I couldn't get it to happen with an F17 GNOME Shell install either... :(  I
> suppose I'll try installing on a slightly faster machine and see if that helps
> reproduce it.

I'm running all 3 installs on a i7-2820QM, which is pretty darn fast, so I'm not sure if machine speed is related. Gnome version works, Xfce and KDE versions exhibit the bug.
Comment 26 cornel panceac 2012-04-30 13:57:29 EDT
I have this problem on Thinkpad X40 with 1.2GB of RAM, PCMCIA wireless card, gnome in fallback mode. CPU scales between 600 and 1200 MHz.
Comment 27 Bill Nottingham 2012-04-30 17:39:31 EDT
Investigation notes:

/usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service:
[D-BUS Service]
Name=fi.w1.wpa_supplicant1
Exec=@SBINDIR@/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
User=root
SystemdService=wpa_supplicant.service

/lib/systemd/system/wpa_supplicant.service:
[Unit]
Description=WPA Supplicant daemon
After=syslog.target network.target

[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/wpa_supplicant
ExecStart=/usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf $INTERFACES $DRIVERS -B -u $OTHER_ARGS

[Install]
WantedBy=multi-user.target


1) Unescaped @SBINDIR@ in the dbus service looks weird.

2) The ExecStart lines are slightly different, although via /etc/sysconfig/wpa_supplicant they end up being the same

3) /lib/systemd/system/wpa_supplicant.service has:
After=network.target
NM has:
Before=network.target

This seems potentially incompatible.

4) Changing wpa_supplicant.service such that (psuedo-diff):

-Type=forking
+Type=dbus
+BusName=fi.w1.wpa_supplicant1

may or may not help.
Comment 28 Rex Dieter 2012-04-30 18:21:31 EDT
Fwiw, bill's workarounds didn't help me (as a matter of fact, after 4, NM wouldn't work at all until I put it back the way it was).

I tried booting into both gnome and kde, and they both still fail equally for me until I manually do:
systemctl restart NetworkManager.service
Comment 29 Bill Nottingham 2012-04-30 22:50:15 EDT
I'm going to guess that everyone who's seeing this has nfs-lock.service enabled.

(Check with systemctl is-enabled nfs-lock.service.)

What we have:

nfs-lock.service:
Wants=remote-fs-pre.target
Before=remote-fs-pre.target

remote-fs-pre.target.wants has:
NetworkManager-wait-online.service:
After=NetworkManager.service
Before=network.target

wpa_supplicant.service has:
After=syslog.target network.target

nfs-lock.service enables the remote-fs-pre.target, which brings in NM-wait-online.

NM-wait-online needs done before network.target is reached, so it waits 30 seconds for NM to connect before network.target is considered done.

wpa_supplicant is After: network.target, so systemd waits for NM-wait-online to finish before it starts it. If NM-wait-online waits more than NM's specified timeout to wait for wpa_supplicant... it won't connect to it.

I see two bugs here:
- nfs-lock.service should not Want remote-fs-pre.target - that is automatically pulled in by network filesystems where necessary.
- wpa_supplicant.service should not be After network.target. If anything, it should be Before...

Fixing either of these would fix the issue where NM & wpa_supplicant can't talk to each other, but I think we should fix both.

Leaving assigned to systemd for a short bit for confirmation that I've debugged this right.
Comment 30 Bill Nottingham 2012-04-30 22:50:29 EDT
*** Bug 817635 has been marked as a duplicate of this bug. ***
Comment 31 Bill Nottingham 2012-04-30 22:52:09 EDT
*** Bug 817605 has been marked as a duplicate of this bug. ***
Comment 32 Heiko Adams 2012-05-01 04:56:42 EDT
I can confirm that nfs-lock is enabled on my machine
Comment 33 cornel panceac 2012-05-01 06:47:11 EDT
i confirm that nfs-lock.service is enabled on my affected system.
Comment 34 cornel panceac 2012-05-01 06:56:09 EDT
Unfortunately, disabling nfs-lock.service didn't help wireless work, after reboot.
Comment 35 Christoph Wickert 2012-05-01 07:09:10 EDT
I think it's NetworkManager and not systemd.

I first downgraded Networkmanager and rebootet, but the error was still present. I then downgraded NetworkManager to 0.9.4.0-1.git20120328.fc17. On reboot, the problem did not occur again but wifi devices were present right from the start. I then upgraded systemd to 44-7.fc17 again and rebooted one more time. Still everything is fine.
Comment 36 Timothy Davis 2012-05-01 10:07:36 EDT
I had this problem until I removed NetworkManager.service from /etc/systemd/system/multi-user.target.wants

System boots faster and wireless works OOTB
Comment 37 Dan Williams 2012-05-01 10:33:55 EDT
(In reply to comment #35)
> I think it's NetworkManager and not systemd.
> 
> I first downgraded Networkmanager and rebootet, but the error was still
> present. I then downgraded NetworkManager to 0.9.4.0-1.git20120328.fc17. On
> reboot, the problem did not occur again but wifi devices were present right
> from the start. I then upgraded systemd to 44-7.fc17 again and rebooted one
> more time. Still everything is fine.

NM can hack around the problem of wpa_supplicant startup by retrying the AddInterface() call a few times or something like that, but that's a hack and a workaround.  I'd *also* like to figure out why the problem happens in the first place, since covering it up will simply cause problems later.  It seems like Bill's poked around and figured that out.

We should still be getting more information from systemd about why the service isn't starting in time though, which would have helped determine the cause earlier.
Comment 38 Dan Williams 2012-05-01 12:28:05 EDT
Moving to wpa_supplicant since we'll fix the ordering issues there that are causing these problems.
Comment 39 Fedora Update System 2012-05-01 12:54:57 EDT
wpa_supplicant-1.0-0.4.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/wpa_supplicant-1.0-0.4.fc17
Comment 40 Dan Williams 2012-05-01 13:00:42 EDT
Anyone hitting this bug, please install the wpa_supplicant update and restart, then check your /var/log/messages for the systemd dbus activation failure message for the supplicant during restart. Thanks!
Comment 41 Heiko Adams 2012-05-01 13:08:32 EDT
Seems to be fixed. Autoconnecting to wireless networks works fine :-)
Comment 42 Jason Montleon 2012-05-01 13:16:05 EDT
Update seems to have fixed it.
Comment 43 Fedora Update System 2012-05-02 00:53:32 EDT
wpa_supplicant-1.0-0.4.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.