Bug 871122

Summary: NetworkManager sometimes fails to start on boot on Fedora ARM
Product: [Fedora] Fedora Reporter: Jeff Bastian <jbastian>
Component: NetworkManagerAssignee: Dan Williams <dcbw>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: medium    
Version: 18CC: danw, dcbw, jklimes
Target Milestone: ---   
Target Release: ---   
Hardware: arm7   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-11-01 09:46:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 245418    
Attachments:
Description Flags
/var/log/messages from failed boot
none
/var/log/messages from successful boot none

Description Jeff Bastian 2012-10-29 16:59:20 UTC
Description of problem:
While testing Fedora 18 Alpha RC1 on Friday for the Fedora ARM VFAD [1], I noticed that sometimes NetworkManager failed to start correctly on boot.  I had to login on the serial console and restart NM to bring up the network.  See the console transcript below.

Sometimes it works fine on boot, others it fails.  It might be a timing issue with systemd and NetworkManager on the ARM platform?

[1] https://fedoraproject.org/wiki/Architectures/ARM/Quality_Assurance/2012-10-26-VFAD-Fedora_18_Test_Day

Version-Release number of selected component (if applicable):
NetworkManager-0.9.7.0-6.git20121004.fc18.armv7hl
systemd-194-1.fc18.armv7hl
kernel-omap-3.6.3-3.fc18.armv7hl

How reproducible:
sometimes (maybe 20% of the time?)

Steps to Reproduce:
1. boot Fedora 18 Alpha RC1 on a Pandaboard
2. login on serial console
3. ip a l eth0
4. systemctl status NetworkManager.service
  
Actual results:
No IPv4 address for eth0, and systemctl reports NetworkManager failed:

NetworkManager.service - Network Manager
          Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled)
          Active: failed (Result: exit-code) since Fri, 31 Dec 1999 19:00:44 -0500; 9min ago
         Process: 469 ExecStart=/usr/sbin/NetworkManager --no-daemon (code=exited, status=1/FAILURE)
          CGroup: name=systemd:/system/NetworkManager.service

Expected results:
NetworkManager starts correctly on every boot

Additional info:
Serial console transcript.  Ignore the timestamps; the Pandaboard does not have a battery backed clock so it always boots as January 1, 2000 00:00 (or December 31, 1999 19:00 in US Eastern time zone).

Fedora release 18 (Spherical Cow)
Kernel 3.6.3-3.fc18.armv7hl.omap on an armv7l (ttyO2)

panda-f18-v7hl login: root
Password: 
Last login: Mon Oct 29 12:33:02 on ttyO2
[root@panda-f18-v7hl ~]# ip a l eth0
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 3a:92:41:98:8e:95 brd ff:ff:ff:ff:ff:ff

[root@panda-f18-v7hl ~]# systemctl status NetworkManager.service
NetworkManager.service - Network Manager
          Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled)
          Active: failed (Result: exit-code) since Fri, 31 Dec 1999 19:00:44 -0500; 9min ago
         Process: 469 ExecStart=/usr/sbin/NetworkManager --no-daemon (code=exited, status=1/FAILURE)
          CGroup: name=systemd:/system/NetworkManager.service

Dec 31 19:00:43 panda-f18-v7hl NetworkManager[469]: <info> Loaded plugin keyfile: (c) 2007 - 2010 Red Hat, Inc.  To report bugs please use the NetworkManager mailing list.
Dec 31 19:00:43 panda-f18-v7hl NetworkManager[469]: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
Dec 31 19:00:43 panda-f18-v7hl NetworkManager[469]: ifcfg-rh:     read connection 'System eth0'
Dec 31 19:00:43 panda-f18-v7hl NetworkManager[469]: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
Dec 31 19:00:44 panda-f18-v7hl NetworkManager[469]: <warn> NameHasOwner request failed: Disconnected from D-Bus (or argument error during call)
Dec 31 19:00:44 panda-f18-v7hl NetworkManager[469]: <info> monitoring kernel firmware directory '/lib/firmware'.
Dec 31 19:00:44 panda-f18-v7hl NetworkManager[469]: <warn> failed to request default Bluetooth adapter.
Dec 31 19:00:44 panda-f18-v7hl NetworkManager[469]: <warn> NameHasOwner request failed: Disconnected from D-Bus (or argument error during call)
Dec 31 19:00:44 panda-f18-v7hl NetworkManager[469]: <warn> NameHasOwner request failed: Disconnected from D-Bus (or argument error during call)
Dec 31 19:00:45 panda-f18-v7hl systemd[1]: Failed to start Network Manager.

[root@panda-f18-v7hl ~]# systemctl restart NetworkManager.service
[root@panda-f18-v7hl ~]# ip a l eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 3a:21:74:d3:48:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.82/24 brd 192.168.1.255 scope global eth0
    inet6 fe80::3821:74ff:fed3:486a/64 scope link 
       valid_lft forever preferred_lft forever

[root@panda-f18-v7hl ~]# systemctl status NetworkManager.service
NetworkManager.service - Network Manager
          Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled)
          Active: active (running) since Fri, 31 Dec 1999 19:10:45 -0500; 12 years and 9 months ago
        Main PID: 912 (NetworkManager)
          CGroup: name=systemd:/system/NetworkManager.service
                  ��├ 912 /usr/sbin/NetworkManager --no-daemon
                  ��└ 927 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-clie...

Dec 31 19:10:51 panda-f18-v7hl NetworkManager[912]: <info>   nameserver '192.168.1.254'
Dec 31 19:10:51 panda-f18-v7hl NetworkManager[912]: <info>   domain name 'gateway.2wire.net'
Dec 31 19:10:51 panda-f18-v7hl NetworkManager[912]: <info> Activation (eth0) Stage 5 of 5 (IPv4 Configure Commit) scheduled...
Dec 31 19:10:51 panda-f18-v7hl NetworkManager[912]: <info> Activation (eth0) Stage 5 of 5 (IPv4 Commit) started...
Dec 31 19:10:51 panda-f18-v7hl dhclient[927]: bound to 192.168.1.82 -- renewal in 37194 seconds.
Dec 31 19:10:52 panda-f18-v7hl NetworkManager[912]: <info> (eth0): device state change: ip-config -> secondaries (reason 'none') [70 90 0]
Dec 31 19:10:52 panda-f18-v7hl NetworkManager[912]: <info> Activation (eth0) Stage 5 of 5 (IPv4 Commit) complete.
Dec 31 19:10:52 panda-f18-v7hl NetworkManager[912]: <info> (eth0): device state change: secondaries -> activated (reason 'none') [90 100 0]
Dec 31 19:10:53 panda-f18-v7hl NetworkManager[912]: <info> Policy set 'System eth0' (eth0) as default for IPv4 routing and DNS.
Dec 31 19:10:53 panda-f18-v7hl NetworkManager[912]: <info> Activation (eth0) successful, device activated.

Comment 1 Jirka Klimes 2012-10-30 09:56:19 UTC
Dec 31 19:00:44 panda-f18-v7hl NetworkManager[469]: <warn> NameHasOwner request failed: Disconnected from D-Bus (or argument error during call)

indicates a problem connecting to D-Bus (possible a race condition is starting services).

Is there a longer log in /var/log/mesages or even a crash?

Comment 2 Jeff Bastian 2012-10-30 14:11:31 UTC
Created attachment 635601 [details]
/var/log/messages from failed boot

Attached is /var/log/messages from a boot where NetworkManager failed to start.

Some of the interesting messages:

Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: dbus[412]: [system] Failed to activate service 'org.freedesktop.PolicyKit1': timed out
Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: dbus[412]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
...
Dec 31 19:00:57 panda-f18-v7hl NetworkManager[528]: <error> [946684857.636242] [nm-manager-auth.c:87] pk_authority_get(): Failed to initialize PolicyKit: (20) Error initializing authority: Error calling StartServiceByName for org.freedesktop.PolicyKit1: GDBus.Error:org.freedesktop.DBus.Error.TimedOut: Activation of org.freedesktop.PolicyKit1 timed out

Comment 3 Jeff Bastian 2012-10-30 14:20:07 UTC
Created attachment 635603 [details]
/var/log/messages from successful boot

I rebooted and it started successfully this time.  Attached is the log from the successful boot for comparison.  This time PolicyKit did not time out:

Dec 31 19:00:20 panda-f18-v7hl dbus-daemon[427]: dbus[427]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'
Dec 31 19:00:20 panda-f18-v7hl dbus[427]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'
Dec 31 19:00:28 panda-f18-v7hl dbus-daemon[427]: dbus[427]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
Dec 31 19:00:28 panda-f18-v7hl dbus[427]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'

Comment 4 Jeff Bastian 2012-10-30 14:39:22 UTC
My logs from comment 2 look similar to bug 708866 comment 8 which talks about ConsoleKit

I thought ConsoleKit was going away and sytemd's loginctl was taking over?
https://fedoraproject.org/wiki/Features/ckremoval

Comment 5 Jeff Bastian 2012-10-30 14:43:05 UTC
Apparently ConsoleKit is installed because lightdm needs it, and the Xfce spin uses lightdm instead of gdm.  So this is intentional.

Comment 6 Jirka Klimes 2012-11-01 09:45:02 UTC
This all looks like a bug in dbus/systemd activation and it's probably the same as your mentioned bug 708866.

from comment #2's log:
....
Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: dbus[412]: [system] Failed to activate service 'org.freedesktop.PolicyKit1': timed out
Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: dbus[412]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: process 412: arguments to dbus_message_set_reply_serial() were incorrect, assertion "reply_serial != 0" failed in file dbus-message.c line 1070.
Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: This is normally a bug in some application using the D-Bus library.
Dec 31 19:00:57 panda-f18-v7hl dbus-daemon[412]: D-Bus not built with -rdynamic so unable to print a backtrace
....

There's upstream D-Bus bug on this:
https://bugs.freedesktop.org/show_bug.cgi?id=50199

Comment 7 Jirka Klimes 2012-11-01 09:46:06 UTC

*** This bug has been marked as a duplicate of bug 708866 ***