Bug 1093942 - NetworkManager-wait-online.service takes too long
Summary: NetworkManager-wait-online.service takes too long
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 22
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Lubomir Rintel
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: WhyWeBootSoSlow
TreeView+ depends on / blocked
 
Reported: 2014-05-03 18:20 UTC by Chris Murphy
Modified: 2016-07-19 11:26 UTC (History)
3 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-07-19 11:26:54 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
dmesg (43.59 KB, text/plain)
2014-05-03 18:24 UTC, Chris Murphy
no flags Details
journal (115.47 KB, text/plain)
2014-05-03 18:35 UTC, Chris Murphy
no flags Details
journal with systemd.log_level=debug (631.87 KB, text/plain)
2014-05-03 18:44 UTC, Chris Murphy
no flags Details
journalctl with NM debug (199.68 KB, text/plain)
2014-05-06 03:26 UTC, Chris Murphy
no flags Details
journalctl with NM debug and ignore-carrier (193.71 KB, text/plain)
2014-05-06 03:34 UTC, Chris Murphy
no flags Details

Description Chris Murphy 2014-05-03 18:20:38 UTC
Description of problem: Nearly 6 seconds for NetworkManager-wait-online.service seems excessive, especially for a VM.


Version-Release number of selected component (if applicable):
NetworkManager-0.9.9.1-5.git20140319.fc21.x86_64

How reproducible:
Every boot


Steps to Reproduce:
1. Boot


Actual results:
# systemd-analyze blame
          5.764s NetworkManager-wait-online.service
          1.198s firewalld.service
           692ms vboxadd.service
           528ms ModemManager.service
           524ms vboxadd-x11.service
           486ms systemd-logind.service
           484ms abrt-ccpp.service
           482ms lvm2-monitor.service
           469ms chronyd.service
           432ms mcelog.service
           424ms nfs-config.service
           421ms rsyslog.service
           387ms avahi-daemon.service
           340ms libvirtd.service
           254ms plymouth-start.service
           213ms NetworkManager.service
           120ms vboxadd-service.service
           106ms systemd-fsck@dev-disk-by\x2duuid-D4A4\x2dEE15.service
           103ms systemd-fsck@dev-disk-by\x2duuid-ccfb8f09\x2df0e2\x2d40b1\x2da899\x2d4e758d58b590.service
            96ms systemd-udev-trigger.service
            81ms systemd-sysctl.service
            69ms sys-kernel-debug.mount
            67ms systemd-vconsole-setup.service
            60ms kmod-static-nodes.service
            58ms systemd-tmpfiles-setup.service
            57ms plymouth-quit-wait.service
            55ms systemd-tmpfiles-setup-dev.service
            53ms var-lib-nfs-rpc_pipefs.mount
            50ms polkit.service
            50ms plymouth-quit.service
            48ms fedora-import-state.service
            48ms dev-mqueue.mount
            44ms boot-efi.mount
            39ms dev-hugepages.mount
            38ms fedora-readonly.service
            36ms sshd.service
            36ms user
            32ms home.mount
            31ms var.mount
            30ms plymouth-read-write.service
            26ms auditd.service
            26ms systemd-remount-fs.service
            21ms tmp.mount
            20ms systemd-journal-flush.service
            19ms boot.mount
            18ms systemd-random-seed.service
            17ms systemd-udevd.service
            15ms systemd-update-utmp.service


Expected results:

Certainly much less than 5 seconds.


Additional info:
VirtualBox 4.3.10

Comment 1 Chris Murphy 2014-05-03 18:24:03 UTC
Created attachment 892173 [details]
dmesg

3.15.0-0.rc3.git0.1.fc21.x86_64

Comment 2 Chris Murphy 2014-05-03 18:35:15 UTC
Created attachment 892174 [details]
journal

kernel-3.15.0-0.rc3.git0.1.fc21.x86_64

These things seem suspicious, but maybe aren't at all related to the problem.

1.
Why is NetworkManager seemingly causing the kernel to load cfg80211? This is a VM and doesn't have wireless hardware.

[    6.034200] rawhide.localdomain systemd[1]: Starting Network Manager...
[    6.111313] rawhide.localdomain NetworkManager[559]: <info> NetworkManager (version 0.9.9.1-5.git20140319.fc21) is starting...
[    6.111672] rawhide.localdomain NetworkManager[559]: <info> Read config: /etc/NetworkManager/NetworkManager.conf
[    6.112205] rawhide.localdomain NetworkManager[559]: <info> WEXT support is enabled
[    6.139399] rawhide.localdomain kernel: cfg80211: Calling CRDA to update world regulatory domain


2.
There's a large time gap between NetworkManager startup complete, and the start of NetworkManager-wait-online.

[    8.708717] rawhide.localdomain avahi-daemon[426]: Registering new address record for fe80::a00:27ff:fee3:26e3 on enp0s3.*.
[   12.005997] rawhide.localdomain NetworkManager[559]: <info> startup complete
[   12.017118] rawhide.localdomain systemd[1]: Started Network Manager Wait Online.

Comment 3 Chris Murphy 2014-05-03 18:44:43 UTC
Created attachment 892175 [details]
journal with systemd.log_level=debug

Nearly one second gap:
[   18.626906] rawhide.localdomain NetworkManager[557]: <info> Activation (enp0s3) Stage 2 of 5 (Device Configure) complete.
[   18.633364] rawhide.localdomain systemd[625]: Executing: /usr/bin/nm-online -q --timeout=30
[   19.418640] rawhide.localdomain NetworkManager[557]: <info> Activation (enp0s3) Stage 3 of 5 (IP Configure Start) scheduled.

Nearly 4 second gap:
[   19.518454] rawhide.localdomain dbus-daemon[442]: dbus[442]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
[   19.523738] rawhide.localdomain systemd[1]: Started Network Manager Script Dispatcher Service.
[   20.003527] rawhide.localdomain ModemManager[429]: <warn>  Couldn't find support for device at '/sys/devices/pci0000:00/0000:00:03.0': not supported by any plugin
[   20.445155] rawhide.localdomain avahi-daemon[425]: Registering new address record for fe80::a00:27ff:fee3:26e3 on enp0s3.*.
[   24.006218] rawhide.localdomain NetworkManager[557]: <info> startup complete
[   24.023384] rawhide.localdomain systemd[1]: Started Network Manager Wait Online.
[   24.024888] rawhide.localdomain systemd[1]: Starting Network is Online.
[   24.025352] rawhide.localdomain systemd[1]: Reached target Network is Online.


Is NetworkManager-wait-online.service somehow getting hit with an Avahi startup delay? systemd-analyze blame puts it at
315ms avahi-daemon.service


By far the longest pause in console during startup is the last line in this sequence:

[  OK  ] Started Network Manager.
         Starting Network Manager Wait Online...
         Starting Network Manager Script Dispatcher Service...
[  OK  ] Started Network Manager Script Dispatcher Service.

There is about a 1 second pause at "Started Network Manager" and fully 8 seconds of pause at "Started Network Manager Script Dispatcher Service." And I cannot login locally or remotely until this clears.

Comment 4 Jirka Klimes 2014-05-05 13:22:02 UTC
I don't seem to be able to reproduce the problem on a physical machine.

The 4 second delay before "<info> startup complete" looks suspicious to me. It could be a timeout inside NM waiting for some event. It might be related to carried detection and the virtual driver. You may try setting
ignore-carrier=* in NetworkManager.conf to see whether it helps.

In any case it would be helpful to see debugging logs from NM. Can you put these lines to NetworkManager.conf and reboot:
[logging]
level=DEBUG
domains=ALL

Comment 5 Chris Murphy 2014-05-06 03:26:55 UTC
Created attachment 892752 [details]
journalctl with NM debug

Comment 6 Chris Murphy 2014-05-06 03:34:25 UTC
Created attachment 892755 [details]
journalctl with NM debug and ignore-carrier

NetworkManager.conf currently reads:

[main]
plugins=ifcfg-rh
ignore-carrier=*
[logging]
level=DEBUG
domains=ALL

Comment 7 Chris Murphy 2014-05-13 04:10:39 UTC
In the same VM, but a new virtual disk, I've clean installed Rawhide minimal system from netinstall media. The problem is not reproducible, NetworkManager-wait-online.service isn't even a listed service.

Switching to the VDI with the older Rawhide, which is nevertheless as up to date as this one, still has NetworkManager-wait-online.service slowness. It has the same kernel and version of Network Manager as the new install. The two differences between them, oldslow is a Gnome packageset install on Btrfs; the newfast is a Minimal packageset install on ext4 yet NetworkManager completes 18x faster.

So something in the older install with Gnome packageset is pulling in NetworkManager-wait-online.service and also causing a delay. Is it possible this is virbr0-nic or TUN/TAP related? Since Gnome includes Boxes, it pulls in libvirt. And that's sorta pointless to have in a VM. So maybe that's why I have this problem in a VM, while it doesn't happen on baremetal?

# systemd-analyze blame
           725ms firewalld.service
           558ms sshd.service
           275ms NetworkManager.service
           144ms fedora-readonly.service
           126ms systemd-logind.service
           126ms plymouth-read-write.service
           125ms plymouth-start.service
           109ms systemd-udev-trigger.service
            99ms systemd-journal-flush.service
            99ms dev-mqueue.mount
            95ms dev-hugepages.mount
            95ms systemd-fsck-root.service
            91ms auditd.service
            83ms systemd-vconsole-setup.service
            72ms systemd-tmpfiles-setup-dev.service
            63ms polkit.service
            55ms systemd-user-sessions.service
            53ms plymouth-quit-wait.service
            53ms plymouth-quit.service
            47ms kmod-static-nodes.service
            39ms systemd-random-seed.service
            35ms user
            32ms systemd-update-utmp.service
            30ms systemd-remount-fs.service
            26ms systemd-tmpfiles-setup.service
            21ms systemd-sysctl.service
            19ms sys-kernel-debug.mount
            16ms systemd-udevd.service
            14ms tmp.mount
             4ms systemd-update-utmp-runlevel.service
             2ms sys-kernel-config.mount

Comment 8 Jaroslav Reznik 2015-03-03 15:45:30 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 9 Fedora Admin XMLRPC Client 2015-08-18 14:59:42 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 10 Fedora End Of Life 2016-07-19 11:26:54 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.


Note You need to log in before you can comment on or make changes to this bug.