While debugging other issues, we noticed that ldconfig.service, which runs '/sbin/ldconfig -X', appears to run on every Fedora live image boot, and on current F22 live images it takes 20-30 secs in a typical VM. This is substantially slowing down live image boot.

The service is conditional:

ConditionNeedsUpdate=/etc

From the documentation for ConditionNeedsUpdate and the actual intended purpose of the service, I'm not *entirely* sure whether it's appropriate/necessary for it to run on live boot, and if not, whether the responsibility for making it not run should be on systemd or spin-kickstarts. So I'm filing on systemd to start with, but we can re-assign if necessary.

Do we actually want to run this on live boot? If not, is the problem that ConditionNeedsUpdate services shouldn't run on first boot of a system? Or if that's correct, do we need to add a ConditionKernelCommandLine to stop it running from live boots? Thanks!
Proposing as an Alpha FE, on the grounds that if it's appropriate to not run this service it should be a relatively safe change and make Alpha lives boot a lot faster.
~50s boot delay on baremetal and VM is pretty icky. Fedora-Live-Workstation-x86_64-22_Alpha-TC7.iso which uses systemd-219-4. There's nothing in the journal or systemctl status that explains why this takes so long.

[ 21.044998] localhost audispd[1181]: audispd initialized with q_depth=150 and 1 active plugins
[ 71.871867] localhost systemd[1]: Started Rebuild Dynamic Linker Cache.

# systemctl status ldconfig.service
● ldconfig.service - Rebuild Dynamic Linker Cache
   Loaded: loaded (/usr/lib/systemd/system/ldconfig.service; static; vendor preset: disabled)
   Active: active (exited) since Sat 2015-02-28 22:14:40 EST; 34min ago
     Docs: man:ldconfig(8)
  Process: 792 ExecStart=/sbin/ldconfig -X (code=exited, status=0/SUCCESS)
 Main PID: 792 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ldconfig.service

Feb 28 22:13:53 localhost systemd[1]: Starting Rebuild Dynamic Linker Cache...
Feb 28 22:14:40 localhost systemd[1]: Started Rebuild Dynamic Linker Cache.
Discussed at today's blocker review meeting [1]. This bug was accepted as Freeze Exception - This bug has been granted FE status. Please apply a fix before the next compose for testing.

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2015-03-02/
It takes 280 ms in my VM. I have no idea why it is so slow for some people.

Anyway, this service can be safely disabled on a live image. The most obvious way would be to uninstall ldconfig.service, or to mask it. But actually a nicer option would be to touch /etc/.updated and /var/.updated after creating the image (some time after all packages have been installed and /usr will not be touched anymore). This would have the advantage that it would keep things closer to a normal installation and would also prevent any other service which is conditionalized on ConditionNeedsUpdate from needlessly running. systemd itself installs 5 of those, and not running them could shave some significant milliseconds from Live boot.
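For anyone who wants to check an image tree before booting it, here is a rough sketch of the comparison that ConditionNeedsUpdate= is documented to make: the condition holds when /usr has a newer modification time than the .updated stamp in the given directory. The function name needs_update and its ROOT parameter are made up for illustration, and the sketch ignores details such as the read-only file system case, so treat it as an approximation rather than what systemd does internally.

```shell
#!/bin/sh
# needs_update ROOT DIR
# Returns 0 if ConditionNeedsUpdate=DIR would be expected to hold for the
# tree rooted at ROOT, 1 otherwise. ROOT is a hypothetical convenience
# parameter so the check can be pointed at an unpacked image; pass ""
# to look at the running system.
needs_update() {
    root=$1
    dir=$2
    stamp="$root$dir/.updated"

    # Assumption: a missing stamp file is treated as "update needed".
    [ -e "$stamp" ] || return 0

    # -nt is true when the left-hand file has a newer mtime than the right.
    [ "$root/usr" -nt "$stamp" ]
}
```

Usage would be something like `needs_update /path/to/image/root /etc && echo "ldconfig.service would run"`.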
Sure, that can work. We can do it in spin-kickstarts or livecd-tools, I guess. We do also have ConditionKernelCommandLine=!rd.live.image; I think we've used that for other stuff.
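If the kernel command line route were chosen instead, one way to sketch it is a drop-in for ldconfig.service written at image build time. The function name write_live_dropin, its ROOT parameter, and the file name 10-skip-on-live.conf are all invented for this example; only the ConditionKernelCommandLine=!rd.live.image line comes from the comment above. The extra condition is combined with the unit's existing ConditionNeedsUpdate=, so both have to hold for the service to run.

```shell
#!/bin/sh
# write_live_dropin ROOT
# Writes a drop-in that makes ldconfig.service skip itself when the kernel
# command line contains rd.live.image. ROOT is the image root ("" for the
# running system). The drop-in file name is hypothetical.
write_live_dropin() {
    root=$1
    dropin_dir="$root/etc/systemd/system/ldconfig.service.d"

    mkdir -p "$dropin_dir" || return 1

    cat > "$dropin_dir/10-skip-on-live.conf" <<'EOF'
[Unit]
ConditionKernelCommandLine=!rd.live.image
EOF
}
```

Unlike the stamp-file approach, this only affects the one unit it is written for.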
So, I guess something like:

diff --git a/fedora-live-base.ks b/fedora-live-base.ks
index 8f2ddc2..264f118 100644
--- a/fedora-live-base.ks
+++ b/fedora-live-base.ks
@@ -194,6 +194,10 @@ systemctl --no-reload disable atd.service 2> /dev/null || :
 systemctl stop crond.service 2> /dev/null || :
 systemctl stop atd.service 2> /dev/null || :
 
+# don't run ldconfig, it makes boot on live very slow
+systemctl --no-reload disable ldconfig.service 2> /dev/null || :
+systemctl stop ldconfig.service 2> /dev/null || :
+
 # Mark things as configured
 touch /.liveimg-configured

might work? I suppose I could push to rawhide and we can see how well it works there first?
(In reply to Kevin Fenzi from comment #6)
> So, I guess something like:
>
> diff --git a/fedora-live-base.ks b/fedora-live-base.ks
> index 8f2ddc2..264f118 100644
> --- a/fedora-live-base.ks
> +++ b/fedora-live-base.ks
> @@ -194,6 +194,10 @@ systemctl --no-reload disable atd.service 2> /dev/null
> || :
> systemctl stop crond.service 2> /dev/null || :
> systemctl stop atd.service 2> /dev/null || :
>
> +# don't run ldconfig, it makes boot on live very slow
> +systemctl --no-reload disable ldconfig.service 2> /dev/null || :
> +systemctl stop ldconfig.service 2> /dev/null || :

The service is a Type=oneshot service, so it's unlikely to be running, and stopping it is probably useless. The first part would work, but only for this service. Why not do the thing I suggested in comment #c4:

diff --git fedora-live-base.ks fedora-live-base.ks
index 8f2ddc29c3..785c1676c6 100644
--- fedora-live-base.ks
+++ fedora-live-base.ks
@@ -305,6 +305,8 @@ if [ -x /usr/bin/fc-cache ] ; then
 fc-cache -f
 fi
+echo 'File created by kickstart. See systemd-update-done.service(8).' \
+ | tee /etc/.updated >/var/.updated
 %end
OK, let's give it a shot. Pushed to rawhide. Will see how it does tomorrow.
Seems to work. ;) I'll cherry pick it over to f22 branch at some point.
I went ahead and pushed this to f22 also. Please re-open if you see it again.
I can confirm that the fix prevents ldconfig from running, tested with the workstation rawhide nightly build from yesterday. Boot time is about 1min30!

For some reason, NetworkManager-wait-online.service starts during boot of live images and increases the boot time too, by about 30 secs. Note, the unit isn't enabled. In a local mate livecd build I masked the service and the boot time is normal again, about 30-35 secs in a VM. So it looks like another service starts NetworkManager-wait-online.service, or there is another reason for that. I checked the dependencies of all units which are enabled, but none of them seems to start NetworkManager-wait-online.service.
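When hunting for whatever pulls a unit in, a crude static search over the unit files can complement looking at enabled units. The sketch below is an assumption-laden helper (the name find_pullers is made up): it only greps for Wants=, Requires= and BindsTo= lines that name the unit, so it will miss .wants/ symlinks, [Install] sections, and generated units. On a running system, `systemctl list-dependencies --reverse <unit>` is the more complete tool.

```shell
#!/bin/sh
# find_pullers UNIT DIR...
# Prints unit files under the given directories that name UNIT in a
# Wants=, Requires= or BindsTo= line. Static text search only.
find_pullers() {
    unit=$1
    shift

    # Escape regex metacharacters so "." in the unit name matches literally.
    pattern=$(printf '%s' "$unit" | sed 's/[][\.*^$+?(){}|]/\\&/g')

    # Match the unit as a whole word in the space-separated value list.
    grep -rlE "^(Wants|Requires|BindsTo)=(.*[[:space:]])?${pattern}([[:space:]]|\$)" "$@" 2>/dev/null | sort
}
```

For example, `find_pullers network-online.target /usr/lib/systemd/system /etc/systemd/system` lists units that ask for the target, which is worth checking if the wait-online service turns out to be reached through that target rather than directly.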
What about ntpdate or something like that?
Masking network-online.target helps a lot, but NetworkManager-wait-online.service starts nevertheless.

[root@localhost liveuser]# systemd-analyze
Startup finished in 1.274s (kernel) + 3.903s (initrd) + 30.755s (userspace) = 35.934s
[root@localhost liveuser]# systemd-analyze blame
         10.930s firewalld.service
          9.441s livesys.service
          8.095s accounts-daemon.service
          6.350s NetworkManager-wait-online.service
          3.677s rsyslog.service
          3.556s gssproxy.service
          2.785s proc-fs-nfsd.mount
          2.490s lvm2-monitor.service
          1.850s systemd-journald.service
          1.849s fedora-readonly.service
          1.843s dmraid-activation.service
          1.820s polkit.service
          1.744s systemd-logind.service
          1.704s systemd-udev-settle.service
          1.528s NetworkManager.service
          1.475s chronyd.service
          1.460s systemd-tmpfiles-setup-dev.service
          1.368s fedora-import-state.service
          1.338s spice-vdagentd.service
          1.317s systemd-udev-trigger.service
          1.266s avahi-daemon.service
          1.218s tmp.mount
          1.121s kmod-static-nodes.service
           866ms systemd-sysctl.service
           860ms mcelog.service
           825ms plymouth-read-write.service
           807ms udisks2.service
<snip>

[root@localhost liveuser]# systemctl status ntpd ntpdate.service ntpd.service
[root@localhost liveuser]# systemctl status ntpdate.service
● ntpdate.service
   Loaded: masked (/dev/null)
   Active: inactive (dead)
Oops, sorry, I meant mask ntpdate.service.
OK, nfs is the culprit. Good news is that it is already fixed (at least upstream), see see-also-ed bug.
Confirmed. I masked rpc-statd-notify.service and nfs-utils.service; boot time is OK now and NetworkManager-wait-online.service doesn't start.

[liveuser@localhost ~]$ systemd-analyze
Startup finished in 1.310s (kernel) + 4.122s (initrd) + 25.996s (userspace) = 31.429s
[liveuser@localhost ~]$ systemd-analyze blame
         14.127s firewalld.service
         13.048s livesys.service
         12.560s accounts-daemon.service
         10.961s dev-sr0.device
          9.899s dev-mapper-live\x2drw.device
          6.338s spice-vdagentd.service
          5.985s gssproxy.service
          5.979s mcelog.service
          5.970s nfs-config.service
          5.962s rsyslog.service
          5.930s systemd-logind.service
          5.892s rtkit-daemon.service
          5.881s avahi-daemon.service
          5.833s chronyd.service
          4.457s systemd-udev-settle.service
          3.072s proc-fs-nfsd.mount
          2.587s lvm2-monitor.service
          2.408s systemd-journald.service
          2.349s polkit.service
          2.222s udisks2.service
          2.149s systemd-tmpfiles-setup-dev.service
          2.105s fedora-readonly.service
          1.770s systemd-udev-trigger.service
          1.358s systemd-sysctl.service
           833ms dev-mqueue.mount
           753ms NetworkManager.service
<snip>
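For comparing blame listings like the ones in this bug, a small filter that keeps only the slow entries can help. This is a sketch under assumptions: the function name slow_units is made up, and it only understands the plain "<number>s" and "<number>ms" time formats seen above; entries in any other format (for instance ones that include minutes) are silently dropped.

```shell
#!/bin/sh
# slow_units THRESHOLD_SECONDS
# Reads `systemd-analyze blame` output on stdin and prints the lines whose
# time is at least THRESHOLD_SECONDS. Only "<n>s" and "<n>ms" are parsed.
slow_units() {
    awk -v min="$1" '
        { t = -1 }                                   # unknown format: skip
        $1 ~ /^[0-9.]+ms$/ { t = substr($1, 1, length($1) - 2) / 1000 }
        $1 ~ /^[0-9.]+s$/  { t = substr($1, 1, length($1) - 1) + 0 }
        t >= min { print }
    '
}
```

Typical use would be `systemd-analyze blame | slow_units 5`.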
Confirmed that nfs-utils-1.3.2-2.0.fc22 fixes the issue with the mate-compiz live spin (local build). NetworkManager-wait-online.service doesn't start anymore. See https://bugzilla.redhat.com/show_bug.cgi?id=1183293#c14
Is there something planned for F21? I've the same problem with my live remix:

# systemd-analyze blame
         32.897s ldconfig.service
          6.918s systemd-udev-settle.service
          5.101s plymouth-quit-wait.service
          3.042s firewalld.service
          2.658s systemd-udev-hwdb-update.service
          2.270s accounts-daemon.service

Maybe I should apply patch proposed in comment #7 ...
We don't typically change stable releases after they happen (since we don't really have a good process to respin all the media). So, yeah, I'd say apply a version of the patch to your remix.
(In reply to Kevin Fenzi from comment #19)
> We don't typically change stable releases after they happen (since we don't
> really have a good process to respin all the media).
>

Because we don't pay attention to Torvalds' suggestions... ;-)

> So, yeah, I'd say apply a version of the patch to your remix.

OK, I see the change in recent fedora-live-base.ks

Thank you
We pay attention to everyone who suggests it, but talk is cheap and work is expensive. Only a couple of people have been interested in actually doing the work to do respins *properly*, never enough at one time to make it a really official project. J.B. Williams (southern_gentleman on IRC) currently does hemi-demi-semi-unofficial respins, see his blog: https://jbwillia.wordpress.com/
(In reply to Adam Williamson from comment #21)
> J.B. Williams (southern_gentleman on IRC) currently
> does hemi-demi-semi-unofficial respins, see his blog:
> https://jbwillia.wordpress.com/

Interesting. But mine is a fully-totally-wildly-unauthorized source-only-remix:

https://www.assembla.com/code/fedora-remix/subversion/nodes/

cheers