While debugging other issues, we noticed that ldconfig.service - which runs '/sbin/ldconfig -X' - appears to run on every Fedora live image boot, and on current F22 live images, takes 20-30 secs in a typical VM. This is substantially slowing down live image boot.
The service is conditional:
From the documentation for ConditionNeedsUpdate and the actual intended purpose of the service, I'm not *entirely* sure if it's appropriate/necessary for it to run on live boot, and if not, whether the responsibility for making it not run should be on systemd or spin-kickstarts. So I'm filing on systemd to start with, but we can re-assign if necessary.
Do we actually want to run this on live boot? If not, is the problem that ConditionNeedsUpdate services shouldn't run on first boot of a system? Or if that's correct, do we need to add a ConditionKernelCommandLine to stop it running from live boots?
Proposing as an Alpha FE, on the grounds that if it's appropriate to not run this service it should be a relatively safe change and make Alpha lives boot a lot faster.
~50s boot delay on baremetal and VM is pretty icky.
This is with Fedora-Live-Workstation-x86_64-22_Alpha-TC7.iso, which uses systemd-219-4. There's nothing in the journal or systemctl status that explains why this takes so long.
[ 21.044998] localhost audispd: audispd initialized with q_depth=150 and 1 active plugins
[ 71.871867] localhost systemd: Started Rebuild Dynamic Linker Cache.
# systemctl status ldconfig.service
● ldconfig.service - Rebuild Dynamic Linker Cache
Loaded: loaded (/usr/lib/systemd/system/ldconfig.service; static; vendor preset: disabled)
Active: active (exited) since Sat 2015-02-28 22:14:40 EST; 34min ago
Process: 792 ExecStart=/sbin/ldconfig -X (code=exited, status=0/SUCCESS)
Main PID: 792 (code=exited, status=0/SUCCESS)
Feb 28 22:13:53 localhost systemd: Starting Rebuild Dynamic Linker Cache...
Feb 28 22:14:40 localhost systemd: Started Rebuild Dynamic Linker Cache.
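The "Starting" and "Started" lines above bracket the ldconfig run, so the delay can be read straight off them. A minimal sketch of that arithmetic in plain sh, assuming same-day, two-digit HH:MM:SS timestamps; the function names hms_to_secs and elapsed_secs are made up for illustration and are not part of any tool mentioned here:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Convert HH:MM:SS (two-digit fields assumed) to seconds since midnight.
hms_to_secs() {
    old_ifs=$IFS
    IFS=:
    # Split the argument on ":" into $1 $2 $3.
    set -- $1
    IFS=$old_ifs
    # Strip one leading zero so "08" is not parsed as an octal number.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

# Seconds between two same-day timestamps: elapsed_secs START END
elapsed_secs() {
    echo $(( $(hms_to_secs "$2") - $(hms_to_secs "$1") ))
}

# Timestamps taken from the systemctl status output above.
elapsed_secs 22:13:53 22:14:40
```

For this boot that works out to 47 seconds spent in ldconfig.service, in line with the roughly 50 second gap between the two kernel-style timestamps (21.04 and 71.87).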
Discussed at today's blocker review meeting.
This bug was accepted as Freeze Exception - This bug has been granted FE status. Please apply a fix before the next compose for testing.
It takes 280 ms in my VM. I have no idea why it is so slow for some people.
Anyway, this service can be safely disabled on a live image. The most obvious way would be to uninstall ldconfig.service, or to mask it. But actually a nicer option would be to touch /etc/.updated and /var/.updated after creating the image (some time after all packages have been installed and /usr will not be touched anymore). This would have the advantage that it would keep things closer to a normal installation and would also prevent any other service which is conditionalized on ConditionNeedsUpdate from needlessly running. systemd itself installs 5 of those, and not running them could shave some significant milliseconds from Live boot.
Sure, that can work; we can do it in spin-kickstarts or livecd-tools, I guess. We do also have ConditionKernelCommandLine=!rd.live.image, which I think we've used for other stuff.
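For reference, the ConditionKernelCommandLine route could look roughly like the sketch below: a drop-in that adds the condition to ldconfig.service without editing the shipped unit. The helper name add_live_condition and the drop-in file name live.conf are assumptions for illustration; only the condition itself comes from this discussion.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in under ROOT so that ldconfig.service
# is skipped when rd.live.image is present on the kernel command line.
add_live_condition() {
    root=$1
    dir="$root/etc/systemd/system/ldconfig.service.d"
    mkdir -p "$dir"
    # live.conf is an arbitrary name; any *.conf file in this directory is read.
    cat > "$dir/live.conf" <<'EOF'
[Unit]
ConditionKernelCommandLine=!rd.live.image
EOF
}

# Demonstrate against a scratch directory rather than the real root.
demo_root=$(mktemp -d)
add_live_condition "$demo_root"
cat "$demo_root/etc/systemd/system/ldconfig.service.d/live.conf"
```

In a kickstart %post the same helper would be called with an empty ROOT. Unlike the stamp-file approach, this only affects the one unit it is attached to.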
So, I guess something like:
diff --git a/fedora-live-base.ks b/fedora-live-base.ks
index 8f2ddc2..264f118 100644
--- a/fedora-live-base.ks
+++ b/fedora-live-base.ks
@@ -194,6 +194,10 @@ systemctl --no-reload disable atd.service 2> /dev/null || :
systemctl stop crond.service 2> /dev/null || :
systemctl stop atd.service 2> /dev/null || :
+# don't run ldconfig, it makes boot on live very slow
+systemctl --no-reload disable ldconfig.service 2> /dev/null || :
+systemctl stop ldconfig.service 2> /dev/null || :
# Mark things as configured
might work? I suppose I could push to rawhide and we can see how well it works there first?
(In reply to Kevin Fenzi from comment #6)
> So, I guess something like:
> diff --git a/fedora-live-base.ks b/fedora-live-base.ks
> index 8f2ddc2..264f118 100644
> --- a/fedora-live-base.ks
> +++ b/fedora-live-base.ks
> @@ -194,6 +194,10 @@ systemctl --no-reload disable atd.service 2> /dev/null
> || :
> systemctl stop crond.service 2> /dev/null || :
> systemctl stop atd.service 2> /dev/null || :
> +# don't run ldconfig, it makes boot on live very slow
> +systemctl --no-reload disable ldconfig.service 2> /dev/null || :
> +systemctl stop ldconfig.service 2> /dev/null || :
The service is a Type=oneshot service, so it's unlikely to be running, and stopping it is probably useless. The first part would work, but only for this service. Why not do the thing I suggested in comment #4:
diff --git fedora-live-base.ks fedora-live-base.ks
index 8f2ddc29c3..785c1676c6 100644
@@ -305,6 +305,8 @@ if [ -x /usr/bin/fc-cache ] ; then
+echo 'File created by kickstart. See systemd-update-done.service(8).' \
+ | tee /etc/.updated >/var/.updated
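The idea behind the stamp files: ConditionNeedsUpdate is documented as comparing the modification time of /usr with that of the .updated file in /etc or /var, so stamps written after the last package install make the condition false. Below is a small sketch of that check, parameterised on a root directory so it can be tried outside an image build. The function names mark_updated and stamps_current are hypothetical, and the comparison only mirrors the documented mtime rule, not systemd's actual code.

```shell
#!/bin/sh
# Hypothetical helper: create both stamp files under ROOT,
# using the same text as the kickstart change above.
mark_updated() {
    root=$1
    mkdir -p "$root/etc" "$root/var"
    echo 'File created by kickstart. See systemd-update-done.service(8).' \
        | tee "$root/etc/.updated" > "$root/var/.updated"
}

# Hypothetical check: succeed only if both stamps exist and
# ROOT/usr is not newer than either of them.
stamps_current() {
    root=$1
    for stamp in "$root/etc/.updated" "$root/var/.updated"; do
        if [ ! -e "$stamp" ]; then
            return 1
        fi
        if [ "$root/usr" -nt "$stamp" ]; then
            return 1
        fi
    done
    return 0
}

# Try it on a scratch tree.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/usr"
mark_updated "$demo_root"
if stamps_current "$demo_root"; then
    echo "stamps are current"
fi
```

The ordering matters: anything that touches /usr after the stamps are written makes them stale again, which is why the stamps have to be created after all packages have been installed.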
OK, let's give it a shot.
Pushed to rawhide. Will see how it does tomorrow.
Seems to work. ;)
I'll cherry pick it over to f22 branch at some point.
I went ahead and pushed this to f22 also.
Please re-open if you see it again.
I can confirm that the fix prevents ldconfig from running; tested with the Workstation rawhide nightly build from yesterday.
Boot time is about 1min30!
For some reason, NetworkManager-wait-online.service starts during boot of live images and increases the boot time too, by about 30 secs.
Note, the unit isn't enabled.
In a local MATE livecd build I masked the service, and the boot time is normal again: about 30-35 secs in a VM.
So it looks like another service starts NetworkManager-wait-online.service, or there is some other reason for that.
I checked the dependencies of all enabled units, but none of them seems to start NetworkManager-wait-online.service.
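On a running system, `systemctl list-dependencies --reverse NetworkManager-wait-online.service` is the usual way to see what pulls a unit in. For an unbooted image tree, a rough alternative is to scan the unit files for hard or soft dependencies on a given name. The sketch below does that; the function name units_referencing is hypothetical, and it only looks at Wants=, Requires=, Requisite= and BindsTo= lines, so it misses .wants/ symlink directories and ordering-only After= lines.

```shell
#!/bin/sh
# Hypothetical helper: print the unit files in DIR that pull in NAME
# through a Wants=, Requires=, Requisite= or BindsTo= line.
units_referencing() {
    dir=$1
    name=$2
    for unit in "$dir"/*; do
        # Skip directories such as *.wants/ and *.d/.
        [ -f "$unit" ] || continue
        if grep -E '^(Wants|Requires|Requisite|BindsTo)=' "$unit" \
            | grep -qF -- "$name"; then
            basename "$unit"
        fi
    done
}

# Example: which packaged units ask for network-online.target?
units_referencing /usr/lib/systemd/system network-online.target
```

Running it for both network-online.target and NetworkManager-wait-online.service narrows down which packages to look at, including units that are not enabled themselves but are pulled in by something that is.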
What about ntpdate or something like that?
Masking network-online.target helps a lot, but NetworkManager-wait-online.service starts nevertheless.
[root@localhost liveuser]# systemd-analyze
Startup finished in 1.274s (kernel) + 3.903s (initrd) + 30.755s (userspace) = 35.934s
[root@localhost liveuser]# systemd-analyze blame
[root@localhost liveuser]# systemctl status ntpd
[root@localhost liveuser]# systemctl status ntpdate.service
Loaded: masked (/dev/null)
Active: inactive (dead)
I meant mask ntpdate.service.
OK, nfs is the culprit. Good news is that it is already fixed (at least upstream), see see-also-ed bug.
Confirmed: I masked rpc-statd-notify.service and nfs-utils.service; boot time is OK now and NetworkManager-wait-online.service doesn't start.
[liveuser@localhost ~]$ systemd-analyze
Startup finished in 1.310s (kernel) + 4.122s (initrd) + 25.996s (userspace) = 31.429s
[liveuser@localhost ~]$ systemd-analyze blame
Confirmed that nfs-utils-1.3.2-2.0.fc22 fixes the issue with the mate-compiz live spin (local build).
NetworkManager-wait-online.service doesn't start anymore.
Is there something planned for F21? I've the same problem with my live remix:
# systemd-analyze blame
Maybe I should apply patch proposed in comment #7 ...
We don't typically change stable releases after they happen (since we don't really have a good process to respin all the media).
So, yeah, I'd say apply a version of the patch to your remix.
(In reply to Kevin Fenzi from comment #19)
> We don't typically change stable releases after they happen (since we don't
> really have a good process to respin all the media).
Because we don't pay attention to Torvalds' suggestions...
> So, yeah, I'd say apply a version of the patch to your remix.
OK, I see the change in recent fedora-live-base.ks
We pay attention to everyone who suggests it, but talk is cheap and work is expensive. Only a couple of people have been interested in actually doing the work to do respins *properly*, never enough at one time to make it a really official project. J.B. Williams (southern_gentleman on IRC) currently does hemi-demi-semi-unofficial respins, see his blog: https://jbwillia.wordpress.com/
(In reply to Adam Williamson from comment #21)
> J.B. Williams (southern_gentleman on IRC) currently
> does hemi-demi-semi-unofficial respins, see his blog:
Interesting. But mine is a fully-totally-wildly-unauthorized source-only-remix: