Bug 1282232

Summary: mlocate-updatedb.service doubles time taken to boot
Product: [Fedora] Fedora Reporter: Ron Yorston <rmy>
Component: mlocateAssignee: Michal Sekletar <msekleta>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: unspecified    
Version: 30CC: alexey.brodkin, fabrice.salvaire, hedayatv, mislav.stublic, msekleta, sassmann
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: mlocate-0.26-13.fc23 mlocate-0.26-14.fc24 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-26 17:47:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1708042    
Bug Blocks:    
Attachments:
Description Flags
Output of systemd-analyze plot after a 24 hour shutdown
none
Output of systemd-analyze plot after a short shutdown
none
Output of systemd-analyze plot with mlocate-0.26-13.fc23
none
Boot plot, first boot, 0.26-13.fc23, RandomizedDelaySec=3600
none
Boot plot, second boot, 0.26-13.fc23, RandomizedDelaySec=3600
none
mlocate entries in log, first boot
none
mlocate entries in log, second boot
none
Output of journalctl -b
none
Output of iotop during boot
none
Revert to using cron to run update
none
Revert to using cron to run update
none
Revert to using cron to run update none

Description Ron Yorston 2015-11-15 17:14:48 UTC
Created attachment 1094517 [details]
Output of systemd-analyze plot after a 24 hour shutdown

Description of problem:

If my laptop has been switched off for more than 24 hours the mlocate-updatedb timer causes the mlocate-updatedb service to run during boot.  This results in the boot taking over 2 minutes rather than just under 1 minute.


Version-Release number of selected component (if applicable):

mlocate-0.26-12.fc23.x86_64


How reproducible:

I've used systemd-analyze to examine two boots after a 24 hour shutdown, each followed by a reboot.  In both cases the results were the same.


Steps to Reproduce:
1. Leave a machine switched off for 24 hours.
2. Switch it on.
3.

Actual results:
The machine takes two minutes to boot.

Expected results:
The machine takes one minute to boot.

Additional info:

The mlocate-updatedb timer is set to run daily and has 'Persistent=true'.  According to the systemd.timer man page the latter means that:

When the timer is activated, the service unit is triggered immediately if it would have been triggered at least once during the time when the timer was     inactive.

When the machine boots after being off for 24 hours or more the timer is triggered during the boot process.  This causes the mlocate-updatedb service to run immediately, rattling the hard disk and causing the boot to take much longer than usual.

Comment 1 Ron Yorston 2015-11-15 17:15:55 UTC
Created attachment 1094518 [details]
Output of systemd-analyze plot after a short shutdown

Comment 2 Ron Yorston 2015-11-16 08:30:10 UTC
Actually, I was wrong.  The machine doesn't have to be off for 24 hours to exhibit the problem.  It seems it just has to be off at midnight when the daily timer would usually run.

When I started the machine this morning it took 115 seconds to boot to the GDM login despite only having been off for ten hours.

Comment 3 Hedayat Vatankhah 2015-12-31 08:30:12 UTC
systemd's lack of random delay functionality of cron is hitting us. I've seen that there is a feature request for that. But till then, it seems that manually putting a random sleep before running updatedb is the workaround. I'd suggest either reverting to using cron for running updatedb for now, or to put a random or specific sleep before running updatedb: e.g. sleep 1h 
It is really annoying to see that sometimes your system takes 3 minutes to boot.

Comment 4 M Hickford 2016-03-16 06:41:00 UTC
Via http://unix.stackexchange.com/questions/270005/why-is-boot-slow

Can anyone describe a workaround?

Comment 5 Ron Yorston 2016-03-16 08:11:08 UTC
I've just modified the timer to run once a week rather than every day:

sed 's/daily/weekly/' /usr/lib/systemd/system/mlocate-updatedb.timer >/etc/systemd/system/mlocate-updatedb.timer

Now I only have to tolerate the slow boot on Monday.

Comment 6 Hedayat Vatankhah 2016-03-16 11:45:31 UTC
I'd suggest the following workaround instead: delay mlocate-updatedb.service startup for a few minutes (e.g. 10 minutes) so that if it need to run at system boot, it is started after awhile.

The following will do this, and it won't be replaced if mlocate package is updated:

-----------------------------------------------------------------------------
mkdir /etc/systemd/system/mlocate-updatedb.service.d
cat <<EOF > /etc/systemd/system/mlocate-updatedb.service.d/mlocate-updatedb.conf
[Service]
ExecStart=
ExecStart=/bin/sleep 10m
ExecStart=/usr/libexec/mlocate-run-updatedb
EOF
-----------------------------------------------------------------------------

This has the advantage that it should prevent slow boot (Not tested yet, but it should do so, and also will still update locate database everyday.

Comment 7 Hedayat Vatankhah 2016-03-16 11:50:47 UTC
And I found a solution (at least for updated Fedora 23): systemd timers now have a new option: RandomizedDelaySec, which seems to provide the solution for this problem.So, adding a line like this should fix the problem too:
RandomizedDelaySec=30m

(Notice that the delay is still random, so sometimes it might be very small and cause slow boots.)

Comment 8 Fedora Update System 2016-03-18 14:14:15 UTC
mlocate-0.26-13.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-dd988b6d40

Comment 9 Fedora Update System 2016-03-18 14:15:23 UTC
mlocate-0.26-14.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-85f0a2950b

Comment 10 Fedora Update System 2016-03-19 01:23:18 UTC
mlocate-0.26-13.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-dd988b6d40

Comment 11 Ron Yorston 2016-03-19 07:47:49 UTC
I installed mlocate-0.26-13.fc23 yesterday and the boot this morning was even slower than before, nearly three minutes.

It's really not a good idea to have mlocate competing for disk bandwidth during boot.

I tried RandomizedDelaySec the day before yesterday but got some odd results, I'll need to investigate that option further.

Comment 12 Ron Yorston 2016-03-19 07:49:22 UTC
Created attachment 1137983 [details]
Output of systemd-analyze plot with mlocate-0.26-13.fc23

Comment 13 Hedayat Vatankhah 2016-03-19 08:45:26 UTC
I've not tried the new update, nor RandomizedDelaySec yet. And I agree with Ron that running an IO intensive process during boot is not a good idea anyway, even if it is done in parallel (and it is not weird to see it slows-down boot even more, as it can cause more random reads). 

To me, sleep solution is even better than unpredictable RandomizedDelaySec: it makes sure that updatedb doesn't run during boot, and not right after boot in which I usually start the applications I need. 

But sleep solution looks like a hack, and RandomizedDelaySec seems more appropriate. However, I'd say that systemd should probably never run pending timer events during boot, and schedule them after boot instead. I wonder if there is a case where delaying starting timer units after boot could be harmful. I'm considering filling a separate bug/feature request against systemd.

Comment 14 Hedayat Vatankhah 2016-03-19 08:48:43 UTC
(In reply to Ron Yorston from comment #11)
> ...
> I tried RandomizedDelaySec the day before yesterday but got some odd
> results, I'll need to investigate that option further.

Make sure that you add it to .timer file rather than the .service file.

Comment 15 Fedora Update System 2016-03-20 03:58:36 UTC
mlocate-0.26-14.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-85f0a2950b

Comment 16 Ron Yorston 2016-03-21 08:34:57 UTC
I left the configuration unchanged on the 19th (so still running stock 0.26-13.fc23).  The first boot on 20th took two and a half minutes.

After that I changed the timer configuration, adding 'RandomizedDelaySec=3600'.

Today, the 21st, the first boot took two and a half minutes and a subsequent reboot took one and a quarter minutes.  So the situation is unchanged:  the first boot of the day still takes twice as long as subsequent boots.

I have no idea why RandomizedDelaySec seems to have had no effect.  Nor do I know why the log of the first boot shows the timer being started twice with different random delays.

I'll attach plots and logs for the two boots today.

Comment 17 Ron Yorston 2016-03-21 08:36:38 UTC
Created attachment 1138484 [details]
Boot plot, first boot, 0.26-13.fc23, RandomizedDelaySec=3600

Comment 18 Ron Yorston 2016-03-21 08:37:46 UTC
Created attachment 1138485 [details]
Boot plot, second boot, 0.26-13.fc23, RandomizedDelaySec=3600

Comment 19 Ron Yorston 2016-03-21 08:38:40 UTC
Created attachment 1138487 [details]
mlocate entries in log, first boot

Comment 20 Ron Yorston 2016-03-21 08:39:11 UTC
Created attachment 1138488 [details]
mlocate entries in log, second boot

Comment 21 Fedora Update System 2016-03-22 19:52:32 UTC
mlocate-0.26-13.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 22 Ron Yorston 2016-03-22 21:20:17 UTC
mlocate-0.26-13.fc23 does not fix this bug.

Comment 23 Michal Sekletar 2016-03-24 09:00:52 UTC
(In reply to Ron Yorston from comment #22)
> mlocate-0.26-13.fc23 does not fix this bug.

Does mlocate-updatedb.service block login or boot is overall slower due to heavy I/O activity?

Comment 24 Ron Yorston 2016-03-24 19:33:33 UTC
>Does mlocate-updatedb.service block login

I don't know how to answer this question.  How can I tell if the service is blocking login?

>or boot is overall slower due to heavy I/O activity?

How can I measure I/O activity during boot?

Comment 25 Mislav Stublic 2016-03-25 07:39:13 UTC
(In reply to Michal Sekletar from comment #23)
> Does mlocate-updatedb.service block login or boot is overall slower due to
> heavy I/O activity?

Login is not blocked but the boot takes a long time. When you eventually get to login screen everything is fine.

Also, I can confirm/reproduce this bug, so if some logs are needed i would be glad to help. The wierd thing is that if you do updatedb immediately after boot it still takes a long time, as if db was not updated at all. The other thing is that i'm not sure why it should take that long every day, it is not as if much has changed on filesystem in one day, at least not on my laptop.

It is really annoying, especially if you go to sleep before midnight as many working people do, you have to wait two minutes to boot the next day. Here is systemd-analyze blame from this morning:
"1min 44.635s mlocate-updatedb.service
 1min 1.633s plymouth-quit-wait.service
..."

Comment 26 Fedora Update System 2016-03-26 17:58:51 UTC
mlocate-0.26-14.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 27 Michal Sekletar 2016-03-29 15:03:24 UTC
(In reply to Ron Yorston from comment #24)
> >Does mlocate-updatedb.service block login
> 
> I don't know how to answer this question.  How can I tell if the service is
> blocking login?

If it is not blocking login then you should be able to login before it finishes. If service doesn't exit in the meantime then you should be able to see it is still running when you execute systemctl status mlocate-updatedb.service.

> 
> >or boot is overall slower due to heavy I/O activity?
> 
> How can I measure I/O activity during boot?

Maybe something more sophisticated exists but I think that running iotop -toa from unit file can give you some basic insight.

Comment 28 Ron Yorston 2016-03-29 21:07:01 UTC
Well, I guess the service isn't blocking login because ps showed that it was still running after I logged in today.  And the systemctl command said that the service became inactive about 50 seconds after I logged in.

I'll see if I can get anything from iotop tomorrow.

Comment 29 Ron Yorston 2016-03-31 07:48:23 UTC
Created attachment 1142089 [details]
Output of journalctl -b

First boot today.  The highlights:

Start boot     09:12:19 (?)
mlocate starts 08:12:54
rmy login      08:14:44
mlocate stops  08:15:37

Comment 30 Ron Yorston 2016-03-31 07:50:12 UTC
Created attachment 1142090 [details]
Output of iotop during boot

Much i/o by updatedb.

Comment 31 Hedayat Vatankhah 2016-06-27 10:50:21 UTC
(In reply to Hedayat Vatankhah from comment #14)
> (In reply to Ron Yorston from comment #11)
> > ...
> > I tried RandomizedDelaySec the day before yesterday but got some odd
> > results, I'll need to investigate that option further.
> 
> Make sure that you add it to .timer file rather than the .service file.

RandomizedDelaySec doesn't work. Apparently, it only affects the scheduled time of running the timer (e.g. run on 00:10 rather than 00:00), but doesn't affect running the service immediately if it is a Persistent service and has not run on time. So, the service will still run immediately on boot if the system was off when the service was supposed to run.

---
And about the sleep workaround given in #6, it no longer works since the service is not of type oneshot anymore. sleep can be added to ExecStartPre= instead, but it should not take long otherwise it'll be killed by systemd. 1min works for me.

Comment 32 Fedora End Of Life 2016-11-24 13:27:04 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 23 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 33 Ron Yorston 2016-11-25 08:48:25 UTC
Hedayat, thanks for updating the version.

I've just booted my new F25 installation on the laptop where I first noted this issue and can confirm that it took over two minutes.  The problem persists.

For internal deployment I've made my own mlocate RPM which reverts the use of systemd services in favour of cron.

Comment 34 Ron Yorston 2017-07-16 08:35:34 UTC
I've installed Fedora 26 and this bug persists.  The first boot of the day takes just under two minutes while subsequent boots take just under a minute.

I know there's never a good time to update the mlocate database:  it will consume i/o bandwidth which may prove to be annoying.  However, performing the update while the system is booting is almost certainly the worst possible time.

If it really is impossible to configure the systemd timer/service to avoid running the update during boot I strongly recommend reverting to using cron, which is the solution I've adopted for my systems.

Comment 35 Fedora End Of Life 2017-11-16 19:53:04 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged  change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 36 Hedayat Vatankhah 2018-04-16 09:09:19 UTC
Setting IOSchedulingClass=3 seems to help here. Although I'm still not sure (not tried to actually boot with this settings when updatedb should be run to see if it doesn't affect boot at all).

Anyway, this bug has a really bad impression on Fedora users. If cannot be fixed using systemd, I highly suggest using cron again.

Comment 37 Ron Yorston 2018-05-03 07:17:30 UTC
Still an issue in F28.

First boot:

Startup finished in 2.563s (kernel) + 4.992s (initrd) + 2min 25.399s (userspace) = 2min 32.955s
graphical.target reached after 2min 24.863s in userspace

Subsequent boot:

Startup finished in 2.585s (kernel) + 4.782s (initrd) + 48.069s (userspace) = 55.436s
graphical.target reached after 48.047s in userspace

Comment 38 Ron Yorston 2018-05-03 07:18:43 UTC
Created attachment 1430494 [details]
Revert to using cron to run update

Comment 39 Ron Yorston 2018-11-03 21:10:46 UTC
Still an issue in F29.

First boot:

Startup finished in 2.230s (kernel) + 5.154s (initrd) + 2min 21.354s (userspace) = 2min 28.739s
graphical.target reached after 2min 20.116s in userspace

Subsequent boot:

Startup finished in 2.211s (kernel) + 5.030s (initrd) + 53.406s (userspace) = 1min 649ms
graphical.target reached after 53.384s in userspace

Comment 40 Ron Yorston 2018-11-03 21:12:01 UTC
Created attachment 1501092 [details]
Revert to using cron to run update

Comment 41 Fabrice Salvaire 2019-02-12 21:56:45 UTC
I tested IOSchedulingClass=idle. It improves the responsiveness of the machine but still not perfect.

It is a shame to have such bug on Linux in 2019. All theses services : mlocate, tracker ... (IO bandwith) and package update (network bandwith) can literally overload a computer during 15 min after boot.

Comment 42 Hedayat Vatankhah 2019-02-13 04:03:48 UTC
Yeah, it's a shame. And only because of jumping to systemd with no real benefit that I'm aware of. The maintainer simply ignores it, and apparently many either don't use HDDs or don't boot their machines often, or their machine is ON on that time. But this is so shameful that I'm afraid of telling anyone to use Fedora.

The only hope I have is that there is a PR upstream now that will (hopefully) fix the problem if accepted, and if the maintainer uses the new switch. But suffering from such a bug since 2015 is really shameful.

Comment 43 Hedayat Vatankhah 2019-02-21 23:00:11 UTC
Well, finally the upstream bug is fixed: https://github.com/systemd/systemd/issues/5659
Apparenlty, the RandomizedDelaySec will also work in this case when the patch is applied.

Comment 44 Ron Yorston 2019-05-07 18:35:48 UTC
Still an issue in Fedora 30.  First boot of the day 1 min 54s, second boot 53s.

Comment 45 Ron Yorston 2019-05-07 18:37:10 UTC
Created attachment 1565327 [details]
Revert to using cron to run update

Comment 46 Hedayat Vatankhah 2019-05-09 03:32:37 UTC
Looks like it is fixed in systemd 242, which is not in F30. I wonder if it can be backported to F30 systemd, specially that it is a two line patch.

Comment 47 Hedayat Vatankhah 2019-05-09 03:47:41 UTC
I've filled bug #1708042 against systemd. When merged, AFAIK adding RandomizedDelaySec to mlocate service file should fix the problem.

Comment 48 Hedayat Vatankhah 2019-11-05 10:12:54 UTC
I can confirm that adding RandomizedDelaySec to mlocate-updatedb.timer will fix the problem in F31 with systemd 243. 
I've added this file: /etc/systemd/system/mlocate-updatedb.timer.d/delay.conf with this content and it nicely makes the updatedb to (usually) not happen at boot time. You can increase the value to make it less likely to happen soon after bootup:

[Timer]
RandomizedDelaySec=2000

Comment 49 Ben Cotton 2020-04-30 20:45:37 UTC
This message is a reminder that Fedora 30 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 30 on 2020-05-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '30'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 30 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 50 Ben Cotton 2020-05-26 17:47:21 UTC
Fedora 30 changed to end-of-life (EOL) status on 2020-05-26. Fedora 30 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.