Bug 1052173 - fdatasync() in update-mime-database, results in large performance degradation
Summary: fdatasync() in update-mime-database, results in large performance degradation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: shared-mime-info
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Rex Dieter
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1124021
Depends On:
Blocks:
Reported: 2014-01-13 12:53 UTC by Tom Horsley
Modified: 2016-05-11 11:02 UTC (History)
CC List: 12 users

Fixed In Version: shared-mime-info-1.2-7.fc20
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-04 12:30:42 UTC
Type: Bug
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
FreeDesktop.org 70366 0 None None None Never

Description Tom Horsley 2014-01-13 12:53:57 UTC
Description of problem:

The update-mime-database program in f20 takes about 20 minutes to run on my
system. Judging from https://bugs.freedesktop.org/show_bug.cgi?id=70366
the reason is that it deliberately turns on synchronous disc I/O.

This is absurd and ridiculous. Why should this one program consider itself so important that it must sync all I/O? If corruption due to not syncing is such a problem, shouldn't the entire system be switched to always use sync I/O?

To top off the absurdity, the mime database is one of the most incredibly unimportant bits of data on the system. Even if it got corrupted, you could re-run update-mime-database to fix it. Why does it need to slow down by a factor of about 30 to protect unimportant data when nothing else on the system feels that linux disk I/O is so unreliable that other things need the same treatment?

Please change the rpm build script for this to export ac_cv_func_fdatasync=no
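
A minimal sketch of what that request amounts to in a build script. It assumes configure detects fdatasync() through a standard autoconf function check, so that a preset cache variable is honoured; the --prefix value is only illustrative.

```shell
# Sketch of the relevant lines of a build script (for example an RPM %build
# section). Presetting the autoconf cache variable makes configure behave as
# if fdatasync() were unavailable, assuming a standard function check is used.
export ac_cv_func_fdatasync=no

# Run configure only when executed from an unpacked source tree.
if [ -x ./configure ]; then
  ./configure --prefix=/usr
fi
```

When the preset value is picked up, configure normally reports the check result as "(cached)".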


Version-Release number of selected component (if applicable):
shared-mime-info-1.2-1.fc20.x86_64

How reproducible:
Every time I run a yum update, it spends 20 minutes grinding the disk just before the verify step of yum.

Steps to Reproduce:
1. yum update
2. watch cleanup take forever

Actual results:
Dog slow

Expected results:
Speed it used to have before this change

Additional info:

Comment 1 Artur 2014-02-06 07:34:49 UTC
In addition - upgrade to f20 takes ages because of this...
Please, the fix/workaround is so simple...

Comment 2 Daniel Kian Mc Kiernan 2014-02-22 08:05:31 UTC
What happens between the cleanup phase and the verification phase is egregious.  The pause is lengthy, it is unexplained, and the disc is made to work furiously.

Comment 3 Tom Horsley 2014-04-17 11:46:39 UTC
20 minutes ago my disk started grinding at the end of this morning's yum update. It is still grinding and there is no indication it will ever stop. I'm begging you! Please fix this monstrosity!

Comment 4 Daniel Kian Mc Kiernan 2014-04-18 18:52:18 UTC
Let's assume, for the sake of discussion, that there truly is a good reason to use synchronous I/O.

The fact remains that, in between the second stage (clean-up) and the alleged third stage, there is an unacknowledged stage that takes a very long time and entails a significant amount of disc activity.  A great many users will think that things have gone very wrong, and will respond by shutting down.  At the very least, users need to be told that this stage is part of the process.  And all that it takes to do this is one line of output to the console.

The assignee, Bastien Nocera, has time to pirouette in his 'blog and on various social media, but apparently no time to code for that one line of output to the console.  So be it.  Would it be possible to get a different assignee, with different priorities?

Comment 5 Rex Dieter 2014-05-20 14:02:50 UTC
I'm hitting this badly too at our site, where we are starting to do wide f20 deployments. I'll see about moving this along.

Bastien's upstream comment:
"The correct way to fix this is for your system not to run update-mime-database for each package, but for the whole set instead ("triggers")."

Unfortunately, that doesn't mesh with Fedora's current packaging guidelines:
https://fedoraproject.org/wiki/Packaging:ScriptletSnippets?rd=Packaging/ScriptletSnippets#mimeinfo

As far as I'm aware, update-mime-database currently doesn't have smarts like gtk-update-icon-cache to know whether its cache is current, which it would need in order to take advantage of a %posttrans optimization that runs it only once per transaction, like:
https://fedoraproject.org/wiki/Packaging:ScriptletSnippets?rd=Packaging/ScriptletSnippets#Icon_Cache

Please correct me if I'm wrong, and I'll gladly work to get the guidelines amended to take advantage of it.  If I'm right, update-mime-database would benefit greatly from having some mechanism like this (for example, similar to the icon cache, checking the timestamp of /usr/share/mime).
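
A rough sketch of the kind of freshness check described here, loosely modelled on the icon-cache idea. The function name mime_cache_is_stale is hypothetical and is not something update-mime-database provides. It assumes the usual layout in which the XML sources live in <dir>/packages and the generated cache is <dir>/mime.cache.

```shell
# Hypothetical helper: exit 0 if the cache is missing, or if the packages
# directory or anything inside it has been modified since the cache was written.
mime_cache_is_stale() {
  mime_dir=$1
  cache="$mime_dir/mime.cache"

  # No cache at all: it certainly needs to be generated.
  [ -e "$cache" ] || return 0

  # find lists the directory itself (its mtime changes when files are added
  # or removed) and any file in it that is newer than the cache.
  [ -n "$(find "$mime_dir/packages" -newer "$cache" 2>/dev/null | head -n 1)" ]
}

if mime_cache_is_stale /usr/share/mime; then
  echo "cache is stale: update-mime-database /usr/share/mime would be needed"
else
  echo "cache is current: nothing to do"
fi
```

With a check like this, a scriptlet could be run once per transaction and skip the rebuild when nothing has changed.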

Lacking any mechanism for the aforementioned issue, I'll lobby to implement something like the initial suggestions in this bug to restore reasonable performance.

Comment 6 Rex Dieter 2014-05-20 14:15:30 UTC
Just so we have some numbers here (and not just in the upstream bug)...

On my oldish ~6 yr-old laptop:

fedora stock update-mime-database call time: ~34 seconds
rebuild without fdatasync u-m-d call time: ~0.7 seconds

Now, imagine rpm transactions that include 5, 10, 20 (or more) such calls. :(
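
To put rough numbers on that, here is a small back-of-the-envelope calculation. It assumes every call costs the same as the single measurements above, which real transactions may not; the helper name total_seconds is only for illustration.

```shell
# Per-call timings reported above, in seconds.
with_sync=34
without_sync=0.7

# total_seconds CALLS PER_CALL: print CALLS * PER_CALL with one decimal place.
total_seconds() {
  awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n * t }'
}

for calls in 5 10 20; do
  echo "$calls calls: ~$(total_seconds "$calls" "$with_sync") s with fdatasync," \
       "~$(total_seconds "$calls" "$without_sync") s without"
done
```

Under that assumption, 20 calls come to roughly 680 seconds (over 11 minutes) with fdatasync versus about 14 seconds without.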

Comment 7 Tom Horsley 2014-05-20 14:35:33 UTC
On my system at work where the disk sounds like it is being tortured for several minutes at a time, I can't believe it is good for the long term health of the disk either.

I've seriously considered adding a system wide LD_PRELOAD environment variable to override fdatasync() so it is a no-op, but the thought that there might be some legitimate use for the call has stopped me.

Comment 8 Rex Dieter 2014-05-20 15:33:23 UTC
Colin, what do you think of my proposed way forward in comment #5?

Comment 9 Colin Walters 2014-05-20 15:52:32 UTC
All my opinions are in the upstream https://bugs.freedesktop.org/show_bug.cgi?id=70366

Comment 10 Rex Dieter 2014-05-20 17:04:45 UTC
The only comment I see is that you'd support some environment variable to turn the sync off, via something like PKGSYSTEM_DISABLE_FSYNC=1

That doesn't help fix this in fedora 20.  I'm advocating reverting the sync requirement now, until something better comes along.

Comment 11 Fedora Update System 2014-05-21 15:55:02 UTC
shared-mime-info-1.2-4.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/shared-mime-info-1.2-4.fc20

Comment 12 Rex Dieter 2014-05-21 15:59:27 UTC
This test build enables support for PKGSYSTEM_ENABLE_FSYNC environment variable to opt-in to the new safer (but slower) writes as suggested in
https://bugs.freedesktop.org/show_bug.cgi?id=70366#c19

Comment 13 Fedora Update System 2014-05-23 18:54:24 UTC
Package shared-mime-info-1.2-4.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing shared-mime-info-1.2-4.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-6601/shared-mime-info-1.2-4.fc20
then log in and leave karma (feedback).

Comment 14 Rex Dieter 2014-06-26 11:59:24 UTC
Update: Bastien "fixed" the upstream bug report by making PKGSYSTEM_ENABLE_FSYNC opt-out and default on, which doesn't help our use-case here.

I guess I'll adapt the patch to the upstream implementation, except flip the default off instead.

Comment 15 Bastien Nocera 2014-06-26 12:16:16 UTC
(In reply to Rex Dieter from comment #14)
> Update: Bastien "fixed" the upstream bug report by making
> PKGSYSTEM_ENABLE_FSYNC opt-out and default on, which doesn't help our
> use-case here.
> 
> I guess I'll adapt the patch to the upstream implementation, except flip the
> default off instead.

Don't. Make rpm/yum/whatever take their responsibility and use the envvar.

Comment 16 Tom Horsley 2014-06-26 12:20:46 UTC
Or just add a system wide /etc/profile file to the package that sets the envvar.

Comment 17 Rex Dieter 2014-06-26 12:58:47 UTC
I understand your point of view Bastien, but I respectfully disagree.  I would argue if you want that approach, fixing rpm/yum/whatever is a *prerequisite* to (re)enabling fdatasync by default.  Fixing the performance regression is more important.

Comment 18 Rex Dieter 2014-06-26 13:04:17 UTC
I've revoked
https://admin.fedoraproject.org/updates/FEDORA-2014-6601
for now to respect your formal opposition.

Would you be open to me asking FESCo to arbitrate?

Comment 19 Bastien Nocera 2014-06-26 14:03:57 UTC
(In reply to Rex Dieter from comment #18)
> I've revoked
> https://admin.fedoraproject.org/updates/FEDORA-2014-6601
> for now to respect your formal opposition.
> 
> Would you be open to me asking FESCo to arbitrate?

I'll drop Fedora maintainership of this package if I have to justify myself again. I've already done so as the upstream maintainer.

Comment 20 Rex Dieter 2014-06-26 14:15:14 UTC
I'm sorry you feel that way. 

I do continue to feel strongly about this bug, however, and we do still have an impasse and difference of opinions.  I will likely move forward with asking FESCo to look at this.

I'll be happy to help maintain this package in fedora moving forward, if you feel you cannot or do not want to continue doing so.

Comment 21 Rex Dieter 2014-06-26 15:31:56 UTC
Fyi,
https://fedorahosted.org/fesco/ticket/1318

please add your opinion, especially if you feel I mischaracterized matters in any way.

Comment 22 Kevin Kofler 2014-06-26 15:58:58 UTC
I'm with Rex Dieter, the performance impact of the fdatasync is way too high here to enable it by default, at least at this time. Thus, it ought to be disabled by default (at least for now).

Comment 23 Tom Horsley 2014-06-26 16:29:55 UTC
Or removed completely. I can't understand why 99.9999999% of all apps are perfectly able to write to disk without using fdatasync yet somehow the mime database creation tool deems it absolutely essential?

The heck with performance being more important (which it is), why should fdatasync ever be involved for any reason? I can't begin to imagine any reason for it.

Comment 24 Daniel Kian Mc Kiernan 2014-06-26 16:47:16 UTC
fdatasync was involved after real-world problems were observed. I very much doubt that this is the place to revisit that decision, especially as there are multiple ways of mitigating the problem that so provokes us here, without such a revisitation.

Comment 25 Rex Dieter 2014-06-26 16:55:12 UTC
Do you have any details about these real-world problems?

Lacking such details...

Given that this output can be regenerated easily in < 1 second, I find fdatasync to be a great over-reaction...

Comment 26 Daniel Kian Mc Kiernan 2014-06-26 17:33:32 UTC
It's an over-reaction _here_.  And no one seems actually to dispute that it's an over-reaction here -- M. Nocera simply seems to feel that an extended period of suffering will, uhm, build character? amongst Fedora users -- so I'm not going to dig into that history.  If you feel that the problem is _deeper_, then a more general bug report needs to be opened.

Comment 27 Rex Dieter 2014-07-02 18:42:45 UTC
FESCo decided in today's meeting not to override the default behavior here,
https://fedorahosted.org/fesco/ticket/1318
So, I'll prepare an update to comply with that policy decision.

Comment 28 Fedora Update System 2014-07-02 18:47:15 UTC
shared-mime-info-1.2-7.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/shared-mime-info-1.2-7.fc20

Comment 29 Kevin Kofler 2014-07-02 20:46:52 UTC
According to the FESCo ticket, FESCo decided *not to require* overriding the default behavior, which is different from *requiring to not* override it.

Comment 30 Daniel Kian Mc Kiernan 2014-07-02 20:49:09 UTC
In some other context, that distinction would be very important; but, in this context, proceeding as if FESCo has green-lighted an over-ride would lead mostly to heartache.

Comment 31 Tom Horsley 2014-07-02 20:54:02 UTC
So, since I can't actually interpret all the double negatives here, can anyone tell me what system wide environment variable I want set to which value at all times to make sure this idiocy doesn't ever call fdatasync?

Comment 32 Rex Dieter 2014-07-03 00:04:53 UTC
Explicitly setting env var PKGSYSTEM_ENABLE_FSYNC=0 will disable it.
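
For illustration, the two usual ways of passing such a variable to a command are sketched below. The child shell here only echoes the value back; the update-mime-database invocation is left as a comment because it needs root and an installed package.

```shell
# Per-command: only this one invocation sees the variable.
PKGSYSTEM_ENABLE_FSYNC=0 sh -c 'echo "child sees: $PKGSYSTEM_ENABLE_FSYNC"'

# Session-wide: every command started from this shell afterwards inherits it.
export PKGSYSTEM_ENABLE_FSYNC=0
sh -c 'echo "child sees: $PKGSYSTEM_ENABLE_FSYNC"'

# The real call would then look like this (as root):
#   PKGSYSTEM_ENABLE_FSYNC=0 update-mime-database /usr/share/mime
```

Both echo lines print "child sees: 0". The variable only has an effect if it is present in the environment of the process that actually runs update-mime-database.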

Comment 33 Fedora Update System 2014-07-03 04:04:09 UTC
Package shared-mime-info-1.2-7.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing shared-mime-info-1.2-7.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-8002/shared-mime-info-1.2-7.fc20
then log in and leave karma (feedback).

Comment 34 Harald Reindl 2014-07-03 12:13:12 UTC
BTW: this problem exists on an otherwise hyperfast ext4 RAID10 unless "nobarrier" is set explicitly; mounting the rootfs with "nobarrier" makes the hammering on the disks go away. (It took a long time to realize that filesystem barriers, on the only machine without a UPS, were the reason for the big difference.)

Comment 35 Ahmad Samir 2014-07-04 08:36:49 UTC
Thanks for this fix (now I can stop building shared-mime-info locally).

FWIW, if you're using sudo with yum you'll need to explicitly set that env var on the cli:
sudo PKGSYSTEM_ENABLE_FSYNC=0 yum install foo

Just setting the env var in /etc/profile.d/* doesn't affect sudo.

Comment 36 Tom Horsley 2014-07-04 12:15:27 UTC
Or you can add that env var to the env_keep setting in /etc/sudoers:

tomh> sudo printenv | fgrep SYNC
PKGSYSTEM_ENABLE_FSYNC=0

Ta-Da!
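
For anyone who wants the actual setting, a sketch of the sudoers line follows. It only stages the line in a temporary file, so nothing under /etc is touched; where it finally goes (the main /etc/sudoers or a drop-in file) is up to the administrator.

```shell
# Stage the sudoers line in a temporary file for review.
staged=$(mktemp)
cat > "$staged" <<'EOF'
# Let PKGSYSTEM_ENABLE_FSYNC pass through sudo's environment reset.
Defaults env_keep += "PKGSYSTEM_ENABLE_FSYNC"
EOF

cat "$staged"
```

Before installing it, checking the staged file with visudo -cf "$staged" is a sensible precaution, since a syntax error in sudoers can lock you out of sudo.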

Comment 37 Fedora Update System 2014-07-04 12:30:42 UTC
shared-mime-info-1.2-7.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 38 Rex Dieter 2014-07-30 01:24:48 UTC
*** Bug 1124021 has been marked as a duplicate of this bug. ***

Comment 39 Christian Tosta 2016-05-11 04:20:46 UTC
This bug is back in Fedora 23. Using DNF for system updates.

[root@metallica ~]# rpm -q shared-mime-info
shared-mime-info-1.5-2.fc23.x86_64

[root@metallica ~]# rpm -q dnf
dnf-1.1.8-1.fc23.noarch

Comment 40 Harald Reindl 2016-05-11 11:02:36 UTC
"nobarrier,data=writeback" combined with battery backups for the win....

