Bug 1461765

Summary: [RFE] Call OS sync after transaction
Product: [Fedora] Fedora Reporter: Lukas Zapletal <lzap>
Component: rpm Assignee: Packaging Maintenance Team <packaging-team-maint>
Status: CLOSED UPSTREAM QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: goodmirek, igor.raits, kardos.lubos, mjw, packaging-team-maint, pmatilai, rpm-software-management, vmukhame
Last Closed: 2018-08-10 12:35:56 UTC Type: Bug

Description Lukas Zapletal 2017-06-15 09:36:16 UTC
Hello,

I had a power outage after a dnf transaction completed but before the ext4 default commit interval of 5 seconds had elapsed. I think calling the OS syscall "sync/fsync" can't hurt much, and it could prevent these situations.

A simple core plugin could do it, but this feature is only useful when enabled by default.
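
The request can be illustrated with a minimal Python sketch. This is not dnf or rpm code: `run_transaction`, `run_transaction_and_sync`, and the file names are hypothetical stand-ins. It only shows the ordering being asked for, namely writing the files first and calling `os.sync()` (the Python wrapper for the sync syscall, available on Unix) before the transaction is reported as finished.

```python
import os
import tempfile


def run_transaction(target_dir, files):
    """Hypothetical stand-in for a package transaction: write each file."""
    written = []
    for name, data in files.items():
        path = os.path.join(target_dir, name)
        with open(path, "wb") as fh:
            fh.write(data)
        written.append(path)
    return written


def run_transaction_and_sync(target_dir, files):
    """Run the transaction, then ask the kernel to flush dirty buffers."""
    written = run_transaction(target_dir, files)
    # Without this call the data may sit in the page cache until the
    # filesystem's periodic commit, which is the window described above.
    os.sync()
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_transaction_and_sync(tmp, {"libdemo.so": b"\x7fELF"}))
```

A single sync at the end keeps the cost to one call per transaction, at the price of flushing unrelated dirty data on the system as well.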

Comment 1 Igor Gnatenko 2017-06-21 07:44:01 UTC
dnf doesn't care about what happens after the transaction; this is for RPM.

Comment 2 Panu Matilainen 2017-09-06 06:18:01 UTC
Yeah it seems like a no-brainer and mildly hysterical that rpm hasn't done this. Added upstream.

Comment 3 Panu Matilainen 2018-08-10 12:37:07 UTC
Um, it's upstream but not in 4.14.

Comment 4 Igor Raits 2018-08-10 12:51:16 UTC
(In reply to Panu Matilainen from comment #3)
> Um, it's upstream but not in 4.14.

Argh, didn't notice.

Comment 5 Mirek Svoboda 2019-09-26 19:04:32 UTC
Could you please let me know whether this fix is already included in RPM version 4.14.2.1?

My computer lost power a few seconds after a DNF transaction finished successfully.
It then failed to boot, and I found multiple zero-size files in /lib64.
I am trying to figure out what could have caused that.
The filesystem is XFS, the OS is Fedora 30, running kernel 5.3.1.
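
For anyone checking a system for the same symptom, a small Python sketch like the following could list zero-length regular files in a directory. The helper name `find_empty_files` is hypothetical, and an empty file is not necessarily a damaged one, so the output is only a starting point for inspection.

```python
import os
import stat


def find_empty_files(directory):
    """Return sorted paths of zero-length regular files directly in directory."""
    empty = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Look at the entry itself rather than a symlink's target,
            # so symlinked libraries are not reported.
            st = entry.stat(follow_symlinks=False)
            if stat.S_ISREG(st.st_mode) and st.st_size == 0:
                empty.append(entry.path)
    return sorted(empty)
```

Calling `find_empty_files("/lib64")` would return the candidates to inspect in the directory mentioned above.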

Comment 6 Mirek Svoboda 2019-09-26 20:49:27 UTC
Judging by https://github.com/rpm-software-management/rpm/blob/rpm-4.14.2.1-release/lib/transaction.c, the fix did not make it into 4.14.2.1.
The commit implementing the fix, https://github.com/rpm-software-management/rpm/commit/eef82b0e81c4aba4069cbf273b2c14005c9b2331, is part of 4.15.
As a workaround, with all its performance consequences, it is possible to use the _flush_io macro, introduced here: https://github.com/rpm-software-management/rpm/commit/4087530f0fcbb14167be8296957e44e6ffc97579
It was cherry-picked to 4.14 via commit https://github.com/rpm-software-management/rpm/commit/4afad76535ad62ab009b1bec6e7bf714edd6611a
It comes as a surprise to me that the general "sync after transaction" fix has not been backported, while the very special _flush_io macro has been.

As I am not familiar with the RPM package manager development process and release cycle, could you please advise whether it makes any sense to try to backport this fix to RPM version 4.14.2?

Comment 7 Panu Matilainen 2019-09-27 08:49:28 UTC
The feature was (perhaps surprisingly) considered quite controversial and disruptive in the upstream community, and thus there are no plans to backport it to older releases. It's in 4.15 now, so Fedora >= 31 has it.
The major difference from _flush_io is that the latter is opt-in, whereas sync after transaction affects every user.
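
The opt-in versus always-on distinction can be sketched in Python. This is an illustration only, not rpm's implementation: `write_files` and its `flush_each` and `sync_at_end` parameters are hypothetical knobs, with `flush_each` standing in loosely for an opt-in per-file flush and `sync_at_end` for a single sync once everything is written.

```python
import os
import tempfile


def write_files(target_dir, files, flush_each=False, sync_at_end=False):
    """Write files; optionally fsync each one and/or sync once at the end.

    flush_each and sync_at_end are illustrative knobs, not rpm options.
    Returns the number of per-file fsync calls issued.
    """
    fsync_calls = 0
    for name, data in files.items():
        with open(os.path.join(target_dir, name), "wb") as fh:
            fh.write(data)
            if flush_each:
                fh.flush()             # drain Python's userspace buffer
                os.fsync(fh.fileno())  # ask the kernel to persist this file
                fsync_calls += 1
    if sync_at_end:
        os.sync()  # one system-wide flush after all files are written
    return fsync_calls
```

With `flush_each=True` the cost grows with the number of files and is paid only by users who ask for it; with `sync_at_end=True` there is one call per run, but every run pays for it.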