Bug 1971186 - use fstrim at conclusion of installations
Summary: use fstrim at conclusion of installations
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: lorax
Version: 35
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Brian Lane
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1972376
Reported: 2021-06-12 19:50 UTC by Chris Murphy
Modified: 2022-05-19 15:49 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-18 23:07:31 UTC
Type: Bug
Embargoed:



Description Chris Murphy 2021-06-12 19:50:12 UTC
Description of problem:

Files that were needed only during the installation (including downloaded RPMs) are deleted afterward, but their blocks remain allocated on the backing media. This is a problem in particular for images. Consider running fstrim on all /mnt/sysimage file systems at the conclusion of the installation.

Version-Release number of selected component (if applicable):
anaconda-35.16-2.fc35

How reproducible:
Always, at least for images


Steps to Reproduce:

# ls -ls
total 890660
699896 -rw-r--r--. 1 root root 5368709120 Jun 12 13:08 Fedora-Cloud-Base-Rawhide-20210605.n.0.aarch64.raw
# fstrim -v /mnt
/mnt: 593 MiB (621817856 bytes) trimmed
# ls -ls
total 890656
699892 -rw-r--r--. 1 root root 5368709120 Jun 12 13:08 Fedora-Cloud-Base-Rawhide-20210605.n.0.aarch64.raw
# umount /mnt
# mount /dev/mapper/loop0p2 /mnt
# fstrim -v /mnt
/mnt: 3.6 GiB (3906367488 bytes) trimmed
# ls -ls
total 870316
679552 -rw-r--r--. 1 root root 5368709120 Jun 12 13:08 Fedora-Cloud-Base-Rawhide-20210605.n.0.aarch64.raw
# 

Actual results:

~20 MiB of garbage removed when doing fstrim.
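
The ~20 MiB figure follows from the allocated-block column of the `ls -ls` output above. A minimal sketch of the arithmetic, assuming the first column is in 1 KiB units (the GNU ls default); the function name is hypothetical:

```python
KIB = 1024
MIB = 1024 * 1024


def allocated_delta_mib(blocks_before, blocks_after, block_size=KIB):
    """Convert a drop in the `ls -s` block count into MiB.

    block_size is an assumption: GNU ls reports 1 KiB blocks by default.
    """
    return (blocks_before - blocks_after) * block_size / MIB


# Block counts copied from the transcript: before any fstrim, and after
# trimming both mounted file systems.
saved_mib = allocated_delta_mib(699896, 679552)
print("{:.1f} MiB released".format(saved_mib))
```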

Expected results:

Images should be as small as possible prior to compression.

Additional info:

Comment 1 Chris Murphy 2021-06-19 22:04:28 UTC
A more extreme example. Start with this:

325M -rw-r--r--. 1 root root 325M Jun 19 15:26 Fedora-Cloud-Base-Rawhide-20210619.n.0.x86_64.raw.xz

After unxz:
4.1G -rw-r--r--. 1 root root 5.0G Jun 19 15:26 Fedora-Cloud-Base-Rawhide-20210619.n.0.x86_64.raw

Image after losetup->kpartx->mount p2 (btrfs)->fstrim btrfs
412M -rw-r--r--. 1 root root 5.0G Jun 19 15:46 Fedora-Cloud-Base-Rawhide-20210619.n.0.x86_64.raw

Is it appropriate to put this in kickstart as a %post script?
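
The losetup, kpartx, mount, fstrim sequence above can be written out as a command list. This is only a sketch and not the Fedora tooling: the function name, the default loop device /dev/loop0 and the /mnt mount point are assumptions for illustration, and the function only builds the commands without running them.

```python
import os


def trim_image_commands(image, loop_dev="/dev/loop0", part=2, mountpoint="/mnt"):
    """Return argv lists that trim one partition of a raw disk image.

    Nothing is executed here. loop_dev is an assumption; in practice the
    free loop device would be discovered with `losetup --find`.
    """
    # kpartx names its mappings /dev/mapper/<loopname>p<N>, e.g. loop0p2.
    mapped = "/dev/mapper/{}p{}".format(os.path.basename(loop_dev), part)
    return [
        ["losetup", loop_dev, image],    # attach the image to the loop device
        ["kpartx", "-a", "-v", loop_dev],  # create partition mappings
        ["mount", mapped, mountpoint],
        ["fstrim", "-v", mountpoint],    # discard unused blocks
        ["umount", mountpoint],
        ["kpartx", "-d", loop_dev],      # remove partition mappings
        ["losetup", "-d", loop_dev],     # detach the loop device
    ]


commands = trim_image_commands("Fedora-Cloud-Base.raw")
for argv in commands:
    print(" ".join(argv))
```

Running the printed commands requires root; each could be passed to `subprocess.check_call` one at a time.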

Comment 2 Chris Murphy 2021-06-19 22:38:50 UTC
Set to block "Make btrfs the default file system for Fedora Cloud" just for tracking. The issue affects other file systems and images too, so fixing this would be general purpose.

Comment 3 Chris Murphy 2021-06-19 23:34:17 UTC
https://kojipkgs.fedoraproject.org//packages/Fedora-Cloud-Base/Rawhide/20210619.n.0/data/logs/image/oz-x86_64.log

>necho "Zeroing out empty space."\n# This forces the filesystem to reclaim space from deleted files\ndd bs=1M if=/dev/zero of=/var/tmp/zeros || :\nrm -f /var/tmp/zeros\necho "(Don\'t worry -- that out-of-space error was expected.)"\

Is this obsolete now? I think we're better off replacing it with fstrim instead. Also, this dd command must exclude baremetal installs or they'd take forever zeroing out the media.

Comment 4 Chris Murphy 2021-06-23 01:48:33 UTC
lorax/src/pylorax/installer.py:433:                    # For image installs, run fstrim to discard unused blocks. This way
lorax/src/pylorax/creator.py:558:    rc = execWithRedirect("/usr/sbin/fsck.ext4", ["-y", "-f", "-E", "discard", rootfs_img])

Comment 5 Chris Murphy 2021-06-27 03:04:29 UTC
See also:

creating smaller cloud images
https://pagure.io/cloud-sig/issue/335

QCOW images in recent Fedora-Cloud-Base-Vagrant libvirt boxes for Rawhide are not sparse 
https://pagure.io/cloud-sig/issue/340

Comment 6 Neal Gompa 2021-07-15 08:25:20 UTC
(In reply to Chris Murphy from comment #3)
> https://kojipkgs.fedoraproject.org//packages/Fedora-Cloud-Base/Rawhide/
> 20210619.n.0/data/logs/image/oz-x86_64.log
> 
> >necho "Zeroing out empty space."\n# This forces the filesystem to reclaim space from deleted files\ndd bs=1M if=/dev/zero of=/var/tmp/zeros || :\nrm -f /var/tmp/zeros\necho "(Don\'t worry -- that out-of-space error was expected.)"\
> 
> Is this obsolete now? I think we're better off replacing it with fstrim
> instead. Also, this dd command must exclude baremetal installs or they'd
> take forever zeroing out the media.

This is *definitely* not obsolete. Removing this caused us to go from ~300MB to ~900MB. I'm putting it back and adding a sync in https://pagure.io/fedora-kickstarts/pull-request/824

Comment 7 Chris Murphy 2021-07-16 03:14:33 UTC
>This is *definitely* not obsolete. Removing this caused us to go from ~300MB to ~900MB. I'm putting it back and adding a sync in https://pagure.io/fedora-kickstarts/pull-request/824

It's a mirage, as I explained here: https://pagure.io/cloud-sig/issue/340#comment-743548

It's pointless to write zeros, delete them *and* do fstrim. Pick one. The first one will be a fully allocated image that compresses rather well. The second will be smaller and take much less time to create. And doing both gets you the same size results as the fstrim only option, but with massive write amplification and disk contention for no benefit.
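
To illustrate why a zero-filled image "compresses rather well", here is a small sketch comparing how a zeroed region and a region of leftover data compress. It is not a measurement of a real image: the 1 MiB size is arbitrary, and random bytes are an assumed stand-in for stale deleted data such as already-compressed RPM payloads.

```python
import lzma
import os

SIZE = 1 << 20  # 1 MiB stand-in for a region of free space

zeroed = bytes(SIZE)      # free space overwritten by the dd-from-/dev/zero method
stale = os.urandom(SIZE)  # assumption: leftover deleted data is incompressible

zeroed_xz = len(lzma.compress(zeroed))
stale_xz = len(lzma.compress(stale))

print("zeroed region compresses to", zeroed_xz, "bytes")
print("stale region compresses to", stale_xz, "bytes")
```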

Comment 8 Neal Gompa 2021-07-16 12:51:00 UTC
Fine, I changed to do *just* fstrim and added a sync right after: https://pagure.io/fedora-kickstarts/pull-request/826

Let's see how that goes...

Comment 9 Neal Gompa 2021-07-16 13:48:58 UTC
It does not work: https://koji.fedoraproject.org/koji/taskinfo?taskID=72006902

The result is 840MB!

So I'll switch to the other way, and let's see how that goes...

Comment 10 Neal Gompa 2021-07-17 01:59:27 UTC
It works with the zero method (with no fstrim): https://koji.fedoraproject.org/koji/taskinfo?taskID=72011344

The result is 279MB.

Comment 11 Chris Murphy 2021-07-17 04:12:18 UTC
https://pagure.io/fedora-kickstarts/pull-request/826#request_diff
Sorry for the lack of clarity. Running fstrim before sync won't work: the sync is what commits the file deletion to disk, and only once the deletion is committed can fstrim do the correct thing.
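
A minimal sketch of that ordering, for illustration only. The function names and the injectable `runner` parameter are hypothetical and are not part of the kickstart in the pull request:

```python
import subprocess


def cleanup_commands(mountpoint):
    """Return the commands in the required order: sync first, then fstrim."""
    return [
        ["sync"],                      # commit pending deletions to disk
        ["fstrim", "-v", mountpoint],  # then discard the freed blocks
    ]


def run_cleanup(mountpoint, runner=subprocess.check_call):
    """Run each command in order; runner is injectable so it can be dry-run."""
    for argv in cleanup_commands(mountpoint):
        runner(argv)


# Dry run: record the commands instead of executing them.
recorded = []
run_cleanup("/", runner=recorded.append)
print(recorded)
```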

Comment 12 Ben Cotton 2021-08-10 13:07:43 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 35 development cycle.
Changing version to 35.

Comment 13 Brian Lane 2021-09-24 23:27:30 UTC
I finally got some time to look into this. First off, the koji createImage task appears to be using Oz so none of what I'm about to type applies :)

I *did* manage to see a slight improvement in size by adding fstrim to livemedia-creator. PR is here:
https://github.com/weldr/lorax/pull/1172


I've added output of the image size before and after fstrim and fallocate --dig-holes, and turned on verbose output for fstrim and fallocate.
With partitioned disk, filesystem image, and live iso using the fedora-minimal.ks from lorax I see the image file that lmc creates shrink:

disk - 237MiB smaller
filesystem - 133MiB smaller
minimal iso (ext4 install.img) - 158MiB smaller
live installer iso - 751MiB smaller

This is measured using du -B1 on the image file before and after running fstrim+fallocate. You can now see these numbers in program.log when running livemedia-creator.
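
The before/after measurement can be sketched as follows. This is not the code from the lorax PR: the helper names are hypothetical, `st_blocks * 512` is used as an approximation of what `du -B1` reports on Linux, and the demo replaces fstrim and fallocate --dig-holes with a truncate-and-extend step so that it runs without root.

```python
import os
import tempfile


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks is in 512-byte units on Linux)."""
    return os.stat(path).st_blocks * 512


def shrink_report(path, step):
    """Measure allocation before and after step(path); return (before, after, saved)."""
    before = allocated_bytes(path)
    step(path)
    after = allocated_bytes(path)
    return before, after, before - after


def demo(size=1 << 20):
    """Write data to a temp file, then turn it into a hole of the same length."""
    def make_sparse(path):
        os.truncate(path, 0)     # drop the data blocks
        os.truncate(path, size)  # restore the apparent size as a hole

    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size))
        tmp.flush()
        os.fsync(tmp.fileno())
        path = tmp.name
    try:
        return shrink_report(path, make_sparse)
    finally:
        os.unlink(path)


before, after, saved = demo()
print("before={} after={} saved={}".format(before, after, saved))
```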

Comment 14 Chris Murphy 2022-05-19 15:49:26 UTC
Looks like I never opened a releng issue to make sure the VMs have discard="unmap" set so that fstrim is effective.
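
For reference, a small sketch of how one might check a libvirt domain definition for this setting. The sample XML and the function name are hypothetical; the only assumption about libvirt is that the option is the `discard` attribute on a disk's `<driver>` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML for illustration; not taken from any real builder.
SAMPLE_DOMAIN = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' discard='unmap'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disks_without_unmap(domain_xml):
    """Return target names of disks whose driver lacks discard='unmap'."""
    root = ET.fromstring(domain_xml)
    missing = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        name = target.get("dev") if target is not None else "?"
        if driver is None or driver.get("discard") != "unmap":
            missing.append(name)
    return missing


missing_disks = disks_without_unmap(SAMPLE_DOMAIN)
print(missing_disks)
```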

Comment 15 Chris Murphy 2022-05-19 15:49:36 UTC
https://pagure.io/releng/issue/10801

