Bug 722987

Summary: as of F15, delta compression of the non-RPM content on install images is (usually) negligible
Product: [Fedora] Fedora Reporter: Andre Robatino <robatino>
Component: squashfs-tools Assignee: Bruno Wolff III <bruno>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: bruno, dennis, jdieter, srevivo, wwoods
Target Milestone: --- Keywords: FutureFeature
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Last Closed: 2019-05-21 10:08:10 UTC Type: ---

Description Andre Robatino 2011-07-18 17:11:49 UTC
Description of problem:
During F13 and F14 development, the minimum deltaiso size was about 60-70 MiB. During F15 development, it increased to about 200 MiB, roughly the same as a netinst image. I guessed that this was because makedeltaiso can no longer significantly compress the non-RPM content of the ISO, and delta-compressing between successive netinst images seems to confirm it; see

http://lists.fedoraproject.org/pipermail/test/2011-May/100449.html

Note that during the F14 cycle, one could compress by roughly a factor of 3 (which is about the same ratio as between 60-70 MiB and 200 MiB), but for F15 it is negligible. I'm guessing this is due to the change in the structure of the install images referred to in bug 646843 ("images/install.img will no longer exist in F-15 and newer").

Note that I don't actually know if this can be considered a bug - it may be unavoidable given the way the new install images are built. So the first question is, what method does makedeltaiso use to compress non-RPM content?

Version-Release number of selected component (if applicable):
deltaiso-3.6-0.6.20110223git.fc16

How reproducible:
always

Steps to Reproduce:
1. Use makedeltaiso between successive TC/RC netinst images during a development cycle (which BTW can be rebuilt using the archived deltaisos at http://alt.fedoraproject.org/pub/alt/stage/deltaisos/archive/ ). Note that this requires first rebuilding the DVDs, then going from each DVD to the corresponding netinst. Fortunately I have local copies of the disos, so I don't have to download them.
  
Actual results:
Negligible compression for F15 and later.

Expected results:
Good compression (factor of 3 or so) as in F14 and earlier.

Comment 1 Andre Robatino 2011-07-18 17:12:50 UTC
Sorry about that. Changing to the intended Component (deltarpm).

Comment 2 Andre Robatino 2011-07-21 23:33:18 UTC
Made delta ISOs for the test images in http://dl.fedoraproject.org/pub/alt/stage/ for 20110714/ -> 20110721/, 20110721/ -> 20110721-1/, and 20110721-1/ -> 20110721-3/. The delta for 20110714/ -> 20110721/ was approximately the same size as the corresponding netinst, as expected, but the deltas for the later jumps were much smaller. Any ideas why this would be the case? (I'm saving the deltas and the 20110714/ ISOs so these images can all be rebuilt for testing when needed.)

Comment 3 Andre Robatino 2011-07-22 01:21:37 UTC
Compression between successive pairs of netinsts (% of full size):

Fedora-16-test-20110714_20110721-i386-netinst.diso: 87.9%
Fedora-16-test-20110714_20110721-x86_64-netinst.diso: 89.5%

Fedora-16-test-20110721-0_1-i386-netinst.diso: 26.3%
Fedora-16-test-20110721-0_1-x86_64-netinst.diso: 9.5%

Fedora-16-test-20110721-1_3-i386-netinst.diso: 26.0%
Fedora-16-test-20110721-1_3-x86_64-netinst.diso: 9.8%

Comment 4 Andre Robatino 2011-07-22 03:20:51 UTC
Looking at the RPMs in each of the i386 ISOs (the ones containing RPMs, not the netinsts), there are RPM changes between  20110714/ and 20110721/, but none between 20110721/, 20110721-1/, and 20110721-3/. Maybe a change in at least one RPM is associated with some change in the corresponding netinst which prevents delta compression from working well?

Comment 5 James Laska 2011-07-22 11:14:59 UTC
(In reply to comment #4)
> Looking at the RPMs in each of the i386 ISOs (the ones containing RPMs, not the
> netinsts), there are RPM changes between  20110714/ and 20110721/, but none
> between 20110721/, 20110721-1/, and 20110721-3/. Maybe a change in at least one
> RPM is associated with some change in the corresponding netinst which prevents
> delta compression from working well?

The initial ramdisk in the images does change a fair amount too, since it's generated at compose-time (and not packaged).  Perhaps this accounts for the differences you are seeing, even when no packages are changed.

Comment 6 Andre Robatino 2011-07-22 15:09:07 UTC
This issue appears to be with delta compression generally, not just with that done by makedeltaiso. Using xdelta on RHEL6, I get the following ratios for the netinsts, which are similar to those in comment 3:

20110714->20110721 i386: 87.8%
20110714->20110721 x86_64: 89.1%

20110721->20110721-1 i386: 25.6%
20110721->20110721-1 x86_64: 9.1%

20110721-1->20110721-3 i386: 25.5%
20110721-1->20110721-3 x86_64: 9.1%

I also checked how well rsync did on the same images and never got a speedup of more than 1.00, so there are basically no matching blocks between different netinsts, even closely related ones. So lack of matching blocks is not the issue, since they didn't exist before either.
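
To illustrate what "matching blocks" means here, the following minimal Python sketch counts how much of one byte string is covered by fixed-size blocks taken from another, roughly in the spirit of rsync's block matching. It is only an illustration: `matching_fraction`, `sample_text`, the 64-byte block size, and the toy data are assumptions made up for this example, not anything rsync, xdelta, or makedeltaiso actually uses.

```python
import lzma
import random


def matching_fraction(old: bytes, new: bytes, block: int = 64) -> float:
    """Fraction of `new` covered by block-sized chunks that also occur in `old`."""
    # Index `old` at block-aligned offsets, much as an rsync receiver would.
    known = {old[i:i + block] for i in range(0, len(old) - block + 1, block)}
    matched = 0
    pos = 0
    # Slide over `new` one byte at a time, jumping a whole block on a hit.
    while pos + block <= len(new):
        if new[pos:pos + block] in known:
            matched += block
            pos += block
        else:
            pos += 1
    return matched / len(new) if new else 0.0


def sample_text(n_words: int, seed: int = 0) -> bytes:
    """Deterministic, compressible pseudo-text (hypothetical stand-in for image content)."""
    rng = random.Random(seed)
    vocab = ["kernel", "initrd", "anaconda", "squashfs", "image", "delta", "rpm", "iso"]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode()


old = sample_text(4000)
new = b"!" + old  # a one-byte change at the very start

raw = matching_fraction(old, new)
packed = matching_fraction(lzma.compress(old), lzma.compress(new))
print(f"raw: {raw:.1%}  xz-compressed: {packed:.1%}")
```

On the uncompressed toy data almost every block is found again despite the one-byte shift, while the xz-compressed versions of the same data should share essentially no blocks. Real install images are far more complex, so this only shows the general effect.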

I'm going to change the Component to "distribution" and hope that someone who understands install images has an idea which Component to use (or if this is a bug at all, though it would be really unfortunate if install images couldn't be made so delta compression worked as well as it used to).

Comment 7 Bill Nottingham 2011-07-22 15:23:10 UTC
I'd suspect squashfs. In any case, CC'ing Will who has worked in this area.

Comment 8 Will Woods 2011-07-22 18:10:33 UTC
The netinst images contain, basically:
- kernel & initrd (~25MB)
- anaconda runtime image (~100-125MB)
- [x86 only] EFI boot images (~25MB)
- assorted metadata, config files, READMEs, GPG keys, etc. (negligible)

In F14 and earlier, the runtime image (install.img) was a squashfs image. I'm guessing these must delta fairly well (although I don't see any actual stats here).

In F15, the runtime image contents were placed in the initrd, which is an xz-compressed cpio archive. And I guess that doesn't delta well.

In current rawhide, the initrd contains a LiveOS-style image (ext4 filesystem image inside xz-compressed squashfs). I'm further guessing that squashfs-with-one-big-file doesn't delta as well as a normal squashfs'd filesystem. 

So I don't think rsync or deltaiso is likely to be very good at generating small deltas for files compressed in this way.
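
The effect guessed at above can be sketched in Python. This is a rough illustration under stated assumptions, not how makedeltaiso or xdelta work internally: `delta_cost` is a hypothetical proxy that estimates a delta's size as the extra xz output needed for `new` once `old` has already been seen, and the word list is invented toy data standing in for a filesystem tree.

```python
import lzma
import random


def delta_cost(old: bytes, new: bytes) -> int:
    """Rough delta-size proxy: extra compressed bytes for `new` given `old`."""
    return len(lzma.compress(old + new)) - len(lzma.compress(old))


rng = random.Random(1)
vocab = [b"kernel", b"initrd", b"anaconda", b"squashfs", b"ext4", b"cpio", b"xz", b"efi"]
words = [rng.choice(vocab) for _ in range(5000)]
old_tree = b" ".join(words)
# One small edit at the very start, so the two compressed streams diverge
# immediately; a later edit can leave them sharing a prefix up to that point.
words[0] = b"CHANGED"
new_tree = b" ".join(words)

old_xz = lzma.compress(old_tree)
new_xz = lzma.compress(new_tree)

cost_raw = delta_cost(old_tree, new_tree)  # delta taken before compression
cost_xz = delta_cost(old_xz, new_xz)       # delta taken after compression

print(f"full xz image:                     {len(new_xz)} bytes")
print(f"delta proxy, uncompressed inputs:  {cost_raw} bytes")
print(f"delta proxy, xz-compressed inputs: {cost_xz} bytes")
```

With the uncompressed inputs the proxy should come out as a small fraction of the compressed image, while with the xz-compressed inputs it should be close to the full compressed size. That matches the "roughly the same as a netinst image" observation in the description.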

We're working to get anaconda to run from a simple squashfs image as it did before, but this probably won't be finished for F16.

Comment 9 Andre Robatino 2011-07-22 18:29:36 UTC
You can see the delta ISO size for successive TC/RC netinsts for F14 and F15 development as a percentage of full ISO size at

http://lists.fedoraproject.org/pipermail/test/2011-May/100449.html

Chances are the numbers using xdelta would be similar. Anyway, good to know there's a chance the deltas will become smaller again eventually. Thanks for the info.

Comment 10 Andre Robatino 2011-08-11 20:46:45 UTC
FYI, delta compression of non-RPM content is doing much better in 16 so far (ignoring the one from 15 to 16-Alpha.TC1). Here are the latest numbers to compare with those from the previous link.

Fedora-15_16-Alpha.TC1-x86_64-netinst.diso: 265652346 (98.6%)
Fedora-16-Alpha.TC1_RC1-x86_64-netinst.diso: 161720186 (60.0%)
Fedora-16-Alpha.RC1_RC2-x86_64-netinst.diso: 125467358 (46.7%)
Fedora-16-Alpha.RC2_RC3-x86_64-netinst.diso: 54251914 (20.3%)
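
For anyone who wants to work with these numbers, here is a small Python sketch that parses lines in the format above and derives the implied full netinst size. It assumes the percentage is the diso size divided by the full ISO size (as described in comment 9); `implied_full_size` and the `LINE` pattern are names made up for this example. Because the percentages are rounded to one decimal place, the implied sizes are approximate.

```python
import re

# Matches lines such as "name.diso: 54251914 (20.3%)".
LINE = re.compile(r"^(?P<name>\S+\.diso): (?P<size>\d+) \((?P<pct>\d+(?:\.\d+)?)%\)$")


def implied_full_size(line):
    """Return (name, diso_bytes, implied_full_iso_bytes) for one listing line."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    diso_bytes = int(m.group("size"))
    pct = float(m.group("pct"))
    return m.group("name"), diso_bytes, diso_bytes / (pct / 100.0)


listing = """\
Fedora-15_16-Alpha.TC1-x86_64-netinst.diso: 265652346 (98.6%)
Fedora-16-Alpha.TC1_RC1-x86_64-netinst.diso: 161720186 (60.0%)
Fedora-16-Alpha.RC1_RC2-x86_64-netinst.diso: 125467358 (46.7%)
Fedora-16-Alpha.RC2_RC3-x86_64-netinst.diso: 54251914 (20.3%)
"""

for line in listing.splitlines():
    name, diso, full = implied_full_size(line)
    print(f"{name}: diso {diso / 2**20:.1f} MiB, implied full ISO {full / 2**20:.1f} MiB")
```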

Comment 11 Andre Robatino 2011-08-17 06:25:38 UTC
I have a complete list of all the netinst->netinst disos (both i386 and x86_64) from 14-Alpha.TC1 onwards at http://robatino.fedorapeople.org/diso_netinst.txt which I will keep updated.

Comment 12 Fedora Admin XMLRPC Client 2011-12-07 19:42:34 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 13 Bruno Wolff III 2011-12-07 19:50:17 UTC
I doubt this issue is directly due to squashfs-tools. There might be something with supporting xz compression indirectly causing an issue.

Comment 14 Andre Robatino 2011-12-07 19:58:12 UTC
I'm not even sure what the latest status of this is. If you look at the latest entries in the link in comment 11, they are much smaller, but that might just be because the changes were small - I don't know if it's a permanent improvement.

Comment 15 Andre Robatino 2012-03-16 17:06:11 UTC
What is the status of this? The delta compression was better during the 16 cycle, but seems worse during 17 (see the updated list at http://robatino.fedorapeople.org/diso_netinst.txt ).

Comment 16 Bruno Wolff III 2012-11-19 14:54:34 UTC
I doubt this is directly a squashfs issue. Getting the full iso sizes down is more important (IMO) than getting the delta sizes down. It might be that the way the deltas are done for the ISOs could be adjusted to get better compression.

Comment 17 Andre Robatino 2012-11-19 15:18:50 UTC
(In reply to comment #16)
> I doubt this is directly a squashfs issue. Getting the full iso sizes down
> is more important (IMO) than getting the delta sizes down.

Maybe, but there isn't necessarily a tradeoff. xz compression was introduced in F12, the delta compression got worse in F15, and AFAIK there was no significant reduction in ISO size at that time.

> It might be that the
> way the deltas are done for the ISOs could be adjusted to get better
> compression.

Possibly. wwoods, can you provide an update to comment 8 regarding the current status of anaconda?

Comment 18 Bruno Wolff III 2012-11-19 15:24:38 UTC
Squashfs did not have upstream support for lzma/xz until the F15 development time frame. There was a significant (around 10%) improvement in compression for F15 live images, though a good chunk of that was used to balance out increased bloat. At least one image would have needed to cut out packages to make size without this feature.
See: http://fedoraproject.org/wiki/Features/LZMA_for_Live_Images

Comment 19 Andre Robatino 2012-11-19 15:41:55 UTC
OK, but this bug is specifically about install images, although it does raise the question of delta compression for live images. It would be nice if something like rsync/zsync were useful on live images, so testers could download them faster. I know that currently that's not true. kparal had a blog post on the subject

https://kparal.wordpress.com/2009/09/01/zsync-transfer-large-files-efficiently/

where he compares Ubuntu and Fedora in this respect. Apparently Ubuntu image downloads do speed up well with rsync/zsync. I don't know if Fedora's live images are superior in some other way and poor delta compression is an unavoidable tradeoff of that.

Comment 20 Bruno Wolff III 2019-05-21 08:32:16 UTC
Is this still an issue? I really doubt the issue is squashfs-tools itself. It may be how it is being used in the process of creating images, but then the bug would be in whatever is doing that.

Comment 21 Andre Robatino 2019-05-21 09:06:53 UTC
I don't know if this still happens; I haven't made delta ISOs since the install DVDs went away. Feel free to close this if you want.

Comment 22 Bruno Wolff III 2019-05-21 10:08:10 UTC
OK. I've closed this.