Bug 864516 - use xz compression by default
Summary: use xz compression by default
Keywords:
Status: CLOSED DUPLICATE of bug 700020
Alias: None
Product: Fedora
Classification: Fedora
Component: createrepo
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Packaging Toolset Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-10-09 13:37 UTC by Kamil Páral
Modified: 2014-01-21 23:24 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-10-10 08:53:41 UTC
Type: Bug
Embargoed:



Description Kamil Páral 2012-10-09 13:37:34 UTC
Description of problem:
Many Fedora users complain that yum is very slow. A large part of this perceived "slowness" is the time needed to refresh repository metadata. To improve this, we should make the metadata as small as possible.

createrepo can already create repository metadata using xz compression, but it is not the default. I have made some measurements. These are the sizes of the Fedora 17 master repository metadata files:

> 1.9M	./orig/42ec1be1745e71753d57892fefff57fc16d98541662637e1a8f2c0b2f18870f6-comps-f17.xml
> 8.4M	./orig/53f38b1595ef2c93061264cbda1c6015873425768fbb51f84a306b1704409d51-other.xml.gz
> 508K	./orig/63d1bddad9470b822b2a9873cb9a047bf4f2a114222d2d374e10860601b9fc6d-prestodelta.xml.gz
> 9.2M	./orig/7009de56f1a1c399930fa72094a310a40d38153c96d0b5af443914d3d6a7d811-primary.xml.gz
> 7.8M	./orig/a65f7d7fe900ba34acc7cc7c45e728b6c9af5ce5cf0562a0ecc8fafd56d33ef9-other.sqlite.bz2
> 21M	./orig/a9c70025ee9577048b4aea69e1d10cede198546ab342bd508edb6e995d5908d4-filelists.xml.gz
> 436K	./orig/cd6b943c066d5eae4c407ca104128f3dc46ebeb5017f65a47709f299871de21d-comps-f17.xml.gz
> 22M	./orig/ddcb2f6c2ba6ca8e6f47f4a7df96bb920318bdf9991466a12923da38c04966bf-filelists.sqlite.bz2
> 15M	./orig/eda1f9b2d7da63ef28865a5d3d3c9ec8de8f10f8c101f07fea4fb8835c94c514-primary.sqlite.bz2
> 8.0K	./orig/repomd.xml
> 85M	./orig

I recompressed all gz/bz2 files with "xz --best". Here are the results:

> 1.9M	./new/42ec1be1745e71753d57892fefff57fc16d98541662637e1a8f2c0b2f18870f6-comps-f17.xml
> 2.9M	./new/53f38b1595ef2c93061264cbda1c6015873425768fbb51f84a306b1704409d51-other.xml.xz
> 376K	./new/63d1bddad9470b822b2a9873cb9a047bf4f2a114222d2d374e10860601b9fc6d-prestodelta.xml.xz
> 5.3M	./new/7009de56f1a1c399930fa72094a310a40d38153c96d0b5af443914d3d6a7d811-primary.xml.xz
> 5.0M	./new/a65f7d7fe900ba34acc7cc7c45e728b6c9af5ce5cf0562a0ecc8fafd56d33ef9-other.sqlite.xz
> 15M	./new/a9c70025ee9577048b4aea69e1d10cede198546ab342bd508edb6e995d5908d4-filelists.xml.xz
> 268K	./new/cd6b943c066d5eae4c407ca104128f3dc46ebeb5017f65a47709f299871de21d-comps-f17.xml.xz
> 17M	./new/ddcb2f6c2ba6ca8e6f47f4a7df96bb920318bdf9991466a12923da38c04966bf-filelists.sqlite.xz
> 11M	./new/eda1f9b2d7da63ef28865a5d3d3c9ec8de8f10f8c101f07fea4fb8835c94c514-primary.sqlite.xz
> 8.0K	./new/repomd.xml
> 58M	./new
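
For anyone who wants to repeat the measurement without shelling out to xz, the recompression step could be sketched roughly as follows. This is only an illustration, not the procedure used for the numbers above: the helper name recompress_to_xz and the directory layout are hypothetical, and it assumes Python's standard lzma module with preset 9, which corresponds to the level that "xz --best" selects.

```python
import bz2
import gzip
import lzma
import shutil
from pathlib import Path

# Map each supported source suffix to the stdlib opener that can read it.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def recompress_to_xz(src: Path, dest_dir: Path) -> Path:
    """Decompress a .gz/.bz2 file and write the same payload as .xz.

    Hypothetical helper for illustration only.
    """
    opener = OPENERS[src.suffix]          # raises KeyError for other suffixes
    dest = dest_dir / (src.stem + ".xz")  # e.g. primary.xml.gz -> primary.xml.xz
    # preset=9 is the highest standard preset, the level "xz --best" maps to.
    with opener(src, "rb") as fin, lzma.open(dest, "wb", preset=9) as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Calling it for every *.gz and *.bz2 file under ./orig with ./new as the destination would produce a tree comparable to the listing above. Files that are not compressed, such as the plain comps XML and repomd.xml, would simply be copied.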

The total metadata size went from 85 MB to 58 MB, which is 68% of the original size. We can save 32% of the data just by changing the compression algorithm. That's amazing.
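
The percentages can be checked with a few lines of Python. The two totals are the du figures from the listings above; the variable names are only illustrative.

```python
# Directory totals as reported by du in the two listings, in MB.
orig_mb = 85
new_mb = 58

remaining = new_mb / orig_mb  # fraction of the original size still downloaded
saved = 1 - remaining         # fraction of the download that is avoided

print(f"remaining: {remaining:.0%}, saved: {saved:.0%}")
# prints "remaining: 68%, saved: 32%"
```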

I propose changing the default compression algorithm in createrepo from gz/bz2 to xz (using the --best setting).

I talked to Zdenek Pavlas (CC'd), a yum developer, and he said yum has supported .xz files since 2010, so there should be no problems switching to it.

There is one drawback: xz compression is much slower than gz/bz2. It also doesn't support multi-threading at the moment, so multiple cores are not used. Still, repository metadata is created just once on the server, but it is downloaded by thousands of users afterwards. It makes sense to "waste" a few more minutes on the server to decrease the bandwidth and waiting time for all Fedora users out there.
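
To get a rough feel for the speed trade-off on a particular machine, something like the following sketch could be used. It is not a benchmark of createrepo itself: the sample data is a synthetic stand-in for real metadata (an assumption), the measure helper is hypothetical, and it uses the single-threaded compressors from the Python standard library, so absolute times will differ from the command-line tools.

```python
import bz2
import gzip
import lzma
import time


def measure(name, compress, data):
    """Return (name, compressed size in bytes, elapsed seconds)."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(out), elapsed


# Synthetic, repetitive XML-like input; NOT real repository metadata.
sample = b"".join(
    b'<package name="pkg%d" arch="x86_64"><version ver="1.%d"/></package>\n'
    % (i, i % 50)
    for i in range(20000)
)

results = [
    measure("gzip -9", lambda d: gzip.compress(d, compresslevel=9), sample),
    measure("bzip2 -9", lambda d: bz2.compress(d, compresslevel=9), sample),
    measure("xz -9", lambda d: lzma.compress(d, preset=9), sample),
]

for name, size, elapsed in results:
    print(f"{name:9s} {size:8d} bytes  {elapsed:.3f} s")
```

Running it against the real decompressed primary.xml, filelists.xml and other.xml files instead of the synthetic sample would give numbers that are more relevant to this proposal.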

Version-Release number of selected component (if applicable):
createrepo-0.9.9-11.fc17.noarch

Comment 1 Elad Alfassa 2012-10-09 15:31:45 UTC
I'm quite sure this is a duplicate of bug #700020



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 2 Kamil Páral 2012-10-10 08:53:41 UTC
Yes, marking as such.

*** This bug has been marked as a duplicate of bug 700020 ***

