Bug 496242

Summary: deltarpm creation a bit piggish
Product: [Fedora] Fedora
Reporter: Bill Nottingham <notting>
Component: deltarpm
Assignee: Jonathan Dieter <jonathan>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 13
CC: bill-bugzilla.redhat.com, dcantrell, jonathan, kdekorte, Marc.Herbert+rhzilla, rvokal, wtogami
Hardware: All
OS: Linux
Fixed In Version: mash-0.5.20-1.el5
Doc Type: Bug Fix
Last Closed: 2010-10-13 05:58:31 UTC
Bug Blocks: 473302    

Description Bill Nottingham 2009-04-17 14:02:34 UTC
Description of problem:

 3328 masher    18   0 1582m 889m   80 R 97.6 14.8 137:05.69 makedeltarpm    

process is:

  3328 ?        R    135:28      |                       |       \_ /usr/bin/makedeltarpm /pub/fedora/linux/development/x86_64/debug/kernel-debug-debuginfo-2.6.29.1-70.fc11.x86_64.rpm /mnt/koji/mash/rawhide-20090417/development/x86_64/debug/kernel-debug-debuginfo-2.6.29.1-85.fc11.x86_64.rpm

Needless to say, this leads to things like:

Out of memory: Killed process 3817 (makedeltarpm).

Version-Release number of selected component (if applicable):

deltarpm-3.4-15.fc11.x86_64
createrepo-0.9.7-4.fc11.noarch

How reproducible:

Happened once so far, no reason to think it's not repeatable

Steps to Reproduce:
1. Attempt to compose rawhide
  
Actual results:

HULK SMASH SYSTEM

Expected results:

rawhide composes

Additional info:

Marking as F11 blocker as we need to be able to compose one way or another.

Comment 1 Jonathan Dieter 2009-04-17 14:16:28 UTC
Right now, deltarpm requires between N*2 and N*3 memory where N is the uncompressed size of the RPM.  I've been generating the deltarpms on a server with 1GB of RAM, which means I have the same results you did for kdelibs-apidocs, nexuiz-data, etc.

So, either we need between 2GB and 3GB on any server generating deltarpms for Fedora, or we need to fix deltarpm so that it doesn't read the entire old and new rpms into RAM uncompressed.

I know which of the two is easier, but if that's not going to happen, then I'll see what I can do to fix deltarpm ASAP.

Comment 2 Bill Nottingham 2009-04-17 14:30:23 UTC
The problem is that the rawhide process works in parallel; it's possible (if not likely) that it will be generating deltas for kernel-debuginfo for all four architectures at once.
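
As a rough illustration of why parallel composes hurt, the 2N to 3N estimate from comment 1 can be multiplied by the number of concurrent jobs. This is only a back-of-the-envelope sketch: the function name `delta_memory_range` is made up for this example, the 1 GiB uncompressed size is a hypothetical value rather than a measurement of any real package, and it assumes every job handles a package of the same size.

```python
GIB = 1024 ** 3

def delta_memory_range(uncompressed_bytes, jobs=1):
    """Estimate (low, high) memory in bytes for concurrent delta jobs.

    Uses the 2N..3N rule of thumb from comment 1, where N is the
    uncompressed size of the RPM. Assumes all jobs process packages
    of the same size, which is a simplification.
    """
    low = 2 * uncompressed_bytes * jobs
    high = 3 * uncompressed_bytes * jobs
    return low, high

# Hypothetical example: a package that is 1 GiB uncompressed,
# with deltas generated for four architectures at once.
low, high = delta_memory_range(1 * GIB, jobs=4)
print(low // GIB, high // GIB)  # prints: 8 12
```

Under those assumptions, four simultaneous jobs would want somewhere between 8 and 12 GiB, which is far beyond the 1GB server mentioned in comment 1.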

Comment 3 Jonathan Dieter 2009-04-17 15:05:57 UTC
Would it be completely crazy to check whether there's 3*uncompressed_size_of_RPM before generating the deltarpm, and if not, block until there is?
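
A minimal sketch of what such a check could look like, for readers wondering about the idea. Nothing here is deltarpm or createrepo code: `mem_available_bytes`, `wait_for_memory` and `read_proc_meminfo` are hypothetical helpers, and the sketch assumes a Linux kernel recent enough to report a `MemAvailable` line in /proc/meminfo.

```python
import time

def mem_available_bytes(meminfo_text):
    """Parse the MemAvailable value (reported in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found in meminfo text")

def read_proc_meminfo():
    """Read the live /proc/meminfo (Linux only)."""
    with open("/proc/meminfo") as f:
        return f.read()

def wait_for_memory(uncompressed_size, read_meminfo=read_proc_meminfo,
                    factor=3, poll_seconds=30, sleep=time.sleep):
    """Block until factor * uncompressed_size bytes appear to be available.

    Returns the number of bytes that was waited for. `read_meminfo` and
    `sleep` are injectable so the loop can be exercised without touching
    the real system.
    """
    needed = factor * uncompressed_size
    while mem_available_bytes(read_meminfo()) < needed:
        sleep(poll_seconds)
    return needed
```

Note that a check like this is not atomic: several waiting jobs could all see enough free memory at the same moment and then start together, so it would reduce the problem rather than rule it out.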

Comment 4 Bill Nottingham 2009-04-17 15:48:58 UTC
Given the automated nature of the rawhide compose (and the fact that you'd have to wire this check into createrepo somewhere)... yeah.

Comment 5 Jonathan Dieter 2009-04-17 16:09:41 UTC
Ok, let me see what I can do.  It will probably not be done for today (it's fairly late here), and may not be done by tomorrow.

Comment 6 Jonathan Dieter 2009-04-17 17:11:14 UTC
Right now deltarpm reads the whole old rpm and the whole new rpm into RAM so that it can generate the best delta possible.  My proposed fix is to delta N bytes of the old and new rpms at a time, free it, and then delta the next N bytes.  This would mean that we're only using N*3 memory where *we* specify N.

This does mean that the deltas will suffer if a file is moved from the beginning of the cpio archive to the end of the cpio archive in an rpm update, but it will be worth it to actually be able to generate the deltarpms on computers without gobs of RAM.
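
To make the trade-off concrete, here is a toy sketch of the windowed idea. It is not deltarpm's algorithm: instead of building a real delta, it just counts how many fixed-size blocks of the new data can also be found in the old data, first with everything in memory and then one window at a time. The function names, the block size and the window size are all invented for this illustration.

```python
def reusable_blocks(old, new, block):
    """Count block-aligned chunks of `new` that also occur in `old`."""
    seen = {old[i:i + block] for i in range(0, len(old), block)}
    return sum(1 for i in range(0, len(new), block) if new[i:i + block] in seen)

def windowed_reusable_blocks(old, new, window, block):
    """Same count, but only comparing matching windows of old and new.

    Only one window of each input needs to be in memory at a time,
    so memory use is bounded by the window size we choose.
    """
    total = 0
    for start in range(0, max(len(old), len(new)), window):
        total += reusable_blocks(old[start:start + window],
                                 new[start:start + window], block)
    return total

# Two distinct 64-byte "files"; the update swaps their order in the archive.
a = bytes(range(0, 64))
b = bytes(range(64, 128))
old = a + b
new = b + a

print(reusable_blocks(old, new, block=8))                      # prints: 16
print(windowed_reusable_blocks(old, new, window=64, block=8))  # prints: 0
print(windowed_reusable_blocks(old, old, window=64, block=8))  # prints: 16
```

When nothing moves, the windowed version finds everything; when a file jumps from one end of the archive to the other, it finds nothing to reuse, which is the cost described above.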

So I've looked at the code, and it's going to be pretty difficult.  Either I or upstream will be able to fix it, but I'm not convinced we'll be able to have it done right away.  I'd like to give a timeline, but I'm just not sure what it will take.

I'm sorry, I thought I had communicated deltarpm's current memory requirements, but obviously I hadn't.

Comment 7 Bill Nottingham 2009-04-17 18:53:19 UTC
For now, we're going to go with a switch in createrepo's python API that limits the size of RPMs it will do deltas for. This should cover F11. Moving to a target bug for F12.
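
For illustration only, the effect of such a size limit can be sketched as a simple filter. This is not createrepo's actual API: `split_by_size_limit` and its `max_delta_rpm_size` parameter are hypothetical names, and the package names and sizes below are invented.

```python
def split_by_size_limit(rpm_sizes, max_delta_rpm_size):
    """Split packages into those that get a delta and those that are skipped.

    `rpm_sizes` maps a package file name to its size in bytes; packages
    larger than `max_delta_rpm_size` are skipped entirely.
    """
    eligible = sorted(n for n, s in rpm_sizes.items() if s <= max_delta_rpm_size)
    skipped = sorted(n for n, s in rpm_sizes.items() if s > max_delta_rpm_size)
    return eligible, skipped

MIB = 1024 ** 2

# Invented package names and sizes, purely for the example.
sizes = {
    "small-tool.rpm": 2 * MIB,
    "medium-lib.rpm": 40 * MIB,
    "huge-data.rpm": 500 * MIB,
}
eligible, skipped = split_by_size_limit(sizes, max_delta_rpm_size=100 * MIB)
print(eligible)  # prints: ['medium-lib.rpm', 'small-tool.rpm']
print(skipped)   # prints: ['huge-data.rpm']
```

The trade-off is that users never get deltas for the skipped packages, which are exactly the ones that are most expensive to download in full.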

Comment 8 Kevin DeKorte 2009-08-26 15:25:50 UTC
A good rpm to test with is vdrift-data at 471MB compressed. For people like me on slow connections (~1 Mbit), having deltarpms is a big plus for using Fedora. But if it can't work on really large data files, that kinda defeats the purpose.

Comment 9 seth vidal 2009-08-26 15:44:22 UTC
Let's be fair - having a solution that works for 98% of the pkgs we have is a damn good start.

I don't think that just because it is not happy with VERY LARGE pkgs that it 'defeats the purpose'.

So, let's have a bit less of the false dichotomy.

thanks

Comment 10 Bill McGonigle 2009-08-26 17:38:28 UTC
I don't know if anybody ever defined a purpose - certainly there was talk about being able to update Fedora for dial-up users (about 35% of the population where I live).  The technique proposed in Comment #6 could possibly rule out that use case, depending on how the cpio archives are composed.

How many machines are we talking about needing a RAM upgrade to use the code as-is?  8GB of ECC RAM is < $200 these days.  It would be a shame to cut the user base if that number of machines is small.

Forgive my naïveté, but is it not possible to explode to disk in cases where free RAM is insufficient or go one file at a time in RAM?

Comment 11 Jesse Keating 2009-08-26 18:01:23 UTC
I think you misunderstand.  The massive ram requirements are on the host producing the deltas, not necessarily on the client consuming them.

Comment 12 Jonathan Dieter 2009-08-26 18:08:23 UTC
The problem is that different repositories are composed at the same time, so deltarpm is running more than once on the same machine.  Because large packages take longer to create deltas for, the odds are reasonably high that we'll have multiple copies of deltarpm trying to use large amounts of RAM.

There are two possible ways to fix it:
1. Create a buildsystem-like infrastructure dedicated to creating deltarpms
2. Fix deltarpm to use less RAM

(1) isn't going to happen (way, way too much work for way too many people)
(2) is a lot of work for me, and my summer has been busier than expected (just finished helping set up Fedora in the computer room for a sister school in Tyre)

I am going to work on it, but deltarpm's code is going to have to be changed in a number of places, and I still feel like the sorcerer's apprentice whenever I look at it.

Comment 13 Bill McGonigle 2009-08-26 19:41:16 UTC
@Jesse: right, that's why I was asking how many machines need a RAM upgrade to get full deltarpm composes.  But, upon further reflection, that's a very Fedora-centric question.  There are users out there who have their own multi-gigabyte RPMs who may benefit greatly from creating their own deltarpms, especially if they're using spacewalk (when it gets deltarpm support) or their own RPM-based distribution systems.

Comment 14 MarcH 2009-10-08 13:33:42 UTC
(In reply to comment #1)
> or we need to fix deltarpm so that it doesn't read the entire old and
> new rpms into RAM uncompressed.

For naive readers of this bug like me, not entangled in the whole Fedora build process and only looking for a solution to delta their big RPMs, I found a working solution here: http://compsoc.dur.ac.uk/~may/rpmdelta/. I will call this solution "rpmmangle" to avoid any confusion.

This rpmmangle solution basically just combines rpmlib + xdelta. Xdelta seems to have reasonably advanced memory management. Xdelta has gzip support (temporarily uncompressing files) but unfortunately not (anymore?) RPM support. What rpmmangle does is sort of bring RPM support back to xdelta, by temporarily uncompressing RPMs and passing them to xdelta.

Using rpmmangle + xdelta v1.1.3, I can successfully delta a 1 GB RPM with only 700 MB of RAM. The reconstructed and recompressed RPM even matched the original byte for byte (was I lucky?).

Sorry if the experts already involved in this bug entry knew about all this already; I think random readers might like knowing about a working solution.

Comment 15 Bug Zapper 2010-03-15 12:31:39 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 16 Fedora Update System 2010-05-07 17:53:12 UTC
mash-0.5.16-1.el5 has been submitted as an update for Fedora EPEL 5.
http://admin.fedoraproject.org/updates/mash-0.5.16-1.el5

Comment 17 Fedora Update System 2010-09-28 16:18:16 UTC
mash-0.5.20-1.el5 has been submitted as an update for Fedora EPEL 5.
https://admin.fedoraproject.org/updates/mash-0.5.20-1.el5

Comment 18 Fedora Update System 2010-09-28 18:34:01 UTC
mash-0.5.20-1.el5 has been pushed to the Fedora EPEL 5 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update mash'
You can provide feedback for this update here: https://admin.fedoraproject.org/updates/mash-0.5.20-1.el5

Comment 19 Fedora Update System 2010-10-13 05:56:37 UTC
mash-0.5.20-1.el5 has been pushed to the Fedora EPEL 5 stable repository.  If problems still persist, please make note of it in this bug report.