Description of problem:
3328 masher 18 0 1582m 889m 80 R 97.6 14.8 137:05.69 makedeltarpm
3328 ? R 135:28 | | \_ /usr/bin/makedeltarpm /pub/fedora/linux/development/x86_64/debug/kernel-debug-debuginfo-126.96.36.199-70.fc11.x86_64.rpm /mnt/koji/mash/rawhide-20090417/development/x86_64/debug/kernel-debug-debuginfo-188.8.131.52-85.fc11.x86_64.rpm
Needless to say, this leads to things like:
Out of memory: Killed process 3817 (makedeltarpm).
Version-Release number of selected component (if applicable):
Happened once so far, no reason to think it's not repeatable
Steps to Reproduce:
1. Attempt to compose rawhide
HULK SMASH SYSTEM
Marking as F11 blocker as we need to be able to compose one way or another.
Right now, deltarpm requires between N*2 and N*3 memory where N is the uncompressed size of the RPM. I've been generating the deltarpms on a server with 1GB of RAM, which means I have the same results you did for kdelibs-apidocs, nexuiz-data, etc.
So, either we need between 2GB and 3GB on any server generating deltarpms for Fedora, or we need to fix deltarpm so that it doesn't read the entire old and new rpms into RAM uncompressed.
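To put numbers on the N*2 to N*3 estimate above, here is a small arithmetic sketch. The function name and the 900 MiB payload size are made up for illustration; only the 2x and 3x multipliers come from the comment above.

```python
MIB = 1024 * 1024


def delta_memory_range(uncompressed_bytes):
    """Return the (low, high) memory estimate in bytes, i.e. N*2 and N*3."""
    return 2 * uncompressed_bytes, 3 * uncompressed_bytes


# Hypothetical package whose payload is 900 MiB once uncompressed.
low, high = delta_memory_range(900 * MIB)
print("needs between %d and %d MiB" % (low // MIB, high // MIB))
```

On a 1GB box, anything whose uncompressed size is much over a third of RAM is already in trouble by this estimate.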
I know which of the two is easier, but if that's not going to happen, then I'll see what I can do to fix deltarpm ASAP.
The problem is that the rawhide process works in parallel; it's possible (if not likely) that it will be generating deltas for kernel-debuginfo for all four architectures at once.
Would it be completely crazy to check whether there's 3*uncompressed_size_of_RPM of free memory before generating the deltarpm, and if not, block until there is?
Given the automated nature of the rawhide compose (and the fact that you'd have to wire this check into createrepo somewhere)... yeah.
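For reference, the check being discussed could look roughly like the sketch below. It is not createrepo or deltarpm code: the function names, the polling interval, and the idea of reading /proc/meminfo are all assumptions, and it is Linux-only.

```python
import time


def mem_available_bytes(meminfo_text):
    """Parse /proc/meminfo text and return an estimate of usable memory in bytes."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The fields read below are reported in kB.
            fields[name.strip()] = int(parts[0]) * 1024
    if "MemAvailable" in fields:  # only present on newer kernels
        return fields["MemAvailable"]
    return (fields.get("MemFree", 0) + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


def read_meminfo():
    with open("/proc/meminfo") as f:
        return f.read()


def wait_for_memory(uncompressed_bytes, read=read_meminfo,
                    poll_seconds=5, sleep=time.sleep):
    """Block until 3 * uncompressed_bytes looks available; return that figure."""
    needed = 3 * uncompressed_bytes
    while mem_available_bytes(read()) < needed:
        sleep(poll_seconds)
    return needed
```

Note that this is racy: two deltarpm runs can both pass the check at the same moment and then both allocate, which is part of why wiring it into an automated compose is unattractive.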
Ok, let me see what I can do. It will probably not be done for today (it's fairly late here), and may not be done by tomorrow.
Right now deltarpm reads the whole old rpm and the whole new rpm into RAM so that it can generate the best delta possible. My proposed fix is to delta N bytes of the old and new rpms at a time, free it, and then delta the next N bytes. This would mean that we're only using N*3 memory where *we* specify N.
This does mean that the deltas will suffer if a file is moved from the beginning of the cpio archive to the end of the cpio archive in an rpm update, but it will be worth it to actually be able to generate the deltarpms on computers without gobs of RAM.
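To illustrate the trade-off, here is a toy sketch of the windowed approach. It does not use deltarpm's actual algorithm: zlib with a preset dictionary stands in as a crude delta encoder, all function names are invented, and the data is held in memory as bytes for brevity (a real version would read N bytes at a time from the files).

```python
import random
import zlib


def _delta(old, new):
    """Encode `new` using `old` as a zlib preset dictionary (a crude delta)."""
    c = zlib.compressobj(zdict=old) if old else zlib.compressobj()
    return c.compress(new) + c.flush()


def _apply(old, delta):
    """Rebuild the new chunk from the old chunk plus its delta."""
    d = zlib.decompressobj(zdict=old) if old else zlib.decompressobj()
    return d.decompress(delta) + d.flush()


def windowed_delta(old, new, n):
    """Delta each n-byte window of `new` against the same window of `old`."""
    return [_delta(old[i:i + n], new[i:i + n]) for i in range(0, len(new), n)]


def apply_windowed(old, deltas, n):
    """Reassemble the new data from `old` and the per-window deltas."""
    return b"".join(_apply(old[k * n:(k + 1) * n], d)
                    for k, d in enumerate(deltas))


# Two 4 KiB blocks of random data that swap places between old and new,
# like a file moving from the start of the cpio archive to the end.
rng = random.Random(0)
a = bytes(rng.getrandbits(8) for _ in range(4096))
b = bytes(rng.getrandbits(8) for _ in range(4096))
old, new = a + b, b + a

whole = windowed_delta(old, new, 8192)    # one window sees everything
chunked = windowed_delta(old, new, 4096)  # each window sees only its own half
print(sum(map(len, whole)), sum(map(len, chunked)))
```

With one big window the encoder finds the moved blocks and the delta is tiny; with two small windows nothing matches and the delta is about as large as the new data itself. Both still reconstruct correctly via apply_windowed.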
So I've looked at the code, and it's going to be pretty difficult. Either I or upstream will be able to fix it, but I'm not convinced we'll be able to have it done right away. I'd like to give a timeline, but I'm just not sure what it will take.
I'm sorry, I thought I had communicated deltarpm's current memory requirements, but obviously I hadn't.
For now, we're going to go with a switch in createrepo's python API that limits the size of RPMs it will do deltas for. This should cover F11. Moving to a target bug for F12.
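A minimal sketch of what such a size cutoff amounts to is below. The function and parameter names are hypothetical and are not taken from createrepo's API; the sketch also checks the on-disk (compressed) size, which is only a proxy for the uncompressed size that actually drives memory use.

```python
import os


def split_by_size(rpm_paths, max_delta_rpm_size, getsize=os.path.getsize):
    """Split paths into (to_delta, skipped) based on a size limit in bytes."""
    to_delta, skipped = [], []
    for path in rpm_paths:
        if getsize(path) <= max_delta_rpm_size:
            to_delta.append(path)
        else:
            skipped.append(path)  # too big: ship the full RPM only
    return to_delta, skipped
```

Packages over the limit simply get no delta, so clients fall back to downloading the full RPM for those.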
A good rpm to test with is vdrift-data at 471MB compressed. For people like me on slow connections (~1 Mbit), having deltarpms is a big plus for using Fedora. But if it can't work on really large data files, that kinda defeats the purpose.
Let's be fair - having a solution that works for 98% of the pkgs we have is a damn good start.
I don't think that just because it is not happy with VERY LARGE pkgs that it 'defeats the purpose'.
So, let's have a bit less of the false dichotomy.
I don't know if anybody ever defined a purpose - certainly there was talk about being able to update Fedora for dial-up users (about 35% of the population where I live). The technique proposed in Comment #6 could possibly rule out that use case, depending on how the cpio archives are composed.
How many machines are we talking about needing a RAM upgrade to use the code as-is? 8GB of ECC RAM is < $200 these days. It would be a shame to cut the user base if that number of machines is small.
Forgive my naïveté, but is it not possible to explode to disk in cases where free RAM is insufficient or go one file at a time in RAM?
I think you misunderstand. The massive ram requirements are on the host producing the deltas, not necessarily on the client consuming them.
The problem is that different repositories are composed at the same time, so deltarpm is running more than once on the same machine. Because large packages take longer to create deltas for, the odds are reasonably high that we'll have multiple copies of deltarpm trying to use large amounts of RAM.
There are two possible ways to fix it:
1. Create a buildsystem-like infrastructure dedicated to creating deltarpms
2. Fix deltarpm to use less RAM
(1) isn't going to happen (way, way too much work for way too many people).
(2) is a lot of work for me, and my summer has been busier than expected (just finished helping set up Fedora in the computer room for a sister school in Tyre).
I am going to work on it, but deltarpm's code is going to have to be changed in a number of places, and I still feel like the sorcerer's apprentice whenever I look at it.
@Jesse: right, that's why I was asking how many machines need a RAM upgrade to get full deltarpm composes. But, upon further reflection, that's a very Fedora-centric question. There are users out there who have their own multi-gigabyte RPMs who may benefit greatly from creating their own deltarpms, especially if they're using Spacewalk (when it gets deltarpm support) or their own RPM-based distribution systems.
(In reply to comment #1)
> or we need to fix deltarpm so that it doesn't read the entire old and
> new rpms into RAM uncompressed.
For naive readers of this bug like me, not entangled in the whole Fedora build process and only looking for a solution to delta their big RPMs, I found a working solution here: http://compsoc.dur.ac.uk/~may/rpmdelta/. I will call this solution "rpmmangle" to avoid any confusion.
This rpmmangle solution basically just combines rpmlib + xdelta. Xdelta seems to have reasonably advanced memory management. Xdelta has gzip support (temporarily uncompressing files) but unfortunately not (anymore?) RPM support. What rpmmangle does is sort of bring RPM support back to xdelta, temporarily uncompressing RPMs and passing them to xdelta.
Using rpmmangle + xdelta v1.1.3 I can successfully delta a 1 GB RPM with only 700 MB of RAM. The reconstructed and recompressed RPM even matched the original byte for byte (was I lucky?).
Sorry if the experts already involved in this bug entry knew about all this already; I think random readers might like knowing about a working solution.
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.
More information and reason for this action is here:
mash-0.5.16-1.el5 has been submitted as an update for Fedora EPEL 5.
mash-0.5.20-1.el5 has been submitted as an update for Fedora EPEL 5.
mash-0.5.20-1.el5 has been pushed to the Fedora EPEL 5 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mash'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/mash-0.5.20-1.el5
mash-0.5.20-1.el5 has been pushed to the Fedora EPEL 5 stable repository. If problems still persist, please make note of it in this bug report.