From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.2) (KHTML, like Gecko)
Description of problem:
I've just registered a new RHEL machine. On running up2date for the first time, there are 477 MB of updates available. That'll take me 4 hours to download on our 512kbps connection (I'm grateful this isn't dialup!).
This update set includes a complete replacement of all the XFree86 packages - 18 in my installation, including XFree86-doc. All this to fix just a handful of bugs in very select parts of the XFree86 installation.
Now, to some extent that's a result of the way the packager of XFree86 has set things up (why should a new release of XFree86-doc be necessary when fixing a bug in the server?). But the implementation of "patch" RPMs would have big benefits:
- Much reduced bandwidth usage; people on dialup might be able to actually download updates (*gasp* !)
- Avoid upgrading RPMs that already work and don't need fixing (as happens e.g. when a new openssh-client is released for no other reason than that openssh-server had a bug fix and the version numbers need to stay in sync); stop violating the principle "don't fix what isn't broken - especially on 'enterprise' production systems!"
- Reduced time to install updates; probably 95 out of every 100 files that RPM touches when doing an up2date haven't actually been changed...
SUSE have had "patch" RPMs implemented for some time. I don't know how they've done it (I'm not even a C coder), but as a sysadmin it seems to me like an obvious and basic feature for package management (our Solaris systems all have it with Sun's package manager).
The simplest way to implement "patch" RPMs would avoid having the concept of a "patch" in the actual RPM database itself. It should be done so that the result of installing a "patch" RPM is indistinguishable from downloading the complete updated RPM and updating to that.
As a first attempt, the patch RPM could contain a list of files that are new, files that should be removed from the original, and files that have changed from the original. The payload would only need to contain the changed and new files. A more refined version could contain binary diffs for the changed files, but this requires rpm to have access to the original RPM when creating the patch. The "patch" RPM would also need extra headers to indicate the checksums both of the "patch" RPM itself and of the result when the patch is installed over the original. It would also need to have the original as a hard dependency (hard as in "can't be overridden by --nodeps").
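To make the three file lists concrete, here is a minimal sketch in Python. It is not how rpm or SUSE's tooling works: a package is modelled as a plain dict mapping file paths to contents, and the names `digest` and `build_manifest` are hypothetical. Changed files are detected by comparing checksums, which is an assumption of this sketch.

```python
import hashlib
from typing import Dict


def digest(data: bytes) -> str:
    """Checksum used to decide whether a file's contents differ."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(old_files: Dict[str, bytes], new_files: Dict[str, bytes]) -> dict:
    """Compare two packages (hypothetical path -> contents dicts) and
    return the file lists plus the payload a patch would need to carry."""
    old_paths, new_paths = set(old_files), set(new_files)
    added = sorted(new_paths - old_paths)      # only in the new package
    removed = sorted(old_paths - new_paths)    # only in the old package
    changed = sorted(
        p for p in old_paths & new_paths
        if digest(old_files[p]) != digest(new_files[p])
    )
    # The payload carries only new and changed files, never unchanged ones.
    payload = {p: new_files[p] for p in added + changed}
    return {"added": added, "removed": removed, "changed": changed, "payload": payload}


old = {"/usr/bin/tool": b"v1", "/usr/share/doc/README": b"docs", "/etc/old.conf": b"x"}
new = {"/usr/bin/tool": b"v2", "/usr/share/doc/README": b"docs", "/usr/bin/extra": b"y"}
manifest = build_manifest(old, new)
```

In this example the unchanged README never enters the payload, which is where the bandwidth saving would come from.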
This request seems so obvious that I can't believe it's not been asked for before; couldn't find any open or closed bugs for it for RHEL 3 though. Sorry if I didn't look hard enough!
An obvious place to look for doing this would be to see how SUSE have done it. To give a couple of examples of the bandwidth savings from http://www.suse.com/us/private/download/updates/92_i386.html :
Update RPM: mysql 4.0.21-4.2 (i586): 8326 kB
Patch RPM: mysql 4.0.21-4.2-patch (i586): 53 kB
Update RPM: cups 1.1.21-5.3 (i586): 6760 kB
Patch RPM: cups 1.1.21-5.3-patch (i586): 365 kB
Not all savings are that good, but that's a 14 MB saving for just two updates.
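The arithmetic behind that figure, as a short Python check using the four sizes quoted above. The names `UPDATES` and `saved_kb` are just for illustration, and 1 MB is taken as 1024 kB.

```python
# (full update size, patch size) in kB, as listed on the SUSE download page
UPDATES = {
    "mysql 4.0.21-4.2": (8326, 53),
    "cups 1.1.21-5.3": (6760, 365),
}


def saved_kb(updates: dict) -> int:
    """Total kB not downloaded when patch RPMs replace full update RPMs."""
    return sum(full - patch for full, patch in updates.values())


total_kb = saved_kb(UPDATES)    # 8273 + 6395 = 14668 kB
total_mb = total_kb / 1024      # a little over 14 MB
```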
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL3
2. Run up2date
Actual Results: Marvel at the amount of bandwidth/CPU time you have to burn because of the deficiencies of Red Hat's package manager.
Expected Results: Purr and coo at the efficiency of Red Hat's package manager as it only downloads updates to bits that have been fixed and not whole swathes of unchanged material.
I've now done a more extensive search in Bugzilla; other requests:
7121 : 1999-11-18 : WONTFIX
64053 : 2002-04-24 : DEFERRED
103205 : 2003-08-27 : NEEDINFO
The last is interesting as it has a patch from SUSE attached.
Objections seem to cluster around the following:
1. Packagers may not be competent enough to handle the extra
complexity. (My response: I have a suggestion below that has little
complexity - no changes to spec files and no extra skills needed).
2. SUSE's patch for RPM 4 wasn't "well tested" (this was 16 months
ago), and there are extra complexities to deal with. (My response: I
don't know if it is now well tested or not).
3. Philosophical objections to the whole concept. (My response: I
believe philosophical objections would only apply against a "patch"
concept that integrates into RPM at the wrong level).
I have an idea for how to implement patch RPMs that wouldn't
necessitate major re-working of RPM itself; no changes to spec files,
no new skills necessary to learn. Just reduced downloads and CPU
a) Create a new program to generate a "patch RPM"; its input would be
the old RPM, a new RPM, and a list of files that are changed (i.e. in
both RPMs, but different in substance - e.g. as a result of bug fix).
The program would then auto-discover (by examining the RPMs) the list
of files that have been deleted, and those that have been added.
The created "patch" could then have binary diffs also to further save
b) Modify rpm installation/updating as follows: if one of the RPMs to
install is a patch, do this:
* Verify that installed package has correct checksums, etc -
otherwise, complain. (This is only necessary if binary diffs are
* Create a new RPM in a temporary directory by combining the
installed RPM and the patch to recreate the original "new" RPM.
* Behave as if the request was to update to the recreated new RPM,
and forget that it originated from a patch.
Under this method, there's little complexity; the update RPM is
re-constructed at an "early" stage - the core RPM software never
needs to know or care that there was originally a patch, or that
patches even exist.
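The install-side steps above (verify, recreate, then update as normal) could be sketched roughly as follows. This is only an illustration under stated assumptions: packages are modelled as dicts of path to contents, the patch carries whole files and not binary diffs, and the names `rebuild_new_package` and `base_digests` are hypothetical. Real RPMs also have headers, scripts and a cpio payload that this ignores.

```python
import hashlib
from typing import Dict


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def rebuild_new_package(installed: Dict[str, bytes], patch: dict) -> Dict[str, bytes]:
    """Recreate the full 'new' package from the installed one plus a patch.

    `patch` is a hypothetical structure with three keys:
      base_digests: path -> expected checksum of each file in the old package
      removed:      paths that are dropped in the new package
      payload:      path -> contents for new and changed files
    """
    # Step 1: verify the installed package matches what the patch was built against.
    for path, expected in patch["base_digests"].items():
        if path not in installed or digest(installed[path]) != expected:
            raise ValueError(f"installed file does not match patch base: {path}")

    # Step 2: recreate the new package: drop removed files, overlay the payload.
    rebuilt = {p: d for p, d in installed.items() if p not in patch["removed"]}
    rebuilt.update(patch["payload"])

    # Step 3 would hand `rebuilt` to the normal update path, which never
    # learns that a patch was involved.
    return rebuilt


installed = {"/usr/bin/tool": b"v1", "/usr/share/doc/README": b"docs", "/etc/old.conf": b"x"}
patch = {
    "base_digests": {p: digest(d) for p, d in installed.items()},
    "removed": ["/etc/old.conf"],
    "payload": {"/usr/bin/tool": b"v2", "/usr/bin/extra": b"y"},
}
rebuilt = rebuild_new_package(installed, patch)
```

If any installed file has been modified locally, the verification step raises an error and the caller would have to fall back to downloading the full RPM.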
The downside to this method is that there's no way to roll back just
the patch; you'd have to download the original RPM. However, that's
the way it is now in any case, so there's no loss - just not an
Here is conversation on the rpm-devel list that gives some insight
about what Jeff Johnson is thinking of doing regarding patches:
Concerning your last statement, you can roll back RPMs today
without downloading the version you wish to roll back to. At least
for my company this is a very important feature. That said, I believe
it is possible to provide what you're looking for without overly
complicated rollback mechanisms.
That said, I am only a hacker in the shadows, and not a RH employee.
Regarding rollback - I ought to read the man page more carefully to
learn how to do it properly!
I removed an rpm with --repackage last week (on Fedora Core 2), but
trying to reinstall it gave errors about the signature; I tried to
install without signature or package checksums but it complained
about the checksums of individual files within the package, at which
point I gave up and downloaded the original RPM again to save time in
working out what I should be doing. :-(
Patch packages add a great deal of complexity to package installs
and are unlikely to be able to be used generally or widely.
The far better solution is deltas on *.rpm packages, which can
be done with all existing packages right now.
OK... I'm using up2date to get updates from RHN on RHEL 3 - how do I
get it to do "deltas on *.rpm packages" ?? I'd love to reduce my
500Mb download on a 512kbps line.
In response to #4 I'd also say:
- The complexity is all yours! This is a RHEL bug; to update, we
just run up2date and click a few boxes. Isn't it RH's job to deal
with the complexity rather than make the users suffer? Isn't that
what we pay for?
- If patch RPMs were able to be used by up2date, then they would be
used everywhere where up2date is used - i.e. everywhere where RHEL
I don't really understand how to use "deltas on *.rpm packages" to
update our RHEL 3 install, so if anyone can enlighten me, I'd be
very thankful! Cheers.
I've now read a bit more on this problem... I think I now understand
that by "right now" you don't mean that I can do it right now on
RHEL 3, but that it can be done on RPMs that already exist (i.e. they
don't need to be rebuilt). Sorry for my ignorance/timewasting...