Red Hat Bugzilla – Bug 220955
RPM query and update modes TOO slow
Last modified: 2007-11-16 20:14:46 EST
Description of problem:
I have a repository of RPMs used to install various commercial software onto
our Linux workstations. A script which runs "rpm -U" on every file runs
extremely slowly. It seems as though rpm reads the entire file before determining
if it actually needs to be updated. I also tried "rpm -q -p" and it runs at
about the same speed.
A 670 MB rpm file takes rpm about 35 seconds to process. Why?
My solution to this is to use `basename`, then query rpm using "rpm -q
packagename" to determine if the package is installed or not. If it isn't
installed, I then run an update on the package.
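A minimal sketch of that workaround follows. The helper names, the RPM override variable, and the assumption that files are named name-version-release.arch.rpm are illustrative only and are not taken from the reporter's script.

```shell
#!/bin/sh
# Hypothetical helper: derive "name-version-release" from a file path.
# Assumes the file is named <name>-<version>-<release>.<arch>.rpm.
pkg_label_from_file() {
    label=$(basename "$1" .rpm)   # strip directory and ".rpm"
    echo "${label%.*}"            # strip the trailing ".<arch>"
}

# Run "rpm -U" only when "rpm -q" does not find the label in the rpmdb.
# RPM is a hypothetical override so the command can be swapped out.
update_if_missing() {
    label=$(pkg_label_from_file "$1")
    if ${RPM:-rpm} -q "$label" >/dev/null 2>&1; then
        echo "already installed: $label"
    else
        ${RPM:-rpm} -U "$1"
    fi
}
```

The installed check only touches the local rpmdb, so the large package file is read only when an update is actually attempted. The trade-off is that the file name is trusted to describe the package inside it.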
I don't think this is acceptable behavior for rpm. When doing an update, rpm
should attempt to determine if the package actually needs to be updated before
caching the entire file. Same thing for the query. Is the RPM format really
such that it requires the entire file to determine the package name and version?
rpm is likely verifying the header+payload signature or digest to ensure that the
package is, indeed, intact.
Add --nodigest --nosignature if you don't want the verification.
Thanks for the info. That in fact does solve the problem.
However, I have two additional comments.
The --nodigest and --nosignature options are only listed under install-options
in the man page.
Why do digest and signature verification before checking to see if the package
actually needs to be installed? In order to speed up the checking process, I
still need to call rpm twice, once to query the status of a rpm, then again to
install the package if it doesn't exist on the system.
rpm tries to implement a "standard" security model, i.e. verify all data before import
and/or export. In this case the data is headers in a local rpmdb, or packages that might
be on cdrom or local disk, and hence "known" outside of rpm commands to be trustworthy.
The mechanism is configurable if you wish so you can disable all digest/signature checks
persistently. See /usr/lib/rpm/macros for details.
Yes, you need to query to see if the package exists, and then install if it does not.
That is 2 invocations of rpm because there are 2 steps in the process.
I just find it unfortunate that invoking rpm in upgrade mode requires an
inordinate amount of time to find out that the package is already installed.
Fortunately there is a workaround that is relatively easy to implement.
What is an "inordinate" amount of time? Add --stats to your query please, that will
put a stopwatch on your query and display how long it takes. Here's what I see checking to see
if rpm is installed:
# rpm -q --stats rpm
total: 1 0.000000 MB 0.043817 secs
digest: 1 0.028141 MB 0.000403 secs
signature: 1 0.000000 MB 0.006093 secs
dbget: 4 0.031981 MB 0.000439 secs
None of those times are what I would consider "inordinate".
From my first post: "A 670 MB rpm file takes rpm about 35 seconds to process."
This is on a 100 Mbps network.
I use an internal RPM repository to maintain the software on various Linux
machines. Some of these RPMs range from 300 MB to 1+ GB in size. I just
recently implemented an "install updates and new software on boot" init script
and it was taking 5+ minutes just to verify that all of the RPMs were already
installed (5 min = $10 lost time per person) every time these dual boot
workstations were booted into Linux.
I've decided to go ahead and use a single invocation of rpm: "rpm -U
--nosignature --nodigest file.rpm" for all of my custom built RPMs, rather than
use the several step process required otherwise. This typically completes in
10-15 seconds when no packages require updating.
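As a rough sketch, the single-invocation approach applied to a whole repository directory might look like the following. The function name, the directory argument, and the RPM override variable are assumptions for illustration; only the rpm flags come from the comment above.

```shell
#!/bin/sh
# Hypothetical boot-time helper: one "rpm -U" per package file, with digest
# and signature verification disabled as described above.
# RPM is a hypothetical override so the command can be swapped out.
update_all() {
    dir=$1
    for f in "$dir"/*.rpm; do
        [ -e "$f" ] || continue   # glob matched nothing
        ${RPM:-rpm} -U --nosignature --nodigest "$f"
    done
}
```

The sketch does not interpret rpm's exit status; a real init script would probably want to log failures rather than ignore them.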
In order to make the previous method run in a reasonable amount of time, I had
to use three steps:
1) query an rpm file for its package name, with "--nodigest --nosignature"
2) query to see if that package was already installed
3) install/upgrade the package if it wasn't
This method is slower than the single rpm invocation and I've decided that it
isn't worth the extra time to verify the digest. The RPMs are unlikely to get
corrupted in my particular situation.
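For comparison, the three steps listed above could be sketched like this. The function name and the RPM override variable are hypothetical, and the sketch assumes that the label printed by "rpm -q -p" can be passed straight back to "rpm -q".

```shell
#!/bin/sh
# Hypothetical three-step variant; RPM is an override for the rpm command.
update_via_three_steps() {
    file=$1
    # 1) read the package label from the file without digest/signature checks
    label=$(${RPM:-rpm} -q -p --nodigest --nosignature "$file") || return 1
    # 2) check whether that label is already in the local rpmdb
    if ${RPM:-rpm} -q "$label" >/dev/null 2>&1; then
        return 0
    fi
    # 3) install or upgrade, this time with rpm's default verification
    ${RPM:-rpm} -U "$file"
}
```

Verification is deferred to step 3 here, so it is only paid for when a package actually has to be installed, at the cost of up to three rpm invocations per file.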
My use of "inordinate" before didn't mean that I think something is going wrong
with rpm. 35 sec for 670 MB is about right. Rather, I don't think it should
take 35 seconds for RPM to discover that it doesn't need to actually do anything
with the file.
I would prefer to see rpm do this in update mode: check package version, if up
to date quit, if it needs to be installed THEN check the digest and signature.
But, as mentioned in a previous post, rpm implements a different security model.
There is no security in verifying digest/signature *after* extracting version information from
the header. Any malicious action has already occurred by reading the header to extract version
and/or dependency information.
As explained by Jeff Johnson, rpm is behaving as intended (signature checking is
expensive) and the checks can be disabled either by cli options or configuration