Description of problem: I have a repository of RPMs used to install various commercial software onto our Linux workstations. A script that runs "rpm -U" on every file runs extremely slowly. It seems as though rpm reads the entire file before determining whether it actually needs to be updated. I also tried "rpm -q -p", and it runs at about the same speed. A 670 MB rpm file takes rpm about 35 seconds to process. Why? My workaround is to use `basename` on the file name, then query rpm using "rpm -q packagename" to determine whether the package is installed. If it isn't installed, I then run an update on the package. I don't think this is acceptable behavior for rpm. When doing an update, rpm should attempt to determine whether the package actually needs to be updated before caching the entire file. The same goes for the query. Is the RPM format really such that it requires the entire file to determine the package name and version? How reproducible: Always
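The workaround described above could be sketched roughly as follows. This is only an illustration: it assumes every file is named name-version-release.arch.rpm, so that stripping the .rpm suffix yields a label that "rpm -q" accepts (including the architecture suffix). The function name update_by_name is hypothetical.

```shell
#!/bin/sh
# Sketch of the file-name-based workaround. The package file itself is
# only opened when an update is actually needed.
# Assumption: files are named <name>-<version>-<release>.<arch>.rpm.

# update_by_name FILE: upgrade FILE only if "rpm -q" does not find the
# label derived from the file name.
update_by_name() {
    file=$1
    # Strip the directory and the .rpm suffix, e.g.
    # /repo/foo-1.0-1.i386.rpm becomes foo-1.0-1.i386
    label=$(basename "$file" .rpm)
    # "rpm -q" exits non-zero when the package is not installed.
    if ! rpm -q "$label" >/dev/null 2>&1; then
        rpm -U "$file"
    fi
}
```

It could be called in a loop over the repository directory, for example: for f in /path/to/repo/*.rpm; do update_by_name "$f"; done (the path is a placeholder). Note that this trusts the file name rather than the package header.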
rpm is likely verifying the header+payload signature or digest to ensure that the package is, indeed, intact. Add --nodigest --nosignature if you don't want the verification. NOTABUG
Thanks for the info. That does in fact solve the problem. However, I have two additional comments. The --nodigest and --nosignature options are only listed under install-options in the man page. Why do digest and signature verification before checking to see whether the package actually needs to be installed? In order to speed up the checking process, I still need to call rpm twice: once to query the status of an rpm, then again to install the package if it doesn't exist on the system.
rpm tries to implement a "standard" security model, i.e. verify all data before import and/or export. In this case the data is headers in a local rpmdb, or packages that might be on cdrom or local disk, and hence "known" outside of rpm commands to be trustworthy. The mechanism is configurable, so if you wish you can disable all digest/signature checks persistently. See /usr/lib/rpm/macros for details. Yes, you need to query to see if the package exists, and then install if the package does not exist. That is 2 invocations of rpm because there are 2 steps in the process.
Fair enough. I just find it unfortunate that invoking rpm in upgrade mode requires an inordinate amount of time to find out that the package is already installed. Fortunately there is a workaround that is relatively easy to implement.
What is an "inordinate" amount of time? Please add --stats to your query; that will put a stopwatch on your query and display how long it takes. Here's what I see when checking whether rpm is installed:
    # rpm -q --stats rpm
    rpm-4.4.8-0.11.i686.rpm
       total:       1   0.000000 MB   0.043817 secs
       digest:      1   0.028141 MB   0.000403 secs
       signature:   1   0.000000 MB   0.006093 secs
       dbget:       4   0.031981 MB   0.000439 secs
None of those times are what I would consider "inordinate".
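To put the same stopwatch on the case from the original report (querying a package file rather than the rpm database), something like the following sketch could be used. It only combines options already mentioned in this thread (-q -p, --stats, --nodigest, --nosignature); the function name compare_query_stats is hypothetical.

```shell
#!/bin/sh
# compare_query_stats FILE: query a package file twice with --stats,
# first with the default checks, then with digest and signature
# checks disabled, so the two sets of timings can be compared.
compare_query_stats() {
    echo "== default checks =="
    rpm -qp --stats "$1"
    echo "== --nodigest --nosignature =="
    rpm -qp --stats --nodigest --nosignature "$1"
}
```

The second query may also benefit from the file already being cached by the first one, so it is probably worth repeating the comparison in the opposite order before drawing conclusions.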
From my first post: "A 670 MB rpm file takes rpm about 35 seconds to process." This is on a 100 Mbps network. I use an internal RPM repository to maintain the software on various Linux machines. Some of these RPMs range from 300 MB to 1+ GB in size. I just recently implemented an "install updates and new software on boot" init script, and it was taking 5+ minutes just to verify that all of the RPMs were already installed (5 min = $10 lost time per person) every time these dual boot workstations were booted into Linux. I've decided to go ahead and use a single invocation of rpm, "rpm -U --nosignature --nodigest file.rpm", for all of my custom built RPMs, rather than use the several step process required otherwise. This typically completes in 10-15 seconds when no packages require updating. In order to make the previous method run in a reasonable amount of time, I had to use three steps:
1) query an rpm file for its package name, with "--nodigest --nosignature"
2) query to see if that package was already installed
3) install/upgrade the package if it wasn't
This method is slower than the single rpm invocation, and I've decided that it isn't worth the extra time to verify the digest. The RPMs are unlikely to get corrupted in my particular situation.
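For reference, the three steps listed above might look roughly like the sketch below. It is an illustration under assumptions, not the script actually used here: the function names are hypothetical, and step 2 only tests whether that exact name-version-release is installed, leaving any newer-versus-older decision to "rpm -U" itself.

```shell
#!/bin/sh
# Sketch of the three-step check. Function names are hypothetical.

# needs_install FILE: exit 0 if FILE's exact name-version-release is not
# installed, 1 if it is, 2 if the package file could not be queried.
needs_install() {
    pkg_file=$1
    # Step 1: read the label from the package file, skipping the digest
    # and signature checks that make rpm read the whole file.
    nvr=$(rpm -qp --nodigest --nosignature \
              --queryformat '%{NAME}-%{VERSION}-%{RELEASE}' \
              "$pkg_file") || return 2
    # Step 2: ask the rpm database whether that package is installed.
    if rpm -q "$nvr" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# update_if_needed FILE: step 3, run the upgrade only when needed.
# "rpm -U" is run without --nodigest/--nosignature here, so the full
# verification still happens for packages that really get installed.
update_if_needed() {
    if needs_install "$1"; then
        rpm -U "$1"
    fi
}
```

With this arrangement the expensive verification is paid only for files that are about to be installed, at the cost of two extra rpm invocations per file.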
My use of "inordinate" before didn't mean that I think something is going wrong with rpm. 35 sec for 670 MB is about right. Rather, I don't think it should take 35 seconds for RPM to discover that it doesn't need to actually do anything with the file. I would prefer to see rpm do this in update mode: check package version, if up to date quit, if it needs to be installed THEN check the digest and signature. But, as mentioned in a previous post, rpm implements a different security model.
There is no security in verifying digest/signature *after* extracting version information from the header. Any malicious action has already occurred by reading the header to extract version and/or dependency information.
As explained by Jeff Johnson, rpm is behaving as intended (signature checking is expensive), and the checks can be disabled either by command-line options or by configuration through macros.