220955 – RPM query and update modes TOO slow

Summary: RPM query and update modes TOO slow

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	rpm
Sub Component:
Version:	3.8
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Panu Matilainen
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2006-12-29 16:05 UTC by Joshua Weage
Modified:	2007-11-17 01:14 UTC (History)
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-10-11 11:26:57 UTC
Target Upstream Version:
Embargoed:

Description Joshua Weage 2006-12-29 16:05:37 UTC

Description of problem:

I have a repository of RPMs used to install various commercial software onto
our Linux workstations.  A script which runs "rpm -U" on every file runs
extremely slowly.  It seems as though rpm reads the entire file before
determining whether it actually needs to be updated.  I also tried "rpm -q -p"
and it runs at about the same speed.

A 670 MB rpm file takes rpm about 35 seconds to process.  Why?

My solution to this is to use `basename`, then query rpm using "rpm -q
packagename" to determine if the package is installed or not.  If it isn't
installed, I then run an update on the package.
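
The workaround above can be sketched roughly as follows. This is an illustration, not the reporter's actual script: the function name `label_from_filename` is hypothetical, and it assumes every file in the repository is named name-version-release.arch.rpm, which rpm itself does not enforce.

```shell
# Derive the label to query from the file name alone, without reading the file.
# Assumption: files are named name-version-release.arch.rpm.
label_from_filename() {
    base=$(basename "$1" .rpm)   # strip the directory and the .rpm suffix
    echo "${base%.*}"            # strip the trailing .arch component
}

# Possible use (hypothetical loop body, not run here):
#   rpm -q "$(label_from_filename "$f")" >/dev/null 2>&1 || rpm -U "$f"
```

Because the label comes from the file name rather than the package header, a renamed file would be misidentified.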

I don't think this is acceptable behavior for rpm.  When doing an update, rpm
should attempt to determine if the package actually needs to be updated before
caching the entire file.  Same thing for the query.  Is the RPM format really
such that it requires the entire file to determine the package name and version?


Version-Release number of selected component (if applicable):


How reproducible:

Always

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Jeff Johnson 2006-12-30 01:06:52 UTC

rpm is likely verifying the header+payload signature or digest to ensure that the
package is, indeed, intact.

Add --nodigest --nosignature if you don't want the verification.

NOTABUG

Comment 2 Joshua Weage 2007-01-22 16:17:55 UTC

Thanks for the info.  That in fact does solve the problem.

However, I have two additional comments.

The --nodigest and --nosignature options are only listed under install-options
in the man page.

Why do digest and signature verification before checking to see if the package
actually needs to be installed?  In order to speed up the checking process, I
still need to call rpm twice, once to query the status of a rpm, then again to
install the package if it doesn't exist on the system.

Comment 3 Jeff Johnson 2007-01-22 18:31:26 UTC

rpm tries to implement a "standard" security model, i.e. verify all data before import
and/or export. In this case the data is headers in a local rpmdb, or packages that might
be on cdrom or local disk, and hence "known" outside of rpm commands to be trustworthy.

The mechanism is configurable if you wish so you can disable all digest/signature checks
persistently. See /usr/lib/rpm/macros for details.

Yes you need to query to see if package exists, and then install if the package does not exist.
That is 2 invocations of rpm because there are 2 steps in the process.

Comment 4 Joshua Weage 2007-01-23 22:29:49 UTC

Fair enough.

I just find it unfortunate that invoking rpm in upgrade mode requires an
inordinate amount of time to find out that the package is already installed. 
Fortunately there is a workaround that is relatively easy to implement.

Comment 5 Jeff Johnson 2007-01-23 23:59:49 UTC

What is an "inordinate" amount of time? Add --stats to your query please, that will
put a stopwatch on your query and display how long it takes. Here's what I see checking
to see if rpm is installed:

# rpm -q --stats rpm
rpm-4.4.8-0.11.i686.rpm
   total:               1      0.000000 MB      0.043817 secs
   digest:              1      0.028141 MB      0.000403 secs
   signature:           1      0.000000 MB      0.006093 secs
   dbget:               4      0.031981 MB      0.000439 secs

None of those times are what I would consider "inordinate".

Comment 6 Joshua Weage 2007-01-29 01:25:27 UTC

From my first post: "A 670 MB rpm file takes rpm about 35 seconds to process." 
This is on a 100 Mbps network.

I use an internal RPM repository to maintain the software on various Linux
machines.  Some of these RPM's range from 300 MB to 1+ GB in size.  I just
recently implemented an "install updates and new software on boot" init script
and it was taking 5+ minutes just to verify that all of the RPM's were already
installed (5 min = $10 lost time per person) every time these dual boot
workstations were booted into Linux.

I've decided to go ahead and use a single invocation of rpm: "rpm -U
--nosignature --nodigest file.rpm" for all of my custom built RPMs, rather than
use the several step process required otherwise.  This typically completes in
10-15 seconds when no packages require updating.

In order to make the previous method run in a reasonable amount of time, I had
to use three steps:

1) query an rpm file for its package name, with "--nodigest --nosignature"
2) query to see if that package was already installed
3) install/upgrade the package if it wasn't

This method is slower than the single rpm invocation and I've decided that it
isn't worth the extra time to verify the digest.  The RPM's are unlikely to get
corrupted in my particular situation.
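
For reference, the three steps listed above might look something like this sketch. It is not the reporter's script: the function names and the overridable `RPM` variable are hypothetical, and it assumes that step 3 runs with verification left on, which the comment does not state.

```shell
# Sketch of the three-step check; names here are illustrative only.
RPM=${RPM:-rpm}   # assumption: overridable so the logic can be tried without rpm

# Step 1: read name-version-release from the package file,
# skipping digest and signature verification.
pkg_label() {
    "$RPM" -qp --nodigest --nosignature \
        --qf '%{NAME}-%{VERSION}-%{RELEASE}\n' "$1"
}

# Steps 2 and 3: install or upgrade only when that exact label is not installed.
update_if_needed() {
    label=$(pkg_label "$1") || return 1
    if "$RPM" -q "$label" >/dev/null 2>&1; then
        echo "skip $label"
    else
        "$RPM" -U "$1"   # assumption: digest/signature checks stay enabled here
    fi
}
```

Calling `update_if_needed file.rpm` for each file in the repository would run up to three rpm invocations per package, which matches the overhead described above.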

Comment 7 Joshua Weage 2007-01-29 01:32:01 UTC

My use of "inordinate" before didn't mean that I think something is going wrong
with rpm.  35 sec for 670 MB is about right.  Rather, I don't think it should
take 35 seconds for RPM to discover that it doesn't need to actually do anything
with the file.

I would prefer to see rpm do this in update mode: check package version, if up
to date quit, if it needs to be installed THEN check the digest and signature. 
But, as mentioned in a previous post, rpm implements a different security model.

Comment 8 Jeff Johnson 2007-03-14 10:41:22 UTC

There is no security in verifying digest/signature *after* extracting version information from
the header. Any malicious action has already occurred by reading the header to extract version
and/or dependency information.

Comment 9 Panu Matilainen 2007-10-11 11:26:57 UTC

As explained by Jeff Johnson, rpm is behaving as intended (signature checking is
expensive) and the checks can be disabled either by cli options or configuration
through macros.
