I see severe performance differences when reindexing an RPM database with rpm2html, depending on which version of RPM I link against. For example, on a test machine with dual Athlon MP 1900MHz processors, reindexing the Red Hat 8.0 RPMS directory gives these timings:

rpm-4.1-1.06      33.750u 8.420s 1:05.68 64.2% 0+0k 0+0io 484pf+0w
rpm-4.1-9         34.100u 9.350s 1:07.06 64.7% 0+0k 0+0io 491pf+0w
librpm404-8x.27   11.250u 1.640s 0:12.92 99.7% 0+0k 0+0io 376pf+0w
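To put the gap in numbers, the elapsed (wall-clock) times above can be converted to seconds and compared. This is only a minimal sketch: the helper names `elapsed_seconds` and `slowdown` are made up for illustration, and it assumes the third field of each timing line is an elapsed time in `M:SS.ss` form.

```c
#include <stdio.h>

/* Hypothetical helper: parse an elapsed time written as "M:SS.ss"
 * into seconds. Returns a negative value if the string does not match. */
double elapsed_seconds(const char *s)
{
    int minutes = 0;
    double seconds = 0.0;

    if (sscanf(s, "%d:%lf", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Hypothetical helper: how many times longer `slow` took than `fast`.
 * Both arguments are assumed to be valid "M:SS.ss" strings. */
double slowdown(const char *slow, const char *fast)
{
    return elapsed_seconds(slow) / elapsed_seconds(fast);
}
```

Calling `slowdown("1:05.68", "0:12.92")` with the rpm-4.1-1.06 and librpm404-8x.27 values gives a factor of about 5.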
rpm-4.1 verifies signatures and digests when reading package headers. In particular, the entire payload will be read to verify the traditional header+payload signature/digest. What happens if you recompile rpm2html, adding (void) rpmtsSetVSFlags(ts, -1); immediately after creating a new transaction?
*** Bug 83315 has been marked as a duplicate of this bug. ***
--- rpmopen.c.orig 2003-02-02 08:58:28.000000000 -0500
+++ rpmopen.c 2003-02-02 08:59:20.000000000 -0500
@@ -1016,6 +1016,7 @@
     /* read the RPM header */
 #if defined(_RPMVSF_NODIGESTS)
     {
         rpmts ts = rpmtsCreate();
+        (void) rpmtsSetVSFlags(ts, -1);
         rc = rpmReadPackageFile(ts, fd, buffer, &h);
         ts = rpmtsFree(ts);
     }
That works for me, and performance is now close to what I get with rpm-4.0.4. Moreover, I parsed the whole SourceForge archive (RPM files only) without crashes.
OK, I'm gonna assume you're happy, but warn that you may not have encountered a damaged package. fwiw, rpm-4.1 has stronger sanity checks on header data than rpm-4.0.4, but the Right Thing To Do to prevent segfaults is to check a header sha1 digest before parsing. The digest is present in 8.0 packages and could be cheaply checked by doing rpmtsSetVSFlags(ts, ~RPMVSF_NOSHA1HEADER);
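The difference between passing -1 and passing ~RPMVSF_NOSHA1HEADER can be illustrated with plain bit arithmetic. The sketch below assumes the verification flags form a bitmask in which each set bit means "skip this check", as the NO prefix in the flag names suggests. The `DEMO_*` names and their values are stand-ins invented for this example, not librpm's definitions, and `check_enabled` is a hypothetical helper.

```c
#include <stdint.h>

/* Illustrative stand-ins for "skip this check" bits.
 * Names and values are assumptions for this sketch only;
 * they are NOT the constants defined by librpm. */
enum {
    DEMO_NOSHA1HEADER = 1 << 0,  /* skip the header SHA1 digest */
    DEMO_NOMD5        = 1 << 1,  /* skip the header+payload MD5 digest */
    DEMO_NODSA        = 1 << 2   /* skip the header+payload DSA signature */
};

/* Hypothetical helper: a check runs only when its "NO" bit is clear. */
int check_enabled(uint32_t vsflags, uint32_t no_bit)
{
    return (vsflags & no_bit) == 0;
}
```

Under that assumption, a flag word of -1 has every bit set, so no check runs at all, while the complement of a single NO bit leaves exactly that one check enabled and skips the rest.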