Bug 523698

Summary: Needless incompatibility across distros by DB_HASH
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: rpm
Assignee: Panu Matilainen <pmatilai>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: ffesti, herrold, jnovy, msalter, n3npq, pmatilai, yersinia.spiros, zeekec
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-09-17 12:27:59 UTC

Description Jan Kratochvil 2009-09-16 13:29:54 UTC
Description of problem:
Currently mock does not work when the host and the guest chroot run different Fedora/EPEL versions, as described in:
Re: Troubles running F9 mock chroot under F11
https://www.redhat.com/archives/fedora-devel-list/2009-September/msg00583.html

Jeff Johnson suggested using DB_BTREE instead of the current DB_HASH, which should have no disadvantages while keeping compatibility acceptable.

Version-Release number of selected component (if applicable):
For example:
rpm-4.7.1-1.fc11.x86_64
vs.
rpm-4.4.2.3-9.el5.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. mock -r epel-5-x86_64 --init
2. mock -r epel-5-x86_64 --shell
3. rpm -q rpm

Actual results:
rpmdb: /var/lib/rpm/Packages: unsupported hash version: 9
error: cannot open Packages index using db3 - Invalid argument (22)
error: cannot open Packages database in /var/lib/rpm
package rpm is not installed

Expected results:
rpm-4.4.2.3-9.el5.x86_64

Additional info:
I tried switching from DB_HASH to DB_BTREE and it works well.

$ file /var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages 
/var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages: Berkeley DB (Hash, version 9, native byte-order)

Patch /usr/lib/rpm/macros by:
--- /usr/lib/rpm/macros-orig	2009-07-24 07:07:12.000000000 +0200
+++ /usr/lib/rpm/macros	2009-09-16 15:05:18.000000000 +0200
@@ -598,7 +598,7 @@ print (t)\
 %__dbi_other			%{?_tmppath:tmpdir=%{_tmppath}} %{?__dbi_cdb}
 
 # Note: adding nofsync here speeds up --rebuilddb a lot.
-%__dbi_rebuild			nofsync !log !txn !cdb
+%__dbi_rebuild			nofsync !log !txn !cdb btree
 %__dbi_transient		%{__dbi_rebuild} temporary private
 %__dbi_perms			perms=0644
 

mv /var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages /tmp/rpmdb; rm /var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/*; mv /tmp/rpmdb /var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages

rpm -r /var/lib/mock/fedora-9-x86_64/root/ --rebuilddb

file /var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages 
/var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages: Berkeley DB (Btree, version 9, native byte-order)

Patch /usr/lib/rpm/macros instead by:
--- /usr/lib/rpm/macros-orig	2009-07-24 07:07:12.000000000 +0200
+++ /usr/lib/rpm/macros	2009-09-16 15:06:19.000000000 +0200
@@ -638,7 +638,7 @@ print (t)\
 %_dbi_tags	Packages:Name:Basenames:Group:Requirename:Providename:Conflictname:Obsoletename:Triggername:Dirnames:Requireversion:Provideversion:Installtid:Sigmd5:Sha1header:Filedigests:Depends:Pubkeys
 
 # "Packages" should have shared/exclusive fcntl(2) lock using "lockdbfd".
-%_dbi_config_Packages		%{_dbi_htconfig} lockdbfd
+%_dbi_config_Packages		%{_dbi_btconfig} lockdbfd
 
 # "Depends" is a per-transaction cache of known dependency resolutions.
 %_dbi_config_Depends		%{_dbi_htconfig} temporary private

mock -r fedora-9-x86_64 --shell
inside:
rpm --rebuilddb
rpm -q bash
bash-3.2-23.fc9.x86_64

Or something along these lines; this was just a proof of concept showing that it works flawlessly from both environments.

This would be a backport of an already fixed bug in rpm5.org as suggested by Jeff Johnson.  Jeff Johnson reports that the DB_HASH incompatibility was introduced by db-4.6.x and that switching Packages to DB_BTREE has no measurable performance deficiency.

I do not know whether all the databases can be switched to DB_BTREE without a performance regression. It would be very useful for the mock environment if rpm --rebuilddb were no longer needed so often.

Comment 1 Elia Pinto 2009-09-16 16:20:42 UTC
For RPM 4.4.2 the backport was already rejected upstream

https://bugzilla.redhat.com/show_bug.cgi?id=464752

Comment 2 Jan Kratochvil 2009-09-16 16:40:13 UTC
(In reply to comment #1)
> For RPM 4.4.2 the backport was already rejected upstream
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=464752  

That Bug 464752 was about the __db.* files.  This Bug is about the Packages file.
Different problem.

Bug 464752 is easier to work around, which I do in my cron mock updates with:
rpm -r $i/root --rebuilddb
(`rm -f $i/root/var/lib/rpm/__db.*' could possibly be enough, I do not know.)
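A minimal sketch of what such a cron step could look like, assuming the chroots live under /var/lib/mock/<config>/root as in the paths quoted in this report. The function name rebuild_mock_rpmdbs and the RPM override variable are made up for illustration; this is not the reporter's actual script.

```shell
# Rebuild the rpmdb of every mock chroot found under a base directory.
# Assumption: each chroot is at BASE/<config>/root (default BASE: /var/lib/mock).
# RPM may be overridden, e.g. RPM=echo for a dry run that only prints commands.
rebuild_mock_rpmdbs() {
    base=${1:-/var/lib/mock}
    for i in "$base"/*; do
        # Skip entries that do not contain an rpm database directory.
        [ -d "$i/root/var/lib/rpm" ] || continue
        ${RPM:-rpm} -r "$i/root" --rebuilddb || echo "rebuild failed: $i" >&2
    done
}
```

Running it as `RPM=echo rebuild_mock_rpmdbs` first shows which chroots would be touched without changing anything.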

I was going to file it next, but thanks for pointing out that it was already closed as WONTFIX.

Comment 3 Jeff Johnson 2009-09-16 16:58:46 UTC
The symptoms are different but it is not a different problem.

The whole issue is that Berkeley DB provides backward but not forward
compatibility.

So when you have a mixture of Berkeley DB versions, incompatibilities arise.

The problem in #464752 is quite straightforward, there's a version stamp
in a file which causes an error return (EINVAL in older releases, DB_VERSION_MISMATCH
in newer). The patch automates the corrective action, removing the file that
has the wrong version stamp within.

The problem here is that the hash version changed, and switching to DB_BTREE
instead of DB_HASH avoids the problem (note that there is a non-trivial one time
cost switching from DB_HASH -> DB_BTREE everywhere).

There are a few more details in ensuring that rpm itself can open an rpmdb
transparently, dealing with both DB_BTREE and DB_HASH as found, not
as configured.
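
Handling a database "as found" presupposes knowing which access method a file was created with. A minimal sketch of such a check from the shell, assuming the usual Berkeley DB metadata layout: a 32-bit magic number at byte offset 12 (0x061561 for hash, 0x053162 for btree) followed by a 32-bit format version at offset 16. The function name bdb_kind is made up for illustration, and this is not how rpm itself does it.

```shell
# bdb_kind FILE: print the access method and format version, e.g. "hash 9".
# Assumption: magic number at offset 12, version at offset 16, in either byte order.
bdb_kind() {
    # Read the 8 bytes at offsets 12..19 as unsigned decimal values.
    set -- $(od -An -tu1 -j12 -N8 "$1" 2>/dev/null)
    [ "$#" -eq 8 ] || { echo "unknown"; return 1; }
    # Try little-endian first, then big-endian.
    for order in le be; do
        if [ "$order" = le ]; then
            magic=$(( $1 | $2 << 8 | $3 << 16 | $4 << 24 ))
            version=$(( $5 | $6 << 8 | $7 << 16 | $8 << 24 ))
        else
            magic=$(( $4 | $3 << 8 | $2 << 16 | $1 << 24 ))
            version=$(( $8 | $7 << 8 | $6 << 16 | $5 << 24 ))
        fi
        case "$magic" in
            398689) echo "hash $version"; return 0 ;;   # 0x061561
            340322) echo "btree $version"; return 0 ;;  # 0x053162
        esac
    done
    echo "unknown"
    return 1
}
```

For the file shown in the description, `bdb_kind /var/lib/mock/fedora-9-x86_64/root/var/lib/rpm/Packages` would be expected to agree with what file(1) reports.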

But the fundamental problem here and in #464752 is the same, ensuring
transparent interoperation when there are multiple versions of Berkeley DB
accessing a single rpmdb.

Comment 4 Panu Matilainen 2009-09-17 12:27:59 UTC
That btree happens to work here is just getting lucky with the format not changing across these particular versions, not because it's somehow inherently "more compatible" than hashes. Btree is versioned just like hash is and can change incompatibly in any new BDB version.

WONTFIX - rpm might switch to btree by default for other reasons (such as potentially better performance) at some point but not because of false hopes of better compatibility. Like Jeff points out, there are numerous things to take care of besides just changing the default configuration, and while a future rpm version might be able to deal with on-the-fly btree/ht detection/conversion, there's little chance that such code would end up in existing RHEL and even less chance for EOL Fedora version.

Of course you're free to configure your own systems and chroots to use btree instead of hash while the luck with compatibility lasts.

Comment 5 Jeff Johnson 2009-09-17 12:41:42 UTC
Lucky? Not using DB_HASH because it has a known incompatibility is "lucky"?

Sure, all the formats are version'ed, and can change whenever necessary.
That's also true for RPM: surely you should have changed the version format
when you decided to use SHA256 rather than MD5 in *.rpm packages.
But perhaps you just got "lucky".

I pointed out that there is a one-time cost in converting. Well duh.

I also pointed out another nicety that is "optional".

But go ahead, cite me to claim WONTFIX for a known to work change
that avoids a luser incompatibility.

Have fun!

Comment 6 Jan Kratochvil 2009-09-17 13:21:07 UTC
(In reply to comment #4)
> WONTFIX - 

> rpm might switch to btree by default for other reasons (such as
> potentially better performance)

I expect hash was chosen because rpm does not need to traverse the entries in sorted order.  In that case btree is slower (O(log(n))) than hash (O(1)).
It is just the current luck of better compatibility that may be worth the change (while the performance degradation may not be measurable).


> while a future rpm version might be able to deal with on-the-fly btree/ht
> detection/conversion, there's little chance that such code would end up in
> existing RHEL and even less chance for EOL Fedora version.

This is an invalid argument.  The current (F12) db4 btree is still compatible with the existing epel-4 btree format.  I filed this Bug for Rawhide, not for F9 or RHEL4.  An rpm change for F13 was the intended target of this Bug, which would start easing epel-4 maintenance within a few months.


> Of course you're free to configure your own systems and chroots to use btree
> instead of hash while the luck with compatibility lasts.  

I already work around rpm4 with a regular --rebuilddb (Bug 464752) and an occasional db*_{dump,load} (this Bug).  Suggesting workarounds is not the purpose of assigning a bug to a package maintainer.
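
One possible shape of such a dump/load round trip, as a sketch only: it assumes the Berkeley DB utilities are installed as db_dump and db_load (distributions also ship versioned names, which can be passed through the hypothetical DB_DUMP and DB_LOAD variables) and it uses db_load's -t option to pick btree as the target access method. The function name reload_packages is made up, and the exact commands the reporter runs are not given in this report.

```shell
# reload_packages SRC DST: dump SRC and load the dump into a new btree file DST.
# Assumptions: db_dump/db_load are on PATH, or DB_DUMP/DB_LOAD name the
# versioned tools to use; DST must not exist yet.
reload_packages() {
    src=$1
    dst=$2
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    dump=$(mktemp) || return 1
    # Dump to a temporary file first so a failed dump never reaches db_load.
    if ${DB_DUMP:-db_dump} "$src" > "$dump" &&
       ${DB_LOAD:-db_load} -f "$dump" -t btree "$dst"; then
        status=0
    else
        status=1
    fi
    rm -f "$dump"
    return $status
}
```

The new file would then still have to be moved into place as Packages and the secondary indices regenerated with rpm --rebuilddb, as in the description above.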

Comment 7 Jeff Johnson 2009-09-17 14:11:11 UTC
Re comment #6:

Actually, the reason for DB_HASH is hysterical, not O(1) performance.
db-1.85 did not have a btree implementation a decade ago.

Citing O(1) or O(log(n)), while true, misses real world issues. E.g.
RPM must look up file paths in two indices because the data is
not rationally indexed, and a string beginning with '/' might be in
either the Providename or the Basenames table. RPM is often forced to do sequential
access (true for rpm -qa e.g.), and does too many redundant
accesses.

The above issues largely obliterate any performance benefit from using
DB_HASH or DB_BTREE.

It's rather easy to do the benchmarks, just add --stats to any RPM command
and compare using DB_BTREE and DB_HASH. I did due diligence when
I switched from DB_HASH to DB_BTREE @rpm5.org and there was no
measurable performance gain from using either DB_BTREE or DB_HASH.
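
A rough sketch of such a comparison, assuming two copies of the same rpmdb, one rebuilt with DB_HASH and one with DB_BTREE, stored in separate directories and selected with --dbpath. The function name compare_stats, the RPM override and the example paths are made up for illustration; the query is rpm -qa because that is the sequential-access case mentioned above.

```shell
# compare_stats DBPATH...: run the same query with --stats against each rpmdb.
# Assumption: each DBPATH holds a copy of the same database in a different format.
# RPM may be overridden, e.g. RPM=echo to only print the commands.
compare_stats() {
    for dbpath in "$@"; do
        echo "== $dbpath"
        ${RPM:-rpm} --dbpath "$dbpath" -qa --stats
    done
}
```

Usage would be along the lines of `compare_stats /tmp/rpmdb-hash /tmp/rpmdb-btree`, repeated a few times so that cache effects even out.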

There are other performance gains from improved access on certain
paths. E.g. rpm-5.2.0 @rpm5.org has a measured (with callgrind and --stats)
14.6x speed-up by changing perhaps 50 lines of code on path lookups.

But clearly I got "lucky" and just guessed which lines of code to change.