Bug 553998 - Many rpm related commands fail due to db4 mismatch
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm
Version: 13
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Panu Matilainen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 555315
Depends On:
Blocks:
 
Reported: 2010-01-09 22:00 UTC by Horst H. von Brand
Modified: 2016-01-04 10:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-06-27 14:46:17 UTC
Type: ---
Embargoed:



Description Horst H. von Brand 2010-01-09 22:00:38 UTC
Description of problem:
After updating db4 today rpm commands fail

Version-Release number of selected component (if applicable):
rpm-4.8.0-2.fc13.x86_64
db4-4.8.26-1.fc13.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Update db4 to the above
2. rpm -ql rpm
  
Actual results:
rpmdb: Build signature doesn't match environment
error: db3 error(-30971) from dbenv->open: DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db3 -  (-30971)
error: cannot open Packages database in /var/lib/rpm
rpmdb: Build signature doesn't match environment
error: db3 error(-30971) from dbenv->open: DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages database in /var/lib/rpm
package rpm is not installed

Expected results:


Additional info:
Fixed by:

cd /var/lib/rpm
db_recover

Comment 1 Pete Zaitcev 2010-01-11 02:10:04 UTC
Horst is our saviour. However, when I ran db_recover for the first time,
the following happened:

[root@niphredil zaitcev]# cd /var/lib/rpm
[root@niphredil rpm]# db_recover
db_recover: Build signature doesn't match environment
[root@niphredil rpm]# 

But then I re-ran db_recover again and it recovered ok (no output,
rpm works now).

Comment 2 Jeff Johnson 2010-01-13 23:26:42 UTC
Fixing DB_VERSION_MISMATCH is as simple as doing
       rm -f /var/lib/rpm/__db*
when it is encountered. A patch to automate this has been sent like
3 times now and rejected. It's not like checking a very specific return
code like DB_VERSION_MISMATCH and performing the recommended
operation once is gonna cause any issues that are not already present.

But YMMV, everyone's does.

re: comment #1
 db_recover is needed/used iff ACID transaction logging is enabled, which is
not the case with the concurrent access DB_INIT_CDB model used by RPM.
The behavior you are seeing is that (unless -e is given to db_recover)
the 1st execution does
    rm -f /var/lib/rpm/__db*
and exits with a complaint about Build signature. Then the 2nd execution appears to
"work" because there are no transaction logs present. The end result
is exactly the same as if one did
    rm -f /var/lib/rpm/__db*
namely, rpm "works". Helluva way to remove some files, but whatever "works".

Which could be automated. And around and around and around it goes with
no clue needed or desired.
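
A minimal sketch of the workaround described in comment 2, written in Python because that is the language of the application code discussed later in this bug. It is not the rejected rpm patch. The function names `is_version_mismatch` and `remove_env_files` are made up for illustration, and the sketch assumes that no other process has the rpmdb open while the `__db*` files are removed.

```python
import glob
import os


def is_version_mismatch(stderr_text):
    """Return True if rpm's error output mentions DB_VERSION_MISMATCH."""
    return "DB_VERSION_MISMATCH" in stderr_text


def remove_env_files(dbpath="/var/lib/rpm"):
    """Delete the __db* files in dbpath, like `rm -f dbpath/__db*`.

    Returns the sorted list of paths that were removed. Packages and the
    index files are left alone.
    """
    removed = []
    pattern = os.path.join(glob.escape(dbpath), "__db*")
    for path in sorted(glob.glob(pattern)):
        try:
            os.remove(path)
        except FileNotFoundError:
            # Somebody else removed it first; `rm -f` would not complain either.
            continue
        removed.append(path)
    return removed
```

A wrapper could, for example, run `rpm -q rpm`, pass the captured stderr to `is_version_mismatch`, and call `remove_env_files()` once before retrying the command.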

Comment 3 Panu Matilainen 2010-01-14 13:26:12 UTC
*** Bug 555315 has been marked as a duplicate of this bug. ***

Comment 4 Benjamin Ash 2010-02-11 15:29:00 UTC
I can reproduce the exact same issue if I first access the rpm databases with a 64bit application like rpm or yum.  If I then access the same rpm databases with a 32bit library like rpm-python or another 32bit application I get the dreaded DB_VERSION_MISMATCH.

Steps to reproduce:
1) rm -f /var/lib/rpm/__db*
2) access the rpm database with 64bit application or library
3) attempt to access the rpm database with a 32bit application or library
   fails with a DB_VERSION_MISMATCH exception
4) repeat step 1
5) repeat step 3 -> success
6) repeat step 2 -> fail DB_VERSION_MISMATCH

Is this normal behavior?

Thanks,

-ben

Comment 5 Jeff Johnson 2010-02-11 15:57:41 UTC
If you are seeing DB_VERSION_MISMATCH, then you are accessing
/var/lib/rpm with two different versions of Berkeley DB in RPM.

There is no other explanation. 32 <-> 64 has nothing to do with DB_VERSION_MISMATCH.

Meanwhile getting the tools to use a common Berkeley DB version is the best
solution. The workaround is to do
    rm -f /var/lib/rpm/__db*
when DB_VERSION_MISMATCH is encountered. A patch to automate the "rm -f ..."
has been rejected multiple times, but is otherwise quite straightforward.

Comment 6 Benjamin Ash 2010-02-11 16:42:09 UTC
Hi Jeff,

Thanks for the quick reply.

Unfortunately the tools are using the same Berkeley DB versions:

rpm -qf --qf '%{name}-%{version}-%{release}-%{arch}\n' /lib64/libdb-4.3.so /bin/rpm 
db4-4.3.29-10.el5-x86_64
rpm-4.4.2.3-18.el5-x86_64

rpm -qf --qf '%{name}-%{version}-%{release}-%{arch}\n' /lib/libdb-4.3.so /bin/rpm
db4-4.3.29-10.el5-i386
rpm-4.4.2.3-18.el5-i386

The only difference seems to be the architecture.

The following test seems to better demonstrate the issue:
1) create two directories: /tmp/rpm-el5-32bit and /tmp/rpm-el5-64bit
2) copy the rpm databases from the 32bit run to /tmp/rpm-el5-32bit
3) repeat step 2 for the 64bit test.
4) execute: rpm --dbpath /tmp/rpm-el5-64bit -q foo
rpmdb: Program version 4.3 doesn't match environment version
error: db4 error(-30974) from dbenv->open: DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db3 -  (-30974)
error: cannot open Packages database in /tmp/rpm-el5-64bit
package foo is not installed

5) execute: rpm --dbpath /tmp/rpm-el5-32bit -q foo -> no errors:
"package foo is not installed"

further investigation:
6) execute: file /tmp/rpm-el5-64bit/__db.00*
/tmp/rpm-el5-64bit/__db.000: empty
/tmp/rpm-el5-64bit/__db.001: data
/tmp/rpm-el5-64bit/__db.002: data
/tmp/rpm-el5-64bit/__db.003: data

7) rm /tmp/rpm-el5-64bit/__db.00*
8) repeat step 4 -> success
9) file /tmp/rpm-el5-64bit/__db.00*
/tmp/rpm-el5-64bit/__db.001: data
/tmp/rpm-el5-64bit/__db.002: X11 SNF font data, LSB first
/tmp/rpm-el5-64bit/__db.003: X11 SNF font data, LSB first

Comment 7 Jeff Johnson 2010-02-11 17:09:26 UTC
I stand corrected. You're right.

DB_VERSION_MISMATCH was first added in db-4.3.29:
    http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/changelog_4_3_29.html

and platform dependence seems to be an alternative cause
of the return code. I'm more used to the other cause of DB_VERSION_MISMATCH
that is reported every time the Berkeley DB version changes.

Older Berkeley DB returned EINVAL rather than the more specific DB_VERSION_MISMATCH.

So ix86 <-> x86_64 interoperability with an existing dbenv is
likely infeasible.

Which leaves the manual work-around
    rm -f /var/lib/rpm/__db*
when DB_VERSION_MISMATCH is encountered since the
patch to automate the corrective action has been rejected multiple times.

Hmmm, there's likely a better way to identify the actual version

Comment 8 Benjamin Ash 2010-02-11 20:47:53 UTC
(In reply to comment #7)
> I stand corrected. You're right.
> 
> DB_VERSION_MISMATCH was first added in db-4.3.29:
>     http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/changelog_4_3_29.html
> 
> and platform dependence seems to be an alternative cause
> of the return code. I'm more used to the other cause of DB_VERSION_MISMATCH
> that is reported every time the Berkeley DB version changes.
> 
> Older Berkeley DB returned EINVAL rather than the more specific
> DB_VERSION_MISMATCH.
> 
> So ix86 <-> x86_64 interoperability with an existing dbenv is
> likely infeasible.
> 
> Which leaves the manual work-around
>     rm -f /var/lib/rpm/__db*
> when DB_VERSION_MISMATCH is encountered since the
> patch to automate the corrective action has been rejected multiple times.
> 
> Hmmm, there's likely a better way to identify the actual version    

Thanks for the confirmation.  I am going with the following workaround, which is rather ugly but seems to work:

kludgey workaround:
1) mkdir /var/lib/rpm-{i386,x86_64}
2) for arch in i386 x86_64; do (cd /var/lib/rpm-$arch && ln -s ../rpm/[^__]* .); done
3) then in my Python library code set the macro _dbpath to /var/lib/rpm-<arch> if the directory exists.

Perhaps RPM could do the same or something similar:
store the cache files in per-architecture directories:
e.g.
/var/lib/rpm/cache-i386/<cache_files>
/var/lib/rpm/cache-x86_64/<cache_files>

I am not sure how hard this would be though.
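
The three workaround steps above could be sketched in Python roughly as follows. This is only an illustration: `link_arch_view` and `pick_dbpath` are hypothetical names, and unlike the shell loop the sketch creates absolute rather than relative symlinks. It skips files starting with "_" (the `__db*` files) and dotfiles, which is what the `[^__]*` glob does as well.

```python
import os


def link_arch_view(arch, rpm_dir="/var/lib/rpm", base="/var/lib"):
    """Create base/rpm-<arch> holding symlinks to the shared database files.

    Names starting with "_" or "." are skipped, so each architecture ends
    up with its own private set of __db* files.
    """
    view = os.path.join(base, "rpm-" + arch)
    os.makedirs(view, exist_ok=True)
    for name in sorted(os.listdir(rpm_dir)):
        if name.startswith(("_", ".")):
            continue
        link = os.path.join(view, name)
        if not os.path.lexists(link):
            os.symlink(os.path.join(os.path.abspath(rpm_dir), name), link)
    return view


def pick_dbpath(arch, default="/var/lib/rpm", base="/var/lib"):
    """Return base/rpm-<arch> if that directory exists, else the default."""
    candidate = os.path.join(base, "rpm-" + arch)
    return candidate if os.path.isdir(candidate) else default
```

The value returned by `pick_dbpath` would then be assigned to the `_dbpath` macro as in step 3; how that is done depends on the rpm Python bindings in use and is left out here.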

Comment 9 Jeff Johnson 2010-02-11 22:04:47 UTC
What are you trying to do? Two databases, one for 32bit, the other
for 64bit, is perhaps not what you need.

The DB_VERSION_MISMATCH is coming from a dbenv
which is shared between applications that access
an rpmdb.

In this case 32/64 bit cannot share, hence DB_VERSION_MISMATCH.

If all you are doing is reading what is in a rpmdb, try running the access
as non-root, i.e. *without* write access to /var/lib/rpm.

If /var/lib/rpm is unwritable (for the application using an rpmdb)
then the rpmdb is opened without using a dbenv, and without sharing
locks. That is good enough for reading on both 32/64 bit when an rpmdb is not
actively being updated.

There's lots that could be done to improve access. But an rpmdb was
designed for saving what was installed on a single cpu which has
a single architecture, not for simultaneous 32/64 bit shared access.

Comment 10 Benjamin Ash 2010-02-12 14:09:33 UTC
Hi Jeff,

I am only reading from the rpmdb with the 32bit application which needs to be run as root.  If there is a flag to open the rpmdb without a dbenv then I would much prefer to use that.

Thanks,

-ben

Comment 11 Jeff Johnson 2010-02-12 15:10:57 UTC
Even if your 32 bit python application needs to be run
as root, what is needed is to drop privileges while
opening an rpmdb.

That means that you need to create a transaction
in python as early as possible, and undertake some
query that forces a lazy open, with a non-root uid
(technically what is needed is no write permission
to /var/lib/rpm/Packages). Once an rpmdb is opened
without a dbenv, then nothing else in your application
needs to change. An rpmdb is opened only once for
each application, and the rpmdb data structure is cached
persistently in rpmlib for use by all applications.

Note that opening an rpmdb *without* a dbenv also
means that there is no locking. Which means that there
is a small chance that a reader will see an active writer
updating an rpmdb. The risk there is small, and is
certainly no different from how an rpmdb has been opened
for all non-root accesses for years.
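
A rough sketch of the "drop privileges while opening" idea, under several assumptions. `dropped_privileges` and `open_rpmdb_unprivileged` are hypothetical names. The sketch switches both the real and the effective ids, because this thread does not say which of them rpmlib consults when it tests for write access. The query used to force the lazy open (`dbMatch` on the package name `rpm`) is just one possible choice, and the rpm Python bindings are assumed to be installed.

```python
import contextlib
import os


@contextlib.contextmanager
def dropped_privileges(uid, gid):
    """Temporarily switch the real and effective uid/gid, then restore them.

    The saved ids are left unchanged so that a root process can switch
    back afterwards. Supplementary groups are not touched.
    """
    old_uids = os.getresuid()
    old_gids = os.getresgid()
    os.setresgid(gid, gid, old_gids[2])  # group first, while still privileged
    os.setresuid(uid, uid, old_uids[2])
    try:
        yield
    finally:
        os.setresuid(*old_uids)          # uid first, to regain privilege
        os.setresgid(*old_gids)


def open_rpmdb_unprivileged(uid, gid):
    """Hypothetical helper: open the rpmdb while uid/gid cannot write to it."""
    import rpm  # imported lazily; only this function needs the bindings

    ts = rpm.TransactionSet()
    with dropped_privileges(uid, gid):
        # Any query should do; the point is to trigger the lazy open.
        for _header in ts.dbMatch("name", "rpm"):
            break
    return ts
```

An application running as root would call `open_rpmdb_unprivileged` once, early on, with the ids of some account that has no write permission on /var/lib/rpm, and keep using the returned transaction set afterwards.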

Comment 12 Bug Zapper 2010-03-15 13:48:28 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 13 Bug Zapper 2011-06-02 16:57:15 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 14 Bug Zapper 2011-06-27 14:46:17 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 15 Ed 2014-09-24 04:27:53 UTC
I've actually reproduced this in RHEL 5.10

Comment 16 Pavel Šimerda (pavlix) 2016-01-04 10:40:54 UTC
Just encountered the very same issue in Fedora 23 in 2016!

(In reply to Pete Zaitcev from comment #1)
> Horst is our saviour. However, when I ran db_recover for the first time,
> the following happened:
> 
> [root@niphredil zaitcev]# cd /var/lib/rpm
> [root@niphredil rpm]# db_recover
> db_recover: Build signature doesn't match environment
> [root@niphredil rpm]# 
> 
> But then I re-ran db_recover again and it recovered ok (no output,
> rpm works now).

Same behavior. This helped, thanks!

I didn't try the other workaround for obvious reasons.

