Bug 106459

Summary: RPM database corrupted, unable to recover
Product: [Retired] Red Hat Linux
Component: rpm
Version: 9
Hardware: athlon
OS: Linux
Status: CLOSED WORKSFORME
Severity: high
Priority: medium
Reporter: Andre Costa <acosta>
Assignee: Jeff Johnson <jbj>
QA Contact: Mike McLean <mikem>
CC: bugs, redhat
Doc Type: Bug Fix
Last Closed: 2003-12-27 21:22:34 UTC
Attachments (description: flags):
    apt-get error: none
    output from rpm --rebuilddb -vv: none
    rpm -qa output after database crash: none
    output of rpm -qavv: none

Description Andre Costa 2003-10-07 12:23:49 UTC

Description of problem:
apt-get crashed during routine upgrade (I do it on a daily basis), with these error messages:

[snip]
rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from dbcursor->c_put: DB_RUNRECOVERY: Fatal error, run database recovery
rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from db->sync: DB_RUNRECOVERY: Fatal error, run database recovery
rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from dbcursor->c_close: DB_RUNRECOVERY: Fatal error, run database recovery
rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from db->sync: DB_RUNRECOVERY: Fatal error, run database recovery
rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from db->cursor: DB_RUNRECOVERY: Fatal error, run database recovery
rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from db->get: DB_RUNRECOVERY: Fatal error, run database recovery
[snip]

(full output attached)

There are no stale locks (__db*) on /var/lib/rpm. I tried rebuilding the database with rpm 
--rebuilddb but it segfaults. I tried doing:

    cd /var/lib/rpm
    mv Packages Packages-ORIG
    /usr/lib/rpm/rpmdb_dump Packages-ORIG | /usr/lib/rpm/rpmdb_load Packages
    rpm --rebuilddb -vv

But it still segfaults, always at the same place:

[snip]
D: adding "openssl096b" to Name index.
D: adding 14 entries to Basenames index.
D: adding "System Environment/Libraries" to Group index.
D: adding 15 entries to Requirename index.
D: adding 3 entries to Providename index.
D: adding "openssl" to Conflictname index.
D: adding 3 entries to Dirnames index.
D: adding 15 entries to Requireversion index.
D: adding 3 entries to Provideversion index.
D: adding 1 entries to Installtid index.
D: adding 1 entries to Sigmd5 index.
D: adding "fd18c27c0377beec5d4bc71522ed0e20e7a73973" to Sha1header index.
D: adding 14 entries to Filemd5s index.
Segmentation fault

(full output attached)
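
To see which package was the last one fully indexed before the crash, the debug output can be filtered for the Name index lines. This is a minimal sketch that assumes the "D: adding ... to Name index." format shown above; the function name last_indexed is hypothetical. The segfault presumably happens at or just after the package it prints.

```shell
#!/bin/sh
# Sketch: read "rpm --rebuilddb -vv" output on stdin and print the
# name of the last package added to the Name index.
# last_indexed is a hypothetical helper name.
last_indexed() {
    sed -n 's/^D: adding "\(.*\)" to Name index\.$/\1/p' | tail -n 1
}

# Usage (assumes rpm is installed):
#   rpm --rebuilddb -vv 2>&1 | last_indexed
```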

RPM database seems to be ruined, and I am unable to recover from this. rpm -qa produces a lot 
of these:

error: rpmdbNextIterator: skipping h#    1694 Header V3 DSA signature: BAD, key ID e42d547b

and ends up with:

[snip]
openssl096b-0.9.6b-12
Segmentation fault

Any help will be much appreciated.

TIA

Andre

Version-Release number of selected component (if applicable):
rpm-4.2-0.69

How reproducible:
Always

Steps to Reproduce:
1. rpm -qa

Actual Results:
[snip]
perl-DBD-MySQL-2.1021-3
error: rpmdbNextIterator: skipping h#    1694 Header V3 DSA signature: BAD, key ID e42d547b
gkrellm-2.1.19-1.fr
openssh-clients-3.5p1-11
perl-CPAN-1.61-88.3
openssl096b-0.9.6b-12
Segmentation fault

Expected Results:  (full package listing, no error messages, no segfault)

Additional info:

Using non-RH kernel (2.4.22 compiled from tarball). Many RPMs installed are not from RH 
(Freshrpms, yum or built from SRPMs).

Comment 1 Andre Costa 2003-10-07 12:26:33 UTC
Created attachment 94979 [details]
apt-get error

This output is not the original one; it comes from an attempt made after the
database was already compromised.

Comment 2 Andre Costa 2003-10-07 12:27:33 UTC
Created attachment 94980 [details]
output from rpm --rebuilddb -vv

Comment 3 Andre Costa 2003-10-07 12:29:38 UTC
Created attachment 94981 [details]
rpm -qa output after database crash

Comment 4 Andre Costa 2003-10-07 12:38:26 UTC
The database itself doesn't seem to be corrupted:

/usr/lib/rpm/rpmdb_stat -d Packages
61561   Hash magic number.
8       Hash version number.
Flags:
4096    Underlying database page size.
0       Specified fill factor.
654     Number of keys in the database.
1       Number of data items in the database.
4       Number of hash buckets.
2553    Number of bytes free on bucket pages (84% ff).
5099    Number of overflow pages.
1359266 Number of bytes free in overflow pages (93% ff).
0       Number of bucket overflow pages.
0       Number of bytes free in bucket overflow pages (0% ff).
0       Number of duplicate pages.
0       Number of bytes free in duplicate pages (0% ff).
0       Number of pages on the free list.

'rpmdb_verify Packages' exits with 0 status.
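
The same check can be run over every database file rather than Packages alone. The following is only a sketch: the function name verify_rpmdb and the VERIFY override variable are illustrative assumptions, and only the /usr/lib/rpm/rpmdb_verify path comes from this report.

```shell
#!/bin/sh
# Sketch: run rpmdb_verify on each database file in a directory and
# report OK/BAD per file. VERIFY and verify_rpmdb are hypothetical names.
VERIFY=${VERIFY:-/usr/lib/rpm/rpmdb_verify}

verify_rpmdb() {
    dir=${1:-/var/lib/rpm}
    status=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case ${f##*/} in
            __db*) continue ;;   # environment region files, not databases
        esac
        if "$VERIFY" "$f" >/dev/null 2>&1; then
            echo "OK   ${f##*/}"
        else
            echo "BAD  ${f##*/}"
            status=1
        fi
    done
    return $status
}
```

Calling verify_rpmdb with no argument checks /var/lib/rpm; a non-zero return means at least one file failed verification.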

Could it be a rpm error instead of a db4 one?

Comment 5 Jeff Johnson 2003-10-07 16:59:58 UTC
What does rpm -qavv say? I need the header instance displayed,
look also for BAD. You do have all the keys imported, don't you?

Comment 6 Andre Costa 2003-10-07 17:09:45 UTC
Hi Jeff, thanks for lending a hand. I am attaching the output of rpm -qavv as requested. Regarding 
keys, probably not: I don't recall ever doing this manually (i.e. if apt-get or yum didn't do it, then 
they're probably missing). Could this cause a segfault? How do I import the needed keys? (Feel 
free to direct me to any existing info/howto about this.)

Best,

Andre

Comment 7 Andre Costa 2003-10-07 17:10:25 UTC
Created attachment 94991 [details]
output of rpm -qavv

Comment 8 Andre Costa 2003-10-07 20:25:08 UTC
I was "brave" (stupid) enough to try 'apt-get update/upgrade' on my office 
computer, and the same problem happened (similar configuration: RH9, kernel 
2.4.22 compiled from tarball). Maybe the latest packages on freshrpms.net could 
be a starting point for investigation. The latest packages in my 
/var/cache/apt/archives are:

nmap-frontend_2%3a3.48-1.fr_i386.rpm
nmap_2%3a3.48-1.fr_i386.rpm
mplayer_1.0-0.2.pre2.fr_i386.rpm
libpostproc_1.0-0.2.pre2.fr_i386.rpm
id3lib_3.8.3-4.fr_i386.rpm
alsa-driver_0.9.7-1.fr_i386.rpm
alsa-utils_0.9.7-1.fr_i386.rpm
alsa-lib_0.9.7-1.fr_i386.rpm
yum_2.0.3-5.rh.fr_i386.rpm
rpm-devel_4.2-0.69_i386.rpm
mozilla_35%3a1.2.1-26_i386.rpm

Hope this helps understand what happened.

Andre


Comment 9 Andre Costa 2003-10-08 15:07:09 UTC
Problem is definitely related to signature handling by RPM. If I run 'rpm -qa --nosignature | wc -l' I 
get 653 packages, and no error message whatsoever. On the other hand, if I run 'rpm -qa 2>&1 | 
grep -v "^error" | wc -l', I get only 164 packages and the dreadful segfault (output from last 
command reports 16 error messages).

Actually, it seems all commands run fine with --nosignature:

~ rpm -q mplayer
error: rpmdbNextIterator: skipping h#    1694 Header V3 DSA signature: BAD, key ID e42d547b
Segmentation fault
~ rpm -q --nosignature mplayer
mplayer-1.0-0.2.pre2.fr

Questions:

1. aside from potential malicious packages, it is safe to use --nosignature, right?
2. assuming [1] is ok, how do I put it on my ~/.rpmmacros (or maybe on /usr/lib/rpm/macros, 
since root RPM operations will fail as well)?

TIA

Andre

Comment 10 Jeff Johnson 2003-10-09 14:53:54 UTC
OK, rpm -qavv shows every header signed with key id e42d547b
not verifying.

Try removing that key:
    rpm -evv gpg-pubkey-e42d547b
and repeat rpm -qavv.

If you got that key from a key server, then you will need
to import the key using gpg, and delete the other signatures.

Does that help?

Comment 11 Andre Costa 2003-10-09 17:38:14 UTC
Here at work (not the same computer I took original reports from, but still one 
suffering from the same problems), I have this:

rpm -qavv 2>&1 | grep BAD
error: rpmdbNextIterator: skipping h#     729 Header V3 DSA signature: BAD, key ID e42d547b
error: rpmdbNextIterator: skipping h#      92 Header V3 DSA signature: BAD, key ID 00000000
error: rpmdbNextIterator: skipping h#     729 Header V3 DSA signature: BAD, key ID e42d547b

Please notice that the above output is truncated due to the segfault error:

rpm -qavv 2>&1 | wc -l
     69
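
To summarise which headers and keys are affected, the error lines can be reduced to unique (header instance, key ID) pairs. This is a minimal sketch, assuming the "skipping h# ... signature: BAD, key ID ..." message format seen in this report; the function name bad_headers is hypothetical.

```shell
#!/bin/sh
# Sketch: read rpm output on stdin and print unique
# "header-instance key-id" pairs for signatures reported as BAD.
# bad_headers is a hypothetical helper name.
bad_headers() {
    sed -n 's/^error: rpmdbNextIterator: skipping h# *\([0-9][0-9]*\) .*signature: BAD, key ID \([0-9a-f]*\).*$/\1 \2/p' |
        sort -u
}

# Usage (assumes rpm is installed):
#   rpm -qavv 2>&1 | bad_headers
```

Because the output stops at the segfault, the list only covers headers read before the crash.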

According to RPM database, signatures installed are:

rpm -qa --nosignature | grep pubkey
gpg-pubkey-c431416d-3db4c821
gpg-pubkey-e42d547b-3960bdf1

However, they don't really seem to be installed:

rpm -q --nosignature gpg-pubkey-e42d547b-3960bdf1
package gpg-pubkey-e42d547b-3960bdf1 is not installed
rpm -q --nosignature gpg-pubkey-c431416d-3db4c821
package gpg-pubkey-c431416d-3db4c821 is not installed
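
The mismatch between the two queries can be checked for every listed key in one pass. This is a sketch only: it relies on rpm -q returning a non-zero status for a name it cannot find, and the function name ghost_pubkeys is hypothetical.

```shell
#!/bin/sh
# Sketch: report gpg-pubkey entries that "rpm -qa" lists but
# "rpm -q" cannot find by name. ghost_pubkeys is a hypothetical name.
ghost_pubkeys() {
    rpm -qa --nosignature | grep '^gpg-pubkey-' | sort -u |
    while read -r key; do
        # rpm -q exits non-zero when the named package is not found.
        if ! rpm -q --nosignature "$key" >/dev/null 2>&1 </dev/null; then
            echo "ghost: $key"
        fi
    done
}
```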

And this makes it impossible for me to remove them:

rpm -evv gpg-pubkey-e42d547b-3960bdf1
D: unshared posix mutexes found(38), adding DB_PRIVATE, using fcntl lock
D: opening  db environment /var/lib/rpm/Packages create:cdb:mpool:private
D: opening  db index       /var/lib/rpm/Packages rdonly mode=0x0
D: locked   db index       /var/lib/rpm/Packages
D: opening  db index       /var/lib/rpm/Name rdonly mode=0x0
error: package gpg-pubkey-e42d547b-3960bdf1 is not installed
D: closed   db index       /var/lib/rpm/Name
D: closed   db index       /var/lib/rpm/Packages
D: closed   db environment /var/lib/rpm/Packages

rpm -evv gpg-pubkey-c431416d-3db4c821
D: unshared posix mutexes found(38), adding DB_PRIVATE, using fcntl lock
D: opening  db environment /var/lib/rpm/Packages create:cdb:mpool:private
D: opening  db index       /var/lib/rpm/Packages rdonly mode=0x0
D: locked   db index       /var/lib/rpm/Packages
D: opening  db index       /var/lib/rpm/Name rdonly mode=0x0
error: package gpg-pubkey-c431416d-3db4c821 is not installed
D: closed   db index       /var/lib/rpm/Name
D: closed   db index       /var/lib/rpm/Packages
D: closed   db environment /var/lib/rpm/Packages

Seems like we're narrowing down the problem... let me know if this helps you 
understand it. I will repeat the procedures above on my home computer later 
today; if the outcome is any different, I will post the results here.

Comment 12 Jeff Johnson 2003-10-10 13:10:32 UTC
The problem is narrowed down to key e42d547b.

Does removing the key "fix" or not?

Does reimporting the key reproduce the segfault?

Comment 13 Andre Costa 2003-10-10 14:10:56 UTC
(trying all operations below on my home computer)

AFAICS this key is from Freshrpms.net (which is consistent with my idea that it was an 
'apt-get upgrade' that triggered the problem):

(file RPM-GPG-KEY.txt below obtained from http://freshrpms.net/packages/RPM-GPG-KEY.txt)

rpm --nosignature --import RPM-GPG-KEY.txt
rpm -qi --nosignature gpg-pubkey-e42d547b-3960bdf1
Name        : gpg-pubkey                   Relocations: (not relocateable)
Version     : e42d547b                          Vendor: (none)
Release     : 3960bdf1                      Build Date: Fri 10 Oct 2003 10:34:43 AM BRT
Install Date: Fri 10 Oct 2003 10:34:43 AM BRT      Build Host: localhost
Group       : Public Keys                   Source RPM: (none)
Size        : 0                                License: pubkey
Signature   : (none)
Summary     : gpg(Matthias Saou (Thias) <matthias.saou.marmotte.net>)
Description :
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: rpm-4.2 (beecrypt-2.2.0)
[snip]
-----END PGP PUBLIC KEY BLOCK-----

I have reimported (apparently successfully) all keys I had here at my home computer:

rpm -qa --nosignature | grep pubkey
gpg-pubkey-db42a60e-37ea5438
gpg-pubkey-e42d547b-3960bdf1
gpg-pubkey-c431416d-3db4c821

They all react properly to rpm -qi --nosignature queries. Keys were imported from:

gpg-pubkey-db42a60e-37ea5438 --> Red Hat, Inc <security>
gpg-pubkey-e42d547b-3960bdf1 --> Matthias Saou (Thias) <matthias.saou.marmotte.net>
gpg-pubkey-c431416d-3db4c821 --> JPackage Project (JPP Official Keys) <jpackage@zarb.org>

Reimporting the keys (with 'rpm --nosignature --import') did not fix the problem:

rpm -qavv
D: unshared posix mutexes found(38), adding DB_PRIVATE, using fcntl lock
D: opening  db environment /var/lib/rpm/Packages create:cdb:mpool:private
D: opening  db index       /var/lib/rpm/Packages rdonly mode=0x0
D: locked   db index       /var/lib/rpm/Packages
D: opening  db index       /var/lib/rpm/Pubkeys rdonly mode=0x0
error: rpmdbNextIterator: skipping h#    1694 Header V3 DSA signature: BAD, key ID e42d547b
D:  read h#    1709 Header sanity check: OK
D: ========== ??? pubkey id 0000000000000000
Segmentation fault

(this is the actual whole output)

I then tried rebuilding the DB with all keys (other than RH's) removed. It didn't work:

rpm -qa --nosignature | grep pubkey
gpg-pubkey-db42a60e-37ea5438
rpm --nosignature --rebuilddb -vv
[snip]
D:  read h#    1686 Header V3 DSA signature: OK, key ID db42a60e
D:   +++ h#     163 Header V3 DSA signature: OK, key ID db42a60e
D: adding "openssl096b" to Name index.
D: adding 14 entries to Basenames index.
D: adding "System Environment/Libraries" to Group index.
D: adding 15 entries to Requirename index.
D: adding 3 entries to Providename index.
D: adding "openssl" to Conflictname index.
D: adding 3 entries to Dirnames index.
D: adding 15 entries to Requireversion index.
D: adding 3 entries to Provideversion index.
D: adding 1 entries to Installtid index.
D: adding 1 entries to Sigmd5 index.
D: adding "fd18c27c0377beec5d4bc71522ed0e20e7a73973" to Sha1header index.
D: adding 14 entries to Filemd5s index.
Segmentation fault

(let me know if you need the full output)

FYI: this still appears a lot on rpm --rebuilddb -vv output:

error: rpmdbNextIterator: skipping h#    1694 Header V3 DSA signature: BAD, key ID e42d547b

What's the deal here? Since it is not installed anymore, why does key e42d547b still appear on 
the rebuilddb? Did I miss anything?

Jeff, would it help if I sent you a copy of my /var/lib/rpm files?

Comment 14 Andre Costa 2003-10-10 16:38:32 UTC
More interesting data: I am now on my office computer, and I seem to have 
'ghost' pubkeys installed:

rpm -qa --nosignature | grep pubkey
gpg-pubkey-c431416d-3db4c821
gpg-pubkey-c431416d-3db4c821
gpg-pubkey-e42d547b-3960bdf1
gpg-pubkey-e42d547b-3960bdf1

(yeah, both appear twice. This is because I have just tried to re-import them 
on top of the 'ghost' ones)

Trying to remove them with --allmatches doesn't do any good:

rpm -ev --nosignature --allmatches gpg-pubkey-c431416d-3db4c821
rpm -ev --nosignature --allmatches gpg-pubkey-e42d547b-3960bdf1
rpm -qa --nosignature | grep pubkey
gpg-pubkey-c431416d-3db4c821
gpg-pubkey-e42d547b-3960bdf1

rpm -q --nosignature gpg-pubkey-c431416d-3db4c821 gpg-pubkey-e42d547b-3960bdf1
package gpg-pubkey-c431416d-3db4c821 is not installed
package gpg-pubkey-e42d547b-3960bdf1 is not installed

I can't seem to get rid of these two keys (c431416d is for JPackage and e42d547b 
is for Freshrpms) -- actually I don't even know if they are installed or not. 
Any ideas?

TIA

Comment 15 Richard 2003-10-29 10:57:53 UTC
I have this exact same problem..

I have managed to fix it by finding the bad package..

rpm -qa --nosignature > /tmp/list
rpm -qa > /tmp/list2
diff /tmp/list /tmp/list2 | more

The first item in the diff output is the bad package..

If I try rpm --erase <pkg>, I get a segmentation fault.
If I try rpm --erase --nosignature <pkg>, it erases it OK
(my package was glut-3.7-12  btw)
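
The comparison of the two listings can also be sketched as a small helper. Note that this variant uses grep rather than diff, so it does not depend on both listings being in the same order; the function name missing_packages is hypothetical, and every name it prints is only a candidate, not a confirmed bad package.

```shell
#!/bin/sh
# Sketch: print packages present in the --nosignature listing but
# absent from the plain listing, in the order of the first file.
# missing_packages is a hypothetical helper name.
missing_packages() {
    full=$1      # e.g. saved output of: rpm -qa --nosignature
    partial=$2   # e.g. saved output of: rpm -qa (may be cut short by the segfault)
    # -F fixed strings, -x whole-line match, -v keep lines with no match
    grep -Fxv -f "$partial" "$full"
}

# Usage with the files from the steps above:
#   missing_packages /tmp/list /tmp/list2 | head -n 1
```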

I can now run rpm --rebuilddb without a seg fault

During the rebuild I had some errors:
error: rpmdbNextIterator: skipping h#     168 blob size(3768): BAD, 8 + 16 * il(17) + dl(3456)
and a couple of others for h# 778

but once the rebuilddb completed, nothing has any errors anymore


Comment 16 Andre Costa 2003-10-31 23:42:30 UTC
Hi Richard,

thanks for the tip, will try that right away.

... but, before that, I must say I did it again: even using 
--nosignature did not keep me from screwing up my 
database. I got tons of these while trying to upgrade some new 
packages I downloaded from Freshrpms (yes, I insist on it ;)) :

rpmdb: fatal region error detected; run recovery
error: db4 error(-30982) from dbenv->close: DB_RUNRECOVERY: Fatal 
error, run database recovery

This time I ran rpmdb_verify on the corrupted database, and got this:

/usr/lib/rpm/rpmdb_verify Packages
db_verify: Page 3565: page 2960 encountered a second time on free list
db_verify: DB->verify: Packages: DB_VERIFY_BAD: Database verification 
failed

Hope this helps someone understand what went wrong. I made a backup 
of the corrupted database in case someone wants to take a look at it.

Best,

Andre

PS: the way I upgrade my box with the broken database is: I download 
packages with 'apt-get -d upgrade', and then install them with 'rpm 
-Uvh --nosignature /var/cache/apt/archives/*.rpm'

Comment 17 Holger Ronecker 2003-11-18 12:36:29 UTC
Same problem here, but with other keys. I followed everything in this 
bug report and got the same errors along the way (rpm --rebuilddb 
didn't work, packages OK, problem with the keys, etc.)

Then I got to the "Additional Comment #15" from Richard and his 
suggestion fixed the problem for me.

The Package that made trouble here was apt

regards, Holger

Comment 18 Jeff Johnson 2003-12-27 20:57:06 UTC
One bug per report please.

Andre: This should fix the problem seen by rpmdb_verify
    cd /var/lib/rpm
    mv Packages Packages-ORIG
    /usr/lib/rpm/rpmdb_dump Packages-ORIG | \
        /usr/lib/rpm/rpmdb_load Packages
    /usr/lib/rpm/rpmdb_verify Packages
    rpm --rebuilddb -vv
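
The procedure above could be wrapped in a function that stops at the first failing step. This is only a sketch under stated assumptions: recover_packages and the RPMDB_* override variables are hypothetical names, and the backup copy of the whole directory is an addition that is not part of the quoted procedure.

```shell
#!/bin/sh
# Sketch: the dump/load recovery above, with a backup taken first.
# recover_packages and the RPMDB_* variables are hypothetical names;
# the backup step is an addition, not part of the quoted procedure.
RPMDB_DUMP=${RPMDB_DUMP:-/usr/lib/rpm/rpmdb_dump}
RPMDB_LOAD=${RPMDB_LOAD:-/usr/lib/rpm/rpmdb_load}
RPMDB_VERIFY=${RPMDB_VERIFY:-/usr/lib/rpm/rpmdb_verify}

# The body runs in a subshell so the caller's working directory is kept.
recover_packages() (
    dbdir=${1:-/var/lib/rpm}   # must be an absolute path
    # Keep an untouched copy of the whole directory before changing anything.
    cp -a "$dbdir" "$dbdir.bak.$$" || exit 1
    cd "$dbdir" || exit 1
    mv Packages Packages-ORIG || exit 1
    # Note: the pipeline status reflects rpmdb_load only.
    "$RPMDB_DUMP" Packages-ORIG | "$RPMDB_LOAD" Packages || exit 1
    "$RPMDB_VERIFY" Packages || exit 1
    rpm --dbpath "$dbdir" --rebuilddb -vv
)
```

Run as root with no argument, it operates on /var/lib/rpm; the original files stay available in the .bak copy and in Packages-ORIG if any step fails.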

Comment 19 Andre Costa 2003-12-27 21:08:08 UTC
Hi Jeff,

I have upgraded to FC1, so I cannot reproduce the problem anymore (not that I would 
like to ;)). So far, I haven't had any similar problem with FC1, hope it stays this way =)

Thanks for still working on this issue anyway. I have saved your last hint in case I run into 
this issue again.

Happy new year,

Andre

Comment 20 Jeff Johnson 2003-12-27 21:22:34 UTC
OK, thanks. You might want to upgrade to rpm-4.2.2-0.6
from fc2. There's a dangling ptr fixed there.

Happy New Year!