Bug 169145 - rpm - a nasty database corruption incident
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm
Version: 4
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Paul Nasrat
QA Contact: Mike McLean
Reported: 2005-09-23 17:05 UTC by Michal Jaegermann
Modified: 2007-11-30 22:11 UTC (History)
Last Closed: 2005-11-28 17:52:02 UTC

Description Michal Jaegermann 2005-09-23 17:05:33 UTC
Description of problem:

While attempting to install the latest kernel from updates, rpm refused
to cooperate and printed the following error messages:

rpmdb: PANIC: fatal region error detected; run recovery
error: db4 error(-30977) from dbenv->open: DB_RUNRECOVERY: Fatal error, run
database recovery
error: cannot open Packages index using db3 -  (-30977)
error: cannot open Packages database in /var/lib/rpm

The first attempt at 'rpm --rebuilddb' resulted in the same error as above.
The second attempt destroyed the whole database, leaving only two random
packages in it.  In retrospect, "run recovery" probably meant something else,
although it is not clear what.  'db_dump' followed by 'db_load'?

Luckily a pretty recent backup was available and after restoring /var/lib/rpm/
the whole system was brought back into a consistent state.
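
For anyone hitting the same message, the "back up first, then touch the
database" approach that saved this system can be sketched roughly as below.
This is only an illustration, not something that was run as part of this
report.  It assumes that the "fatal region error" refers to the Berkeley DB
environment region files named __db.* in the rpm database directory, and that
removing those (and only those) before a rebuild is safe; that is a commonly
suggested workaround, not a fix confirmed by anyone in this bug.  The function
names and the /var/tmp backup location are made up for the example.

```python
import glob
import os
import shutil
import subprocess
import time


def backup_rpmdb(dbdir, backup_root):
    """Copy the whole rpm database directory aside before touching it."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = os.path.join(backup_root, "rpmdb-" + stamp)
    shutil.copytree(dbdir, dest)
    return dest


def remove_region_files(dbdir):
    """Delete only the __db.* files (assumed to be Berkeley DB region files).

    Packages and the index files are left in place.
    """
    removed = []
    for path in sorted(glob.glob(os.path.join(dbdir, "__db.*"))):
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed


def recover(dbdir="/var/lib/rpm", backup_root="/var/tmp", rebuild=False):
    """Back up dbdir, clear region files, and optionally rebuild the indexes."""
    dest = backup_rpmdb(dbdir, backup_root)
    removed = remove_region_files(dbdir)
    if rebuild:
        # Needs root and a real rpm installation; raises if rpm fails.
        subprocess.run(["rpm", "--rebuilddb", "--dbpath", dbdir], check=True)
    return dest, removed
```

Calling recover() with rebuild=False on a scratch copy of /var/lib/rpm is a
harmless way to see which files would be removed.  The copy made by
backup_rpmdb() is what allows going back if a rebuild makes things worse, as
the second 'rpm --rebuilddb' did here.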

Version-Release number of selected component (if applicable):
rpm-4.4.1-22

How reproducible:
Hopefully not often.

Comment 1 Paul Nasrat 2005-10-25 20:44:26 UTC
What was the initial system state? What packages were in the upgrade transaction?

Comment 2 Michal Jaegermann 2005-10-25 22:53:06 UTC
> What was the initial system state?

You mean what was installed?  Everything was updated to the level currently
available at that time.  My logs show that on Sept-22, before I got
hit by that error, the following packages were installed:

man-pages.noarch 1.67-8
xorg-x11-Mesa-libGL.i386 6.8.2-37.FC4.49.2
xorg-x11-Mesa-libGLU.i386 6.8.2-37.FC4.49.2
xorg-x11-deprecated-libs-devel.i386 6.8.2-37.FC4.49.2
xorg-x11-deprecated-libs.i386 6.8.2-37.FC4.49.2
xorg-x11-devel.i386 6.8.2-37.FC4.49.2
xorg-x11-font-utils.i386 6.8.2-37.FC4.49.2
xorg-x11-libs.i386 6.8.2-37.FC4.49.2
xorg-x11-tools.i386 6.8.2-37.FC4.49.2
xorg-x11-twm.i386 6.8.2-37.FC4.49.2
xorg-x11-xauth.i386 6.8.2-37.FC4.49.2
xorg-x11-xdm.i386 6.8.2-37.FC4.49.2
xorg-x11-xfs.i386 6.8.2-37.FC4.49.2
xorg-x11.i386 6.8.2-37.FC4.49.2

> What packages were in the upgrade transaction?

Again, from what I see in my logs, the next day these were:

shadow-utils.i386 2:4.0.12-4.FC4
kernel.i686 2.6.12-1.1456_FC4

and the problem struck during a new kernel installation.

After that I restored the rpm databases from a backup, resynchronized them
with the real state of the system and so far, knock on wood, everything
in that area works fine.  The 'shadow-utils' package has in the meantime been
replaced with 4.0.12-5.FC4 and the kernel is also not the same
(2.6.13-1.1532_FC4 at this moment).

Both rpm, 4.4.1-22, and db4, 4.3.27-3, are still the same as at the time
of that trouble.


Comment 3 Jeff Johnson 2005-11-14 03:03:18 UTC
This problem is unlikely to be reproducible. A fix is exactly as likely as a reproducer.

Comment 4 Michal Jaegermann 2005-11-14 07:25:52 UTC
> This problem is unlikely to be reproducible.
I think so too.  The main reason for this report was to find out whether
somebody has seen something similar.  I would actually suspect some rare
gotcha in the underlying database.

Comment 5 Paul Nasrat 2005-11-28 17:52:02 UTC
Closing out.  If you get a reproducible case please reopen.

