211917 – rpmdb: PANIC: fatal region error detected; run recovery

Summary: rpmdb: PANIC: fatal region error detected; run recovery

Keywords:
Status:	CLOSED DUPLICATE of bug 181363
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	anaconda
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Anaconda Maintenance Team
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	FC7Blocker

Reported:	2006-10-23 20:41 UTC by Robert Scheck
Modified:	2007-11-30 22:11 UTC
CC List:	7 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-11-14 11:43:28 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Screenshot (319.14 KB, image/jpeg) 2006-10-23 20:41 UTC, Robert Scheck	no flags
anaconda install.log (760.40 KB, text/plain) 2006-11-02 00:01 UTC, grant petersen	no flags

Description Robert Scheck 2006-10-23 20:41:56 UTC

Description of problem:
During text-based installation of Fedora Core 6, there's an error: rpmdb:
PANIC: fatal region error detected; run recovery. Further details are in the
attached screenshot.

Version-Release number of selected component (if applicable):
anaconda-11.1.1.3-1
rpm-4.4.2-32

How reproducible:
Every time, without doing anything special, but I'm absolutely not able to
figure out why...

Actual results:
rpmdb: PANIC: fatal region error detected; run recovery

Expected results:
Simply a working installation, not more and not less :)

Comment 1 Robert Scheck 2006-10-23 20:41:56 UTC

Created attachment 139161
Screenshot

Comment 2 Paul Nasrat 2006-10-24 08:46:49 UTC

Are there any messages on tty3 and tty4? Can you go into tty2 and grab the logs
under /tmp and /mnt/sysimage?
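
A rough sketch of how the logs could be bundled from the tty2 shell. The helper name collect_logs and the "*.log" pattern are assumptions made for illustration (the request above does not name specific files), and the -T option assumes GNU tar:

```shell
# collect_logs: hypothetical helper, not part of anaconda.
# Usage: collect_logs OUT.tar.gz DIR...
collect_logs() {
    out="$1"; shift
    # List *.log files near the top of each directory; directories that
    # do not exist are skipped silently.
    find "$@" -maxdepth 2 -type f -name '*.log' 2>/dev/null > "$out.list" || true
    # Pack the listed files into one archive that can be attached to the bug.
    tar czf "$out" -T "$out.list"
}
```

For example: collect_logs /tmp/anaconda-logs.tar.gz /tmp /mnt/sysimage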

Comment 3 Robert Scheck 2006-10-27 21:12:00 UTC

If I could reproduce it again now, I would be happy. It looks like the system
runs out of memory (192 MB RAM), but I'm not sure.

Comment 4 Jeff Johnson 2006-10-28 07:31:56 UTC

Running out of resources would explain a lot.

Comment 5 Andrew 2006-10-30 17:42:34 UTC

I had the same problem upgrading FC5 to FC6 on an HP Pavilion n5430 notebook
(AMD Duron 850MHz, 128MB RAM).  The "free" command showed plenty of swap unused,
so I think memory was not the issue.  I don't have the logs any more, but the
upgrade log was 50MB at one point.

Comment 6 grant petersen 2006-11-02 00:01:47 UTC

Created attachment 140057
anaconda install.log

this is the log file from my last install attempt. I could boot this system if
I added selinux=0 to the boot prompt using grub to edit the boot line.

Comment 7 grant petersen 2006-11-02 00:02:47 UTC

I've possibly had this problem both upgrading from FC5 and installing fresh:
text mode and GUI, local CD and HTTP, DHCP and static IP, autopartition LVM
and manual partitioning, and every time it failed. I also tried nolapic etc.
and selinux=0.
The failure varies, but the most common way is for it to halt while installing
a package. Sometimes tty0 shows "db4 error...", sometimes not, but I think it
could be grinding too hard to update tty0 in these cases. The system sometimes
seems to lock up and needs a hard reset (no mouse or keyboard response), but
most often it will respond to ctrl+alt+bksp (in X) or ctrl+alt+del.
Top shows anaconda using 98 to 100% CPU. I left this overnight once and it was
the same in the morning.
 Intel(R) Pentium(R) 4 CPU 2.00GHz
 256 MB memory
I re-installed Fedora Core 5 to check the box and it still installs fine.

Comment 8 Joachim Frieben 2006-11-02 09:27:02 UTC

I have the same on an "IBM ThinkPad T23". However, I am not affected by
this issue at install time but rather -afterwards- when I try to install
additional packages. This happens even in single user mode with a mere
"rpm -i ..." command, not necessarily immediately but after a number of
packages that cannot be predicted in advance. When this happens, the
system freezes completely during the transaction. It has to be powered
off by hand.
If the packages were additional ones, the last package may or may not be
corrupted. If packages were scheduled to be updated, both package
versions are then displayed when one queries for the name of the
package. After a certain number of incidents of this kind, one finally
reads:

  rpmdb: PANIC: fatal region error detected; run recovery
  error: db4 error(-30977) from dbenv->open: DB_RUNRECOVERY: Fatal error, run database recovery
  error: cannot open Packages index using db3 -  (-30977)
  error: cannot open Packages database in /var/lib/rpm

It is impossible to carry out any "rpm" action from now on. The system
thus needs a complete reinstall. This bug is really nasty. The issue
occurred with "FC6" and current "rawhide". After reverting the system
to "FC5" [plus updates or not] nothing similar ever happens.
The priority should be set to "high" or even "urgent" as this issue
makes "FC6/rawhide" essentially unusable.

Comment 9 Robert Scheck 2006-11-02 12:43:05 UTC

Joachim, how much RAM do you have?

Comment 10 Joachim Frieben 2006-11-02 13:33:58 UTC

(In reply to comment #9)
The amount of system memory is 512 MB. "SELinux" is enabled but set to
"permissive" mode.

Comment 11 Jeff Johnson 2006-11-02 16:16:58 UTC

Instead of reinstalling, you might try to fix the existing database
as suggested in the error message.

0) Save a copy of existing database.
    cd /var/lib
    tar czvf rpmdb.tar.gz ./rpm

1) Verify the contents of Packages.
    cd /var/lib/rpm
    /usr/lib/rpm/rpmdb_verify Packages

2) If there are problems, then dump and load to recover what can be salvaged.
    cd /var/lib/rpm
    mv Packages Packages-ORIG
    /usr/lib/rpm/rpmdb_dump Packages-ORIG | \
        /usr/lib/rpm/rpmdb_load Packages

3) Rebuild the indices.
    rpm --rebuilddb -vv
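
The steps above could be strung together roughly as follows. This is only a sketch: the function name recover_rpmdb and the variables RPMDB_DIR and RPM_TOOLS are hypothetical, and their defaults are simply the paths used in the steps above. It needs to run as root on a real system.

```shell
# recover_rpmdb: hypothetical wrapper around steps 0-3 above.
recover_rpmdb() {
    dbdir="${RPMDB_DIR:-/var/lib/rpm}"    # assumed default, as in the steps
    tools="${RPM_TOOLS:-/usr/lib/rpm}"    # where the rpmdb_* helpers live
    parent=$(dirname "$dbdir")

    # 0) Save a copy of the existing database before touching anything.
    tar czf "$parent/rpmdb.tar.gz" -C "$parent" "$(basename "$dbdir")" || return 1

    # 1) Verify the contents of Packages.
    if "$tools/rpmdb_verify" "$dbdir/Packages"; then
        echo "Packages verified OK"
    else
        # 2) Problems found: dump and load to recover what can be salvaged.
        mv "$dbdir/Packages" "$dbdir/Packages-ORIG" || return 1
        "$tools/rpmdb_dump" "$dbdir/Packages-ORIG" |
            "$tools/rpmdb_load" "$dbdir/Packages" || return 1
    fi

    # 3) Rebuild the indices.
    rpm --dbpath "$dbdir" --rebuilddb -vv
}
```

Calling recover_rpmdb with no variables set operates on /var/lib/rpm and leaves the backup in /var/lib/rpmdb.tar.gz.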

Comment 12 Joachim Frieben 2006-11-02 17:31:42 UTC

(In reply to comment #11)
As a matter of fact, I'm using current "rawhide" again, and 2 hours
ago, I had the same trouble again [rpmdb: PANIC: ...] when installing
today's and yesterday's updates. Interestingly, after rebooting the
system, "rpm" commands were available again. To be on the safe side,
I ran an "rpm --rebuilddb" [before reading your post], and step by
step, I installed all packages again [using the "--force" option to
remove remaining old packages]. I have experienced no problems
during this 200 MB install. There was an "rpm" update, but it remains
to be seen, if this has had some impact.

Comment 13 Jeff Johnson 2006-11-02 18:02:46 UTC

The rawhide rpm update fixes a segfault and is unrelated to your problem.

Be careful with --rebuilddb, make sure you do
    rm -f /var/lib/rpm/__db*
first, or a corrupted cache can do more damage than good.

Reinstalling all packages with --force is unnecessary imho. Will neither
hurt nor help. Try running
    rpm -Va
instead.
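
As a sketch of that sequence: the helper name clear_rpmdb_cache is made up here, and the default directory is the one named above.

```shell
# clear_rpmdb_cache: hypothetical helper that removes only the __db*
# cache files and leaves Packages and the indices alone.
clear_rpmdb_cache() {
    dbdir="${1:-/var/lib/rpm}"
    rm -f "$dbdir"/__db*
}

# Typical sequence, as root:
#   clear_rpmdb_cache && rpm --rebuilddb && rpm -Va
```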

What version of glibc and kernel are installed? My guess is that
there's something different with your machines, as a rpmdb PANIC
is a very loud signal, and is clearly not happening everywhere.

The underlying common mechanism is NPTL, used by rpmdb for locking, supplied by kernel/glibc.

Comment 14 Joachim Frieben 2006-11-02 21:44:18 UTC

(In reply to comment #13)
Hm, I actually did not remove the '__db*' files, but this doesn't seem
to have done any harm. I haven't had any problems since then.
Reinstalling packages with "--force" had been used to remove the
obsolete duplicates from the database ['rpm -e name-old.rpm' would
have probably done the job, too] after the system had frozen in
the midst of an "RPM" transaction, and to reinstall corrupted packages
as indicated by 'rpm -V new.rpm' [usually the last one].
My system is a current "rawhide" system: kernel-2.6.18-1.2798.fc6,
glibc-2.5.90-3. I recall: the issue also appeared immediately
after a fresh "FC6" install.

Comment 15 Robert Scheck 2006-11-02 22:00:19 UTC

I'm curious whether the problem is related to the RPM delivered with Fedora
Core 6, because I'm running Fedora Core 6 equivalent systems with RPM 4.4.7/
4.4.8-devel on different machines and I'm not able to reproduce the problem,
no matter what I try.

Is anybody here willing or interested to try a patched (and thus
Fedora compatible) RPM 4.4.7/4.4.8-devel on an installed "broken" Fedora Core 6
system? Feel free to contact me via e-mail...

Comment 16 Jack Spaar 2006-11-03 02:19:05 UTC

I'm seeing the rpm PANIC problem too on a laptop upgraded from FC5 to FC6.
No problems during the upgrade itself, only after when installing updates.

I did a --rebuilddb and it seemed OK for a while, then yum started to hang again
and rpm --rebuilddb PANICed.

A reboot seems to have cleared the problem, at least temporarily.

I had been monitoring the CPU temperature and it was well within safe range
before the PANIC.  Smartctl testing shows no errors with the drive.  Memtest86
is clean.

glibc-2.5-3
kernel-2.6.18-1.2786.fc6  (the i686 version, forcibly replacing i586)

256M RAM, 458744k swap, GBs free on /

Comment 17 Elaine Normandy 2006-11-03 14:07:45 UTC

I seem to be having this problem on my newly installed FC6 box also.  I had no
problems running rpm on my FC5 install. I have a generically built AMD box that
has been running some version of Linux (both FC and Ubuntu) without complaint
since we got it three years ago.  I can provide additional details if you tell
me the commands, since it has been a while since I had to check out hardware
details. 

I am willing to run a patched version of rpm if it would help solve the problem.
 I am getting awfully tired of running "rm -rf __*; rpm --rebuilddb".

Comment 18 Joachim Frieben 2006-11-03 16:20:18 UTC

I have been able to trigger system freezes also by unpacking a bzip2
compressed tar file of about 35 MB [in single user mode]. It took me
a couple of trials to extract the file completely. I am thus wondering
if the issue is not related to a generic file I/O issue. There have
been issues with the 2.6.18 kernel, e.g. bug 209005.
According to comment #13, it might be related to a thread issue with
the C library or the kernel.

Comment 19 Joachim Frieben 2006-11-05 00:10:20 UTC

Today, I have reinstalled my "rawhide" system from scratch. After booting
into single user mode, I tried to install "kernel-2.6.18-1.2835.fc6" from
"updates/testing/6". The system froze three times consecutively. Only
after *disabling* hard disk "DMA", the installation completed successfully.
This observation may point to the root of the issue.
After rebooting the system, "kernel-2.6.18-1.2835.fc6" turned out to be a
huge improvement. I have been able to install a huge number of additional
"RPM" packages without any problem.
I strongly recommend all affected users to retrieve and to install this
new "FC6" testing kernel "RPM"!
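
The comment does not say how DMA was disabled. On IDE disks of that period one common way is hdparm's -d flag, sketched below. The helper name set_dma, the HDPARM override variable, and the device name /dev/hda are all assumptions for illustration.

```shell
# set_dma: hypothetical wrapper around "hdparm -d0|-d1 DEVICE".
# Usage: set_dma on|off DEVICE   (run as root)
set_dma() {
    case "$1" in
        on)  flag=1 ;;
        off) flag=0 ;;
        *)   echo "usage: set_dma on|off DEVICE" >&2; return 2 ;;
    esac
    # HDPARM may point at another binary; it defaults to hdparm.
    "${HDPARM:-hdparm}" -d"$flag" "$2"
}
```

For example, set_dma off /dev/hda before the transaction and set_dma on /dev/hda afterwards.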

Comment 20 Joachim Frieben 2006-11-06 16:02:41 UTC

After 3 days of usage without any glitch, "kernel-2.6.18-1.2835.fc6"
can be considered to settle the issue for me completely.

Comment 21 Joachim Frieben 2006-11-13 20:53:29 UTC

Annoying: random freezes again with "kernel-2.6.18-1.2849.fc6".

Comment 22 Jeff Johnson 2006-11-13 21:07:04 UTC

Freezes? Panics? What was running?

rpm runs with signals masked, so the annoyance is usually from having
to get your terminal back.

Try ^Z suspend and then kill -9 whatever process.

Remember that you need to do "rm -f /var/lib/rpm/__db*" after *every*
exceptional event, like "kill -9", segfaults, reboots, ENOSPC, EIO, etc.
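
One way to make that cleanup harder to forget is a small wrapper, sketched here. The name run_then_clean is hypothetical, and it only catches the case where the command dies from a signal (kill -9, segfault); reboots, ENOSPC and EIO still need the manual rm.

```shell
# run_then_clean: hypothetical wrapper, not part of rpm.
# Usage: run_then_clean DBDIR command [args...]
run_then_clean() {
    dbdir="$1"; shift
    status=0
    "$@" || status=$?
    # A status above 128 means the command was terminated by a signal
    # (137 = SIGKILL, 139 = SIGSEGV), so the cache may be stale.
    if [ "$status" -gt 128 ]; then
        rm -f "$dbdir"/__db*
    fi
    return "$status"
}
```

For example: run_then_clean /var/lib/rpm rpm -Uvh some-package.rpm (the package name is a placeholder).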

Comment 23 Joachim Frieben 2006-11-14 11:35:55 UTC

The whole system freezes including "X". In text mode, no reaction
to keyboard input anymore.
The issue can be triggered by increasing disk I/O e.g. installing
packages, creating "tar" files, but this can of course happen at
any time given the regular disk I/O traffic by the OS.

Comment 24 Paul Nasrat 2006-11-14 11:43:28 UTC


*** This bug has been marked as a duplicate of 181363 ***

Comment 25 Jeff Johnson 2006-11-14 11:47:51 UTC

Tell the kernel debutantes, not me, please.
