Bug 62897

Summary: RPM database exhibits seemingly random corruption
Product: [Retired] Red Hat Linux Reporter: Ed Voncken <redhat-bugzilla>
Component: rpm Assignee: Jeff Johnson <jbj>
Status: CLOSED WORKSFORME QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: 7.2   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2002-04-07 18:46:08 UTC Type: ---

Description Ed Voncken 2002-04-07 12:21:34 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:0.9.9) Gecko/20020311

Description of problem:
Since upgrading to Red Hat Linux 7.2/i386, I have encountered several cases of
the RPM database becoming corrupted.

In most cases, running 'rpm -qa' would start showing the list of installed
packages but it would freeze sometime during the listing.

'rpm --rebuilddb' apparently fixes the database symptoms but the cause of the
problems is unknown.


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
Most recent occurrence:

[root@r2d2 i386]# rpm -Fvh *.rpm
Preparing...                #######################             ( 54%)

(freeze at this point, reproducible at 54%; interrupted using Ctrl-C after more
than one minute of waiting)

(maybe the file list was too long; the directory contains ALL updates for RH7.2;
so I trimmed down the list)

[root@r2d2 i386]# rpm -Fvh [a-z]*.rpm
Preparing...                #######################             ( 54%)

(freeze at this point, reproducible at 54%; interrupted using Ctrl-C after more
than one minute of waiting)

(apparently the length of the file list is not directly linked to the problem)

[root@r2d2 i386]# rpm -Fvh [a-j]*.rpm
error: failed dependencies:
        librpm-4.0.4.so   is needed by gnorpm-0.96-12.7x
        librpmdb-4.0.4.so   is needed by gnorpm-0.96-12.7x
        librpmio-4.0.4.so   is needed by gnorpm-0.96-12.7x
error: db3 error(-30998) from db->close: DB_INCOMPLETE: Cache flush was unable
to complete
rpmdb: Non-empty page 941 in unused hash bucket 771
rpmdb: Page 1280 encountered a second time on free list
error: db3 error(-30985) from db->verify: DB_VERIFY_BAD: Database verification
failed


Actual Results:  RPM database corrupted.

Expected Results:  RPM database NOT corrupted ;)

Additional info:

[root@r2d2 i386]# grep rpm /root/logs/rpm-qa
gnorpm-0.96-11
rpm-4.0.3-1.03
rpm-build-4.0.3-1.03
rpmdb-redhat-7.2-0.20010924
rpmfind-1.7-2
rpmlint-0.32-4
rpm-perl-4.0.3-1.03
rpm-python-4.0.3-1.03

Comment 1 Jeff Johnson 2002-04-07 16:09:47 UTC
Try
	rm /var/lib/rpm/__db*
Alternatively, do rpm --rebuilddb.

Comment 2 Ed Voncken 2002-04-07 18:46:03 UTC
Hi,

As I already mentioned in my report, the workaround 'rpm --rebuilddb' does
indeed (temporarily) fix the corruption.

However, a solution for the /cause/ of the corruption would be more interesting
IMHO ;)

strace output with "rpm -Fvh *.rpm" hanging at 54%:

select(0, NULL, NULL, NULL, {0, 760000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>

Greetings,
  Ed.
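
The freeze shown in the strace output above (repeated one-second select()
timeouts, which look like a process sleeping while it waits on a lock) can be
detected without watching the terminal. What follows is only a minimal sketch,
assuming a current system where timeout(1) from GNU coreutils is available.
The function name run_with_hang_check and the 60-second limit in the usage
line are hypothetical choices for illustration, not part of rpm.

```shell
# Run a command under a time limit and report whether it hung.
# Relies on timeout(1) from GNU coreutils, which exits with status 124
# when the limit is reached and the command had to be killed.
run_with_hang_check() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        # Report on stderr so the command's own stdout stays clean.
        echo "hung: '$*' did not finish within limit $limit" >&2
    fi
    return "$status"
}
```

A possible use would be "run_with_hang_check 60 rpm -qa > /dev/null". An exit
status of 124 then means the query hung, which matches the symptom described
in this report. Note that a command killed this way is in the same state as
one interrupted with Ctrl-C.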

Comment 3 Jeff Johnson 2002-04-07 18:56:40 UTC
You need to do
	rm -f /var/lib/rpm/__db*
if you ^C. Yes, this is done automatically
on the next rpm execution, but this
does not solve the problem that root creates the files
and non-root cannot remove them. Yes, this can/will
be handled by a setgid helper program or by
a different locking scheme in Berkeley DB. However,
this ain't gonna happen soon.
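
The cleanup step described in this comment can be wrapped in a small helper.
This is a sketch under stated assumptions, not an official rpm tool: the
function name clear_stale_rpm_locks is hypothetical, and the directory
argument exists only so the helper can be tried on a scratch directory first.
It should presumably be run as root and only when no other rpm process is
active.

```shell
# Remove leftover __db* files from an rpm database directory, as suggested
# above after an interrupted (^C) rpm run.
# The default path is the one named in this bug report.
clear_stale_rpm_locks() {
    dbdir="${1:-/var/lib/rpm}"
    # Refuse to continue if the directory does not exist.
    if [ ! -d "$dbdir" ]; then
        echo "no such directory: $dbdir" >&2
        return 1
    fi
    # -f keeps rm quiet when no __db* files are present.
    rm -f "$dbdir"/__db*
}
```

After "clear_stale_rpm_locks" has run, "rpm --rebuilddb" (the alternative
given in Comment 1) can still be used if queries continue to fail.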