Bug 115432

Summary: rpmdb corruption on RHEL3
Product: Red Hat Enterprise Linux 3
Reporter: Dag Wieers <dag>
Component: rpm
Assignee: Jeff Johnson <jbj>
Status: CLOSED WONTFIX
QA Contact: Mike McLean <mikem>
Severity: high
Priority: medium
Version: 3.0
CC: msw
Hardware: x86_64   
OS: Linux   
Last Closed: 2004-10-17 01:04:22 UTC

Description Dag Wieers 2004-02-12 14:14:24 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 Galeon/1.3.12

Description of problem:
Yet another rpmdb corruption problem with RHEL3. It started when
installing tomcat from the Application Server Beta channel.

    [root@dev-srv7 RPMS.rhapp]# rpm -ihvU tomcat-4.1.27-8.x86_64.rpm
    warning: tomcat-4.1.27-8.x86_64.rpm: V3 DSA signature: NOKEY, key ID 897da07a
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: Failed dependencies:
        commons-beanutils >= 1.6.1-10 is needed by tomcat-4.1.27-8
        commons-collections >= 2.1-9 is needed by tomcat-4.1.27-8
        commons-digester >= 1.4.1-10 is needed by tomcat-4.1.27-8
        commons-fileupload >= 1.0-5 is needed by tomcat-4.1.27-8
        commons-modeler >= 1.0-5 is needed by tomcat-4.1.27-8
        lib-javax-servlet-4.1.7.so()(64bit) is needed by tomcat-4.1.27-8
        lib-org-apache-catalina-bootstrap-4.1.27.so()(64bit) is needed by tomcat-4.1.27-8
        lib-org-apache-jasper-4.1.27.so()(64bit) is needed by tomcat-4.1.27-8
        lib-org-apache-naming-bootstrap-4.1.27.so()(64bit) is needed by tomcat-4.1.27-8
        servletapi >= 4.1.7-10 is needed by tomcat-4.1.27-8
        tomcat-libs = 4.1.27-8 is needed by tomcat-4.1.27-8
    Suggested resolutions:  
        commons-beanutils-1.6.1-10.x86_64.rpm
        commons-collections-2.1-9.x86_64.rpm
        commons-digester-1.4.1-10.x86_64.rpm
        commons-fileupload-1.0-5.x86_64.rpm
        commons-modeler-1.0-5.x86_64.rpm
        servletapi-4.1.7-10.x86_64.rpm
        tomcat-libs-4.1.27-4.x86_64.rpm

When all dependencies were resolved:

    [root@dev-srv7 RPMS.rhapp]# rpm -ihvU tomcat-4.1.27-8.x86_64.rpm \
        tomcat-libs-4.1.27-8.x86_64.rpm commons-fileupload-1.0-5.x86_64.rpm \
        servletapi-4.1.7-10.x86_64.rpm commons-beanutils-1.6.1-10.x86_64.rpm \
        commons-collections-2.1-9.x86_64.rpm \
        commons-digester-1.4.1-10.x86_64.rpm commons-modeler-1.0-5.x86_64.rpm
    warning: tomcat-4.1.27-8.x86_64.rpm: V3 DSA signature: NOKEY, key ID 897da07a
    warning: commons-fileupload-1.0-5.x86_64.rpm: V3 DSA signature: NOKEY, key ID db42a60e
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    Preparing...                error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    ########################################### [100%]
       1:commons-collections    ########################################### [ 13%]
       2:commons-beanutils      ########################################### [ 25%]
       3:commons-digester       ########################################### [ 38%]
       4:servletapi             ########################################### [ 50%]
       5:commons-fileupload     ########################################### [ 63%]
       6:commons-modeler        ########################################### [ 75%]
       7:tomcat-libs            ########################################### [ 88%]
       8:tomcat                 ########################################### [100%]

Then I rebuilt the rpmdb:

    [root@dev-srv7 RPMS.rhapp]# rpm --rebuilddb
    error: rpmdbNextIterator: skipping h#    1076 blob size(18884): BAD, 8 + 16 * il(7499631) + dl(1946186351)
    error: db4 error(-30989) from dbcursor->c_get: DB_PAGE_NOTFOUND: Requested page not found
    [root@dev-srv7 RPMS.rhapp]# rpm --rebuilddb
    [root@dev-srv7 RPMS.rhapp]# rpm -qa | wc -l
        132

The database seems to be nuked: a lot of base packages are missing, and
I don't see an easy way of recovering from this.

For a product this expensive, this long-standing rpmdb corruption
problem should have been fixed by now.


Version-Release number of selected component (if applicable):


How reproducible:
Didn't try


Additional info:

Comment 1 Jeff Johnson 2004-02-12 20:17:11 UTC
Heh, this problem will be fixed if I ever get a reproducer.

What version of rpm?

What are you using to install, yum?

Are you installing in a chroot?

I claim that this is not corruption per se, but rather a
cache coherency problem.

Translation:
If you do "rm -f /var/lib/rpm/__db*", then all the installed
packages are there. No --rebuilddb, nothing else.

If you can, compare package counts before and after using
"rpm -qa | wc -l", doing "rm -f /var/lib/rpm/__db*" as needed.
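
The before-and-after check described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names (count_pkgs, clear_db_regions, compare_counts) and the RPM_DBPATH variable are hypothetical, the database is assumed to live in /var/lib/rpm, and no other rpm process should be running while the __db* files are removed.

```shell
#!/bin/sh
# Sketch only: compare package counts before and after removing the
# __db* files, following the suggestion in this comment.
# Assumption: the rpm database lives in /var/lib/rpm unless RPM_DBPATH is set.
RPM_DBPATH="${RPM_DBPATH:-/var/lib/rpm}"

# Count installed packages as rpm currently reports them.
count_pkgs() {
    rpm --dbpath "$RPM_DBPATH" -qa | wc -l | tr -d ' '
}

# Remove only the __db* files in directory $1; Packages and the
# index files are left untouched.
clear_db_regions() {
    rm -f "$1"/__db*
}

# Hypothetical driver: print both counts so they can be compared.
compare_counts() {
    before=$(count_pkgs)
    clear_db_regions "$RPM_DBPATH"
    after=$(count_pkgs)
    echo "before=$before after=$after"
}
```

Sourcing the file and running compare_counts as root prints the two counts side by side. A higher second count would point toward a stale cache rather than lost package headers; equal counts, as reported later in this bug, would not.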


Comment 2 Dag Wieers 2004-02-12 23:06:50 UTC
Sorry for not being very clear about it. It was a clean RHEL3 U1
installation, no updates except those from U1 were installed. The
installation was not done in a chroot, it was on an IBM e325.

No other tools except rpm were used. No apt, no yum, no up2date. It
was an almost complete install without any extras except the
Application Server beta 2.95 RPMs.

I tried your suggested solution:

    [root@dev-srv7 root]# rpm -qa | wc -l
        132
    [root@dev-srv7 root]# rm -f /var/lib/rpm/__db*
    [root@dev-srv7 root]# rpm -qa | wc -l
        132

If my rpm --rebuilddb made a non-problematic situation worse, that's
worth another bug report ;-) But I think my rpmdb was already
corrupted beyond repair.

This may be AMD64 related although some of my personal rpmdb
corruption encounters have been unrelated to arch or chroot. (RH80,
RH9 and RHFC1 on x86)

Comment 3 Jeff Johnson 2004-02-12 23:18:30 UTC
This isn't corruption, this is cache coherency; don't confuse it with
other problems.

Yeah, there's a bug there, look for DB_PAGE_NOTFOUND in rpm bugs.

The bug appears to be tracking with non-ix86. I have a credible
claim that yum with --installroot reproduces it rapidly on ppc64.

I have other reliable (but anecdotal) claims on x86_64 as well.

If you have the time, could you s/--enable-posixmutexes// in
the RHEL spec file, rebuild, and try to reproduce?

That removes NPTL; locking and cache coherency are intimately related.
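
The substitution requested above might be applied along these lines. This is a hedged sketch, not a tested rebuild recipe: the function name strip_posixmutexes and the file names in the usage note are hypothetical, and it assumes the flag appears literally in the spec file.

```shell
#!/bin/sh
# Sketch only: write a copy of an rpm spec file with the
# --enable-posixmutexes flag removed.
# Assumption: the flag appears literally somewhere in the spec file.

# $1 = original spec file, $2 = modified copy to write.
strip_posixmutexes() {
    sed -e 's/--enable-posixmutexes//g' "$1" > "$2"
}
```

For example, "strip_posixmutexes rpm.spec rpm-nomutex.spec" followed by "rpmbuild -ba rpm-nomutex.spec" would produce packages built without the flag, assuming the rpm source package is installed and the spec file is in the current directory. Writing to a copy keeps the original spec file available for comparison.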

Comment 4 Dag Wieers 2004-10-17 01:04:22 UTC
Closing it as I have not performed other tests on the same machine.
It's a mystery to me.