Bug 53199

Summary: db gets corrupted (DB_VERIFY_BAD) after rpm -U
Product: [Retired] Red Hat Raw Hide
Reporter: Sami Farin <safari+rhbug>
Component: rpm
Assignee: Jeff Johnson <jbj>
Status: CLOSED WORKSFORME
Severity: high
Priority: medium    
Version: 1.0   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Last Closed: 2001-09-30 19:35:43 UTC

Description Sami Farin 2001-09-05 01:30:45 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010803

Description of problem:
# rpm -Uvh indexhtml-7.1-2.noarch.rpm man-pages-1.39-1.noarch.rpm
   1:indexhtml              ########################################### [ 50%]
   2:man-pages              ########################################### [100%]
error: db3 error(-30998) from db->close: DB_INCOMPLETE: Cache flush was
unable to complete
rpmdb: Unreferenced page 76
rpmdb: Unreferenced page 457
error: db3 error(-30985) from db->verify: DB_VERIFY_BAD: Database
verification failed


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Run "rpm -U package".
2. Go to step 1.

Actual Results: the package was installed (and rpm -V reports it as OK), but rpm
displayed an error about the database being bad.

Expected Results: no errors, I guess.

Additional info:

gcc-2.2.4 (gnu.org version, no rawhide)
gcc-3.0.1
binutils-2.11.2
redhat-5.0 (+many updated packages =) )
rpm-4.0.3-0.96 (self-compiled) (also rpm-4.0.3-0.88 showed similar database
corruption symptoms)

# rpm -qa|wc -l
    726

Comment 1 Jeff Johnson 2001-09-05 14:01:08 UTC
Have you done an "rpm --rebuilddb" to fix the problems?

Comment 2 Sami Farin 2001-09-06 20:30:46 UTC
Sorry, I forgot to tell you:
yes, I have already done "rpm --rebuilddb", and after that I can do "rpm -U"
without getting errors.

Comment 4 Jeff Johnson 2001-09-06 20:38:21 UTC
OK, worksforme.

Comment 5 Sami Farin 2001-09-19 01:08:22 UTC
No matter what I do, the rpm database eventually blows up, at least once a week.
Now "rpm -qvl sendmail" dumped core.
Unfortunately the gdb backtrace was not useful.

"rpm --rebuilddb" stops rpm from hanging or crashing on queries, but I'd
rather see a bugfix.

Anyway, here are some error messages which might be of some help.

# rpm -e icmpinfo
warning:    erase unlink of /etc/rc.d/rc3.d/S58icmpinfo.init failed: No such
file or directory
warning:    erase unlink of /etc/rc.d/init.d/icmpinfo.init failed: No such file
or directory
error: db3 error(-30998) from db->close: DB_INCOMPLETE: Cache flush was unable
to complete
rpmdb: Overflow page 1119 of invalid type
error: db3 error(-30985) from db->verify: DB_VERIFY_BAD: Database verification
failed

...

And "rpm -qa" hangs partway through its output:

# strace -vfttT rpm -qa

17:27:43.996887 pread(3,
"\0\0\0\0\1\0\0\0j\4\0\0\0\0\0\0\0\0\0\0\1\0\306\17\0\7"..., 4096, 4628480) =
4096 <0.000059>
17:27:43.997362 write(1, "gcc-c++-2.96-54\n", 16gcc-c++-2.96-54
) = 16 <0.000019>
17:27:43.997559 pread(3,
"\0\0\0\0\1\0\0\0n\4\0\0\0\0\0\0o\4\0\0\1\0\346\17\0\7\0"..., 4096, 4644864) =
4096 <0.000054>
17:27:43.997887 select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout) <0.005990>
17:27:44.004055 select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout) <0.008839>
17:27:44.013088 select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout) <0.009947>
17:27:44.023232 select(0, NULL, NULL, NULL, {0, 8000}) = 0 (Timeout) <0.009516>
17:27:44.032948 select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout) <0.019798>
17:27:44.052947 select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout) <0.039774>
17:27:44.092904 select(0, NULL, NULL, NULL, {0, 64000}) = 0 (Timeout) <0.070282>
17:27:44.163363 select(0, NULL, NULL, NULL, {0, 128000}) = 0 (Timeout) <0.129344>
17:27:44.292880 select(0, NULL, NULL, NULL, {0, 256000}) = 0 (Timeout) <0.260040>
17:27:44.553095 select(0, NULL, NULL, NULL, {0, 512000}) = 0 (Timeout) <0.519553>
17:27:45.072825 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999719>
17:27:46.072722 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999720>
17:27:47.072623 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999724>
17:27:48.072523 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999724>
17:27:49.072425 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999976>
17:27:50.072579 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999468>
17:27:51.072227 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999771>
17:27:52.072173 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999724>
17:27:53.072074 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999671>
17:27:54.071921 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999724>
17:27:55.071823 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) <0.999812>

...
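
The select() timeouts in the trace above double from 1 ms up to 512 ms and then stay at 1 s, which looks like a retry loop with capped exponential backoff, presumably waiting on a lock that is never released. A minimal shell sketch that reproduces only that timeout sequence follows. The function name backoff_sequence is hypothetical, and the start value and cap are read off the trace, not taken from the rpm or Berkeley DB sources.

```shell
#!/bin/sh
# Print the first N sleep intervals (in microseconds) of a capped
# exponential backoff: start at 1000 us, double each time, cap at 1 s.
# Illustration only; this is not code from rpm or Berkeley DB.
backoff_sequence() {
    steps=${1:-14}
    usec=1000
    i=0
    while [ "$i" -lt "$steps" ]; do
        echo "$usec"
        usec=$((usec * 2))
        # 512000 * 2 would exceed one second, so clamp to 1000000
        if [ "$usec" -gt 1000000 ]; then
            usec=1000000
        fi
        i=$((i + 1))
    done
}
```

Calling backoff_sequence 12 prints the ten doubling steps followed by two 1-second entries, matching the shape of the select() calls in the trace.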

Oh no... now, two days after the previous error, the db has totally crashed:

# rpm -qa
rpmdb: region error detected; run recovery.
error: db3 error(-30987) from dbenv->open: DB_RUNRECOVERY: Fatal error, run
database recovery
error: cannot open Packages index using db3 -  (-30987)



Sure, I can access the db again after "rpm --rebuilddb", but this is no fun anymore.
Any ideas what could be causing my problems?

Comment 6 Sami Farin 2001-09-25 16:24:24 UTC
now with rpm-4.0.3-1.04, after rpm --rebuilddb

# rpm -e libsndfile
error: db3 error(-30998) from db->close: DB_INCOMPLETE: Cache flush was unable
to complete
rpmdb: Overflow page 2503 of invalid type
error: db3 error(-30985) from db->verify: DB_VERIFY_BAD: Database verification
failed
And rpm -qa hangs the same way as in the previous comment.
I installed rpm-4.0.3-1.04 12 hours ago.

Comment 7 Jeff Johnson 2001-09-25 19:04:57 UTC
The first thing to do is (with the db3-utils package installed):
    cd /var/lib/rpm
    db_verify Packages

Almost certainly the next step is to do
    cd /var/lib/rpm
    mv Packages Packages-ORIG
    db_dump Packages-ORIG | db_load Packages
    rpm --rebuilddb
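
The steps above can be wrapped in a small function so that the original Packages file is never overwritten by accident. This is only a sketch: the name recover_packages is hypothetical, it assumes db_verify, db_dump and db_load (from db3-utils) match the Berkeley DB version rpm was built against, and it assumes no other rpm process is using the database while it runs.

```shell
#!/bin/sh
# Sketch of the verify / dump / load / rebuild sequence from this comment.
# recover_packages is a hypothetical helper, not part of rpm or db3-utils.
recover_packages() {
    dbdir=${1:-/var/lib/rpm}

    if [ ! -f "$dbdir/Packages" ]; then
        echo "no Packages file in $dbdir" >&2
        return 1
    fi
    # Keep an earlier backup intact rather than clobbering it.
    if [ -e "$dbdir/Packages-ORIG" ]; then
        echo "$dbdir/Packages-ORIG already exists, not overwriting" >&2
        return 1
    fi
    for tool in db_verify db_dump db_load rpm; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 2
        fi
    done

    (
        cd "$dbdir" || exit 1
        # A verify failure is reported but does not stop the recovery.
        db_verify Packages || echo "db_verify reported problems" >&2
        mv Packages Packages-ORIG || exit 1
        # Only db_load's exit status is checked here, not db_dump's.
        db_dump Packages-ORIG | db_load Packages || exit 1
        rpm --dbpath "$PWD" --rebuilddb
    )
}
```

Run it as root with no argument to operate on /var/lib/rpm, or pass a copy of the database directory to try it out first.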

Comment 8 Jeff Johnson 2001-09-25 19:08:15 UTC
Also, to get rid of the select() loop, do the following:
    rm -f /var/lib/rpm/__db*
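
A slightly more cautious variant of the same cleanup, as a sketch: it removes only regular files whose names start with __db in the given directory and reports how many it removed. The function name clear_db_regions is hypothetical, and the sketch assumes that no rpm process currently has the database open.

```shell
#!/bin/sh
# Remove the __db* files from an rpm database directory and print the
# number of files removed. Packages and the index files are not touched.
# clear_db_regions is a hypothetical helper name.
clear_db_regions() {
    dbdir=${1:-/var/lib/rpm}
    removed=0
    for f in "$dbdir"/__db*; do
        # When the glob matches nothing it stays literal; skip that case.
        [ -f "$f" ] || continue
        rm -f -- "$f" && removed=$((removed + 1))
    done
    echo "$removed"
}
```

For example, clear_db_regions /var/lib/rpm prints 0 when there was nothing to remove.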



Comment 9 Sami Farin 2001-09-29 22:56:09 UTC
db_verify exited with return code 0.

Hmm... recreating Packages kept the rpm database working for another four days, but now
rpm -qa hangs and I also get
error: db3 error(-30998) from db->close: DB_INCOMPLETE: Cache flush was unable
to complete
Does rpm --rebuilddb recreate the database the same way as the command "db_dump
xxx-orig | db_load xxx"?



Comment 10 Jeff Johnson 2001-09-30 13:56:27 UTC
Mostly, yes, using --rebuilddb is the same as db_dump | db_load
but there are important differences as well. db_dump converts
all database structures to hexadecimal output understood by
db_load, while --rebuilddb works at a higher level, retrieving
headers and regenerating indices.

Comment 11 Jeff Johnson 2001-09-30 15:54:00 UTC
Hmmm, if this problem is recurrent and is not fixed
by a one-time db_dump | db_load followed by an rpm --rebuilddb, then
I suggest that you start looking for reasons outside
of rpm for why the problem recurs.

How exactly are you installing packages?

Are all your tools linked against the 4.0.3 library?
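
One way to answer the linkage question is to run ldd on each tool and look at which rpm and Berkeley DB shared libraries it resolves. The sketch below assumes those libraries have names containing librpm or libdb. The function name check_linkage is hypothetical, and statically linked tools will simply show no match.

```shell
#!/bin/sh
# For each named tool, print the rpm / Berkeley DB shared libraries that
# ldd reports for it. The librpm|libdb pattern is an assumption about how
# the libraries are named on the system being checked.
check_linkage() {
    for bin in "$@"; do
        path=$(command -v "$bin") || { echo "$bin: not found"; continue; }
        echo "== $path"
        ldd "$path" 2>/dev/null | grep -E 'librpm|libdb' \
            || echo "   (no librpm/libdb dependency reported)"
    done
}
```

For example, check_linkage rpm lists the libraries the rpm binary itself is linked against; any other tools that read the database can be added as further arguments.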

Comment 12 Sami Farin 2001-09-30 19:35:39 UTC
The only program that modifies the /var/lib/rpm/* files is rpm-4.0.3-1.04.
I install packages like this: rpm -Uvh *.rpm




Comment 13 Jeff Johnson 2001-10-04 15:49:36 UTC
I don't have any more suggestions, so I'm gonna close this bug.
If the problem is recurrent, reopen this bug, and give me some information
regarding what else is happening so that I can try to reproduce
the problem.