This bug relates to:
Bugzilla Bug 73097: rpm-4.1 hangs, can't be killed: READ THIS FIRST
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=73097

Description of problem:
rpm --rebuilddb cannot rebuild the database. It does not respond to C-c. Removing the lock files in /var/lib/rpm does not help.

Version-Release number of selected component (if applicable):
rpm 4.2

How reproducible:
The situation cannot be reproduced on demand. It happened during a standard RH9 install followed by a call to up2date, which hung while updating 'ypserv'. After that no rpm calls worked any more; the database was corrupt. The up2date process had to be killed with -KILL, since it did not respond to C-c.

Steps to Reproduce:
1. Install RH9 with kickstart + HTTP install (desktop)
2. Arrange up2date to be ready to update packages
3. Start up2date -u
4. At the server side, generate a link error, so that up2date hangs

Actual results:
5. The rpm database is corrupt after that
6. rpm --rebuilddb does not restore the situation

Expected results:

Additional info:

$ rm -f /var/lib/rpm/__db*
$ rpm --rebuilddb

None of these restored the state. The rebuild hung and couldn't be stopped with C-c.

For the record, for users with similar problems: I tried to restore the database state with:

$ scp /var/lib/rpm/Packages debian.machine:/tmp/fixit/

I logged in to the Debian machine:

$ apt-get install db4.1-util db4.1-doc
$ cd /tmp/fixit
$ db4.1_dump -r -f dumped Packages
$ db4.1_load -f dumped Packages.ok

I transferred that "Packages.ok" to the Red Hat machine again, but a simple

$ rpm -qa

gave a huge list of errors, like:

error: rpmdbnextIterator: skipping h#xxxxxx blob size(8): BAD 8 + 16 * il(xxxxxx) + dl(xxxxxx)
memory alloc (xxxxx bytes) returned NULL.

According to the db4.1_load(1) manual page, if the database uses user-defined prefix or comparison functions, it is impossible to dump and restore the database with the standard tools. Is this what the RPM utilities are doing? Using custom settings, so that standard tools cannot repair the damage?
If so, please change to use a standard hash database.
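For readers retracing the dump-and-reload attempt above, here is a minimal sketch that wraps the same two commands in one function and cleans up the intermediate dump. The tool names db4.1_dump and db4.1_load are taken from the Debian commands in the report; the function name salvage_packages and the DB_DUMP/DB_LOAD variables are hypothetical. As the report shows, a successful reload does not guarantee that rpm will accept the result.

```shell
#!/bin/sh
# Tool names come from the report; override them if your distribution
# ships the Berkeley DB utilities under different names (assumption).
DB_DUMP=${DB_DUMP:-db4.1_dump}
DB_LOAD=${DB_LOAD:-db4.1_load}

# salvage_packages SRC DST  (hypothetical helper)
# Dump whatever is still readable from SRC and load it into a new file DST.
salvage_packages() {
    src=$1
    dst=$2
    dump=$(mktemp) || return 1
    # -r asks db_dump to salvage readable records from a damaged file
    if "$DB_DUMP" -r -f "$dump" "$src" && "$DB_LOAD" -f "$dump" "$dst"; then
        rm -f "$dump"
        return 0
    fi
    rm -f "$dump"
    return 1
}
```

Run it on a copy, never on the live /var/lib/rpm/Packages, for example: salvage_packages /tmp/fixit/Packages /tmp/fixit/Packages.ok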
No, there are no "custom" utilities, only an internal Berkeley db-4.1.25 compiled with --with-uniquename=_rpmdb to a) make an rpm build easier (i.e. no need to configure/build db4) and b) get unique symbols to avoid symbol collisions. In fact, this is what was recommended by Sleepycat.

All I can tell from the above is that you have some damage; I'd need the xxxxx values to guess what was damaged.

Meanwhile, a fix is probably possible if you give me a pointer (i.e. a URL; attachments won't work) to the earliest possible (i.e. least "fixed") version of /var/lib/rpm that you still have.
Further info:

As the problem persisted, I did the following with standard RH 9:

1. Deleted /var/lib/rpm/Packages
2. rpm --initdb
..

Then I used an awk script to extract from the /root/install.log file the list of installed packages (that were supposed to be in the RPM database) and made a shell for-loop to manually install them all again into the newly created RPM DB.

cd /tmp/redhat/all-rpms-here/   (Downloaded from a mirror site)

for package in ....script to feed names ...
do
    rpm --nodeps --force -Uvh $package
done

It took a night to install all RPMs into the "database" again, although the actual packages were already on my machine. The information was just lost.

However, even this method did not restore the database. rpm -qa locked up just like before. After that I reinstalled the whole of RH9.

I can't provide much further detail on this error situation any more, so you can close this bug report after this message. However, I do have the original situation left on my Debian disk:

1. The initially corrupted database (Packages)
2. The one that resulted from the db_dump + db_load attempt to fix it, which gave the errors I mentioned (Packages.ok)

You can download these from below. Hopefully you get something out of those to prevent similar lock-ups and corruption in the future.

http://tierra.dyndns.org:81/rh-data

This link will cease to exist some time after I have posted it.
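The reinstall loop described in the comment above elides the script that feeds the package names. A minimal sketch of one way to do it follows. It assumes that install.log lines have the form "Installing name-version-release.arch." and that the mirror directory holds files named name-version-release.arch.rpm; both are assumptions, and the function names are hypothetical. The rpm flags are the ones used in the report.

```shell
#!/bin/sh
# list_installed_rpms LOG  (hypothetical helper)
# Print one .rpm file name per "Installing <package>." line in LOG.
# The log line format is an assumption; check your own install.log first.
list_installed_rpms() {
    awk '$1 == "Installing" && NF == 2 {
        sub(/\.$/, "", $2)      # drop the trailing period
        print $2 ".rpm"
    }' "$1"
}

# reinstall_from_log LOG RPMDIR  (hypothetical helper)
# Feed every package named in LOG back to rpm from the files in RPMDIR.
reinstall_from_log() {
    log=$1
    rpmdir=$2
    list_installed_rpms "$log" | while read -r pkg; do
        if [ -f "$rpmdir/$pkg" ]; then
            # Same flags as in the report; stdin is detached so rpm
            # cannot consume the package list being read by the loop.
            rpm --nodeps --force -Uvh "$rpmdir/$pkg" < /dev/null
        else
            echo "missing: $pkg" >&2
        fi
    done
}
```

Example call: reinstall_from_log /root/install.log /tmp/redhat/all-rpms-here. Since the files were already on disk, rpm's --justdb option, which updates only the database and not the filesystem, may be worth adding, though it was not tried in this report.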
This problem appears resolved. There's little that can be identified from looking at the database post mortem. The messages indicate lots of headers failing simple sanity checks. How that information got in the database cannot be determined.