Description of problem:
I ran a 'yum info' command (see more details below) under high disk load and decided to kill it rather than wait for the results. It didn't die quickly after a plain kill, so I used kill -9, and yum left the rpm database in a bad state.

Version-Release number of selected component (if applicable):
yum-3.2.25-1.fc12.noarch, rpm-4.7.2-1.fc12.i686

How reproducible:
Unknown; I am not willing to try that again on my working system.

Steps to Reproduce:
1. Run a background process with high disk load (in my case it was k3b starting to burn a DVD).
2. Here is my console session log; use it as a guide to reproduce the problem.

[root@abbot ~]# yum info -C zeroinstall-injector zerofree pangzero
Loaded plugins: fastestmirror, presto, refresh-packagekit
^C^C^Z
[1]+ Stopped yum info -C zeroinstall-injector zerofree pangzero
[root@abbot ~]# kill %1
[1]+ Stopped yum info -C zeroinstall-injector zerofree pangzero
[root@abbot ~]#
[root@abbot ~]#
[root@abbot ~]# fg
yum info -C zeroinstall-injector zerofree pangzero
^Z
[1]+ Stopped yum info -C zeroinstall-injector zerofree pangzero
[root@abbot ~]# kill %1
[1]+ Stopped yum info -C zeroinstall-injector zerofree pangzero
[root@abbot ~]#
[root@abbot ~]#
[root@abbot ~]# kill -9 %1
[root@abbot ~]# kill -9 %1
-bash: kill: (16712) - No such process
[1]+ Killed yum info -C zeroinstall-injector zerofree pangzero
[root@abbot ~]# yum info -C zeroinstall-injector zerofree pangzero
rpmdb: Thread/process 16712/3079165632 failed: Thread died in Berkeley DB library
error: db4 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db3 - (-30974)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:
Error: rpmdb open failed
[root@abbot ~]# yum info -C zeroinstall-injector zerofree pangzero
rpmdb: Thread/process 16712/3079165632 failed: Thread died in Berkeley DB library
error: db4 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db3 - (-30974)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:
Error: rpmdb open failed

Actual results:
Corrupt rpmdb

Expected results:
Working rpmdb

Additional info:
Why on earth does yum go to the rpmdb if I simply run a 'yum info' command? Shouldn't it just use its own caches in this case?

I was able to fix the problem by running these commands:
cd /var/lib/rpm
db_recover
This isn't rpmdb corruption; it's just BDB saying that the previous access died in an uncontrolled manner while inside BDB code, which is a condition that isn't automatically cleared (whereas dying in application code while holding a read-only lock is automatically handled these days). Rpm is blocking the signals for a reason here: safe access, even read-only, in a concurrent setup such as the rpmdb requires locking. Bad things happening when you kill -9 a process while it's blocking signals to protect critical sections is not a bug.
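As a rough illustration of the signal blocking described in this comment, here is a minimal Python sketch. It is not rpm's actual C code, and the name run_critical_section is made up for the example. It holds SIGINT and SIGTERM back while a critical section runs and lets them through afterwards. SIGKILL cannot be blocked this way, which is why kill -9 can still end a process in the middle of such a section.

```python
import os
import signal

def run_critical_section(work):
    """Run work() with SIGINT and SIGTERM held back until it finishes.

    Hypothetical helper for illustration; rpm does the equivalent in C.
    """
    blocked = {signal.SIGINT, signal.SIGTERM}
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, blocked)
    try:
        return work()
    finally:
        # Restoring the old mask lets any held-back signal be delivered.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

delivered = []
signal.signal(signal.SIGTERM, lambda signum, frame: delivered.append(signum))

def work():
    # A plain "kill" sent while the mask is in place is only queued.
    os.kill(os.getpid(), signal.SIGTERM)
    return signal.SIGTERM in signal.sigpending(), list(delivered)

pending_inside, delivered_inside = run_critical_section(work)
```

Inside the section the SIGTERM is pending but not yet handled; the handler runs only once the mask is restored.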
Well, from the user's point of view it is a bug. You run a program, you kill it, and the next time you try to install anything you need to bring out a geeky console and enter some cryptic magic as root. If it is that safe to fix, either yum or rpm should autofix this.
(In reply to comment #1)
> Rpm is blocking the signals for a reason here: safe access - even read-only -
> in a concurrent setup such as the rpmdb requires locking. Bad things happening
> when you kill -9 a process while it's blocking signals to protect critical
> sections is not a bug.

That's why POSIX provides fcntl locks that are cleaned up automatically when the process is killed. So, use them.
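The cleanup behaviour of fcntl locks mentioned in this comment can be checked with a small Python sketch. The helper names try_exclusive_lock and demo and the temporary file are assumptions made for the example, and none of this is rpm or yum code. A child process takes an exclusive lock and then hangs; the parent confirms the lock is held, sends SIGKILL, and then acquires the lock itself.

```python
import fcntl
import os
import signal
import tempfile

def try_exclusive_lock(fd):
    """Return True if a non-blocking exclusive fcntl lock could be taken."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False

def demo():
    fd, path = tempfile.mkstemp()
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: take the lock, report readiness, then hang like a stuck process.
        try:
            os.close(ready_r)
            child_fd = os.open(path, os.O_RDWR)
            fcntl.lockf(child_fd, fcntl.LOCK_EX)
            os.write(ready_w, b"x")
            while True:
                signal.pause()
        finally:
            os._exit(1)
    os.close(ready_w)
    os.read(ready_r, 1)                      # wait until the child holds the lock
    locked_while_alive = not try_exclusive_lock(fd)
    os.kill(pid, signal.SIGKILL)             # the equivalent of kill -9
    os.waitpid(pid, 0)
    acquired_after_kill = try_exclusive_lock(fd)
    os.close(ready_r)
    os.close(fd)
    os.unlink(path)
    return locked_while_alive, acquired_after_kill

if __name__ == "__main__":
    print(demo())
```

This only shows that the kernel drops the lock when the holder dies; it says nothing about whether the data the lock protected was left in a consistent state.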
Re comment #3: Locks are used to serialize changes. The corollary to choosing locks that magically evaporate on abnormal (like kill -9) termination is that whatever inconsistencies/serialization MUST be dealt with. The existence (or "stale lock" cleanup on process termination) is hardly the issue. "Use fcntl locks!" is hardly a panacea; in fact it's a useless piece of FUD.
(In reply to comment #4)
> Locks are used to serialize changes. The corollary to choosing
> locks that magically evaporate on abnormal (like kill -9) termination
> is that whatever inconsistencies/serialization MUST be dealt with.

My proposal is that readers should take an fcntl read lock without modifying the rpmdb and writers should take an fcntl write lock in addition to modifying the rpmdb as they currently do. fcntl will enforce serialization, and the modification made by writers will ensure that the rpmdb is flagged as inconsistent after a writer terminates abnormally. The rpmdb is not inconsistent after a reader terminates abnormally.

> The existence (or "stale lock" cleanup on process termination) is hardly
> the issue.

It is in this bug! Of course, maintaining proper serialization and recovery is the most important concern, and my proposal does that.

> "Use fcntl locks!" is hardly a panacea; in fact it's a useless piece of FUD.

It's (the new element of) the solution for this bug.
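To make the proposal in this comment concrete, here is a minimal Python sketch of the idea as stated. It does not show how rpm or Berkeley DB actually work, and the one-byte flag at offset 0, the DIRTY/CLEAN values and the reader/writer helpers are all assumptions invented for the example. Readers take a shared fcntl lock and write nothing; writers take an exclusive lock and mark the file dirty until they finish cleanly.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

DIRTY = b"D"   # hypothetical marker: a writer did not finish cleanly
CLEAN = b"C"   # hypothetical marker: last writer finished cleanly

@contextmanager
def reader(path):
    """Shared lock only; the file is never modified by a reader."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH)
        if os.pread(fd, 1, 0) == DIRTY:
            raise RuntimeError("database flagged inconsistent, run recovery")
        yield fd
    finally:
        os.close(fd)   # closing the descriptor drops the fcntl lock

@contextmanager
def writer(path):
    """Exclusive lock plus a dirty flag that survives abnormal termination."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)
        os.pwrite(fd, DIRTY, 0)
        os.fsync(fd)
        yield fd
        os.pwrite(fd, CLEAN, 0)   # reached only on clean completion
        os.fsync(fd)
    finally:
        os.close(fd)

def reader_sees_dirty(path):
    try:
        with reader(path):
            return False
    except RuntimeError:
        return True
```

A writer that is interrupted leaves the flag set, so the next reader is told to run recovery, while an interrupted reader leaves nothing behind. Whether this covers all the state a real rpmdb reader has to maintain is a separate question that the sketch does not address.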
The fcntl shared read/exclusive write lock was implemented in RPM in 2003 by Gustavo Niemeyer. Do your homework. While you are correct that an rpmdb is not "inconsistent" while reading, it's very much not true that there is no state change. In fact, reading an rpmdb MUST have write access to create a shared reader lock with concurrent access. You don't know what state needs to be preserved, and you have not described (in your wee widdle proposal) anything but an fcntl lock. Everything you claim about "my proposal does that" is naive ignorant FUD.
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.