Bug 114810
Summary: | rpmq freezes on select | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Peter Wolfenden <pw> |
Component: | rpm | Assignee: | Jeff Johnson <jbj> |
Status: | CLOSED WONTFIX | QA Contact: | Mike McLean <mikem> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 7.2 | ||
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-02-04 18:55:13 UTC | Type: | --- |
Description
Peter Wolfenden
2004-02-03 02:49:49 UTC
The trace shows a deadlock. Try rpm-4.1.1 if you want concurrent access to the database. rpm-4.0.5 is already end-of-life.

I don't care about "concurrent access" to the rpm database - serialized access would be just fine. I simply want all rpm operations to succeed or fail without deadlocks or database corruption, and I don't want to have to "wrap" them with my own custom contention resolution system to achieve this. Are you saying that deadlocking behavior is a known and accepted problem in the 4.0.5 series?

I tried my "four Perl script test" (see the initial description above) with version 4.1.1-1.8x, and the results were infinitely *worse* - after half an hour, the rpm database became corrupted, so much so that the magic number in the 'Packages' file got mangled (look for the 'data' file below):

-----------------------------------------
[root@localhost rpmq]# file /var/lib/rpm/*
/var/lib/rpm/Basenames: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Conflictname: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Dirnames: Berkeley DB (Btree, version 8, native byte-order)
/var/lib/rpm/Filemd5s: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Group: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Installtid: Berkeley DB (Btree, version 8, native byte-order)
/var/lib/rpm/Name: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Packages: data
/var/lib/rpm/Providename: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Provideversion: Berkeley DB (Btree, version 8, native byte-order)
/var/lib/rpm/Requirename: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Requireversion: Berkeley DB (Btree, version 8, native byte-order)
/var/lib/rpm/Sha1header: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Sigmd5: Berkeley DB (Hash, version 7, native byte-order)
/var/lib/rpm/Triggername: Berkeley DB (Hash, version 7, native byte-order)

I say this is "infinitely worse" because you can't fix this problem with 'rpm --rebuilddb'. If you use rpm to manage your system packages (and we do), then the system is well and truly hosed when you get into a situation like this. Even a system where processes lock up once in a while is better than one which needs to be entirely rebuilt every once in the same while!

As far as I'm concerned this bug can be closed if someone is able to run my "four Perl scripts" test for more than 48 hours with no deadlocking or database corruption with some "official" version of rpm. In this case, please indicate the version, and I will verify it on one of our systems.

Yes, known problem in rpm-4.0.5, which effectively has no locking whatsoever. 30 minutes of repeated upgrades of an identical package is about what I would expect. There is no known application that needs even this degree of write access to the database.

Yes, you will need to arrange serialization outside of rpm if you wish to achieve running your scripts for 48 hours.

Resolution is WONTFIX because rpm-4.0.5 is end-of-life.

Obviously no application needs to upgrade and downgrade the same package continuously. The point of doing this is to reproduce the failure reliably in only 20 minutes, instead of having to wait 2-3 months for a "real" app to fail.

Can't you please at least tell me some version (any version!) of rpm that is immune to the problem instead of repeating the fact that 4.0.5 is "end of life"? As I already indicated in comment #2 above, version 4.1.1-1.8x has a *worse* failure mode. At least version 4.0.5 never corrupts the rpm database (presumably because it *does* have locking - only there's a bug in the locking logic that sometimes causes deadlock).

If there aren't any new versions of rpm that address this issue, then the issue isn't solved and the bug should remain open!

Sure. rpm-4.0.2 had an exclusive lock on /var/lib/rpm/Packages using fcntl about 3 years ago. That will kill the 2nd backgrounded perl script upgrade almost instantly with "can't open rpmdb".

The issue is known, yelling louder ain't gonna change anything.

Again, WONTFIX, because rpm-4.0.5 is end of life.

Before trying rpm-4.0.5, we were using rpm-4.0.4x, which also uses an exclusive lock on the rpm database. And those "can't open rpmdb" messages are normal - in fact, I mention them in the "Expected results" of the original bug description. But rpm-4.0.4x also breaks after being subjected to my Perl scripts for about 20 minutes - one of the 'rpmq' processes hangs (but I don't have a backtrace for this).

To summarize:

RPM Version   Behavior of my 4 Perl scripts
-----------   -----------------------------
4.0.4x        rpm hangs after ~20-30 minutes
4.0.5         rpm hangs after ~20-30 minutes (deadlock)
4.1.1         rpm database becomes corrupted

I'll try version 4.0.2 and see if this improves the situation. If so, I'll add a note here and thank you for your time. If not, I'll reopen this bug again.

Correction - it turns out that none of my tests were run with version 4.1.1 of rpm. The tests that I had thought were run with version 4.1.1-1.8x were in fact run with version 4.0.5. Sorry for the confusion.

Unfortunately, my organization is committed to a patched version of glibc 2.2, which precludes us from using rpm versions 4.1.1 and 4.2.*.

The general picture looks like this:

rpm version   glibc version   behavior of my Perl scripts
-----------   -------------   ---------------------------
4.0.4         2.2             rpm hangs after ~20-30 minutes (deadlock)
4.0.5         2.2             rpm database becomes corrupted (and rpm sometimes hangs)
4.1.1         2.3             ?
4.2.1         2.3             ?

I would of course be interested to know the results of running my Perl scripts with later versions of rpm, but for the purposes of serving my organization I'm stuck with the task of coming up with a fix for one of the 4.0 series versions.
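
For anyone who lands here needing the "serialization outside of rpm" suggested above, the following is a minimal sketch of one way to do it, not something taken from rpm itself. It assumes every caller goes through the same wrapper, and the helper name `run_serialized` and the lock file path are hypothetical. It takes an exclusive advisory lock with `fcntl.lockf` on a separate lock file (not on the rpm database files), runs the command, and releases the lock afterwards.

```python
import fcntl
import os
import subprocess


def run_serialized(cmd, lock_path):
    """Run cmd while holding an exclusive advisory lock on lock_path.

    Hypothetical helper: it only serializes callers that use this same
    wrapper and lock file. It does not change rpm's own locking.
    """
    # Open (or create) the lock file; the lock is tied to this descriptor.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Block until no other cooperating process holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        # Run the command and hand its exit status back to the caller.
        return subprocess.run(cmd, check=False).returncode
    finally:
        # Closing the descriptor releases the lock, even if cmd failed.
        os.close(fd)


# Example call (package name and lock path are placeholders):
#   status = run_serialized(["rpm", "-q", "some-package"],
#                           "/var/lock/rpm-wrapper.lock")
```

Because the lock is advisory, any process that invokes rpm directly bypasses it, so this only helps if all the scripts involved are changed to use the wrapper. Whether that is enough to survive the 48-hour test on the 4.0 series is untested here.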