Bug 80617
Summary: | rpm --whatrequires ends with error message from db4
---|---
Product: | [Retired] Red Hat Linux Beta
Component: | rpm
Version: | beta3
Hardware: | i386
OS: | Linux
Status: | CLOSED RAWHIDE
Severity: | medium
Priority: | medium
Reporter: | Göran Uddeborg <goeran>
Assignee: | Jeff Johnson <jbj>
QA Contact: | Mike McLean <mikem>
Fixed In Version: | 4.2.2-0.8
Doc Type: | Bug Fix
Last Closed: | 2004-01-02 22:08:42 UTC
Description
Göran Uddeborg
2002-12-28 20:50:26 UTC
Created attachment 88968
Output from rpm command on the test system
Probably true, although I can't reproduce, as root or non-root, using rpm-4.2-0.45, glibc-2.3.1-21, and kernel-smp-2.4.20-2.2, all of which are pertinent while rpm is switching to using POSIX mutexes. The underlying issue is that db-4.1.24, for mysterious reasons having to do with cache coherency in a db environment, reports close failures explicitly. The error message is dutifully reported by rpm, but afaict is always harmless (i.e. the fix will be to not print the error message).

Closing WORKSFORME because I can't reproduce with rpm-4.2-0.45.

I still get this with glibc-2.3.1-36, rpm-4.2-0.56, and a kernel built from kernel-source-2.4.20-2.21. Are there even more packages involved?

Could I have another chance and reopen this? I still see this on some machines, for example one with this configuration: kernel-source-2.4.20-8, db4-4.1.25-9, glibc-2.3.1-36, rpm-4.2.1-0.11.

It is quite repeatable, but it happens only for some packages. In the hope it might give a clue, I enclose two straces, one where this happens and one where it does not. I notice a strange difference at the end. After having written the package name and version, rpmq tries to open /var/lib/rpm/Name in READ/WRITE mode. Naturally, it fails when I run this as non-root. This gives a "permission denied" error, which is what is written in the error message, so it looks suspiciously like the source of this problem. Also, the error message does not appear if I run the command as root.

Created attachment 95706
Trace of rpmq where the problem appears
The command was: /usr/lib/rpm/rpmq -q kernel-source
Calls to gettimeofday were filtered out to reduce spurious differences.
Created attachment 95707
Trace of a case where the problem does not appear
The command this time was: /usr/lib/rpm/rpmq -q libjpeg-devel
Again, calls to gettimeofday were filtered out.
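For readers who want to repeat the comparison of the two traces, here is a minimal sketch of how gettimeofday noise could be dropped before diffing. It is not the procedure actually used for the attachments; the function names and the shortened trace excerpts are made up for illustration, and the regular expression only assumes the usual strace line shape (an optional PID column followed by the call name).

```python
import difflib
import re

# Matches a gettimeofday line as strace usually prints it, with an optional
# leading PID column (assumption: no timestamp prefixes such as -t or -r).
GETTIMEOFDAY = re.compile(r"^(\d+\s+)?gettimeofday\(")


def strip_timing_calls(trace_text):
    """Return the trace as a list of lines with gettimeofday calls dropped."""
    return [line for line in trace_text.splitlines()
            if not GETTIMEOFDAY.match(line)]


def diff_traces(failing, working):
    """Unified diff of two traces after removing gettimeofday lines."""
    return list(difflib.unified_diff(
        strip_timing_calls(failing),
        strip_timing_calls(working),
        fromfile="failing", tofile="working", lineterm=""))


if __name__ == "__main__":
    # Hypothetical, heavily shortened excerpts; NOT copied from the
    # attachments in this report.
    failing = "\n".join([
        "gettimeofday({1, 2}, NULL) = 0",
        'open("/var/lib/rpm/Packages", O_RDONLY) = 3',
        'open("/var/lib/rpm/Name", O_RDWR) = -1 EACCES (Permission denied)',
    ])
    working = "\n".join([
        "gettimeofday({3, 4}, NULL) = 0",
        'open("/var/lib/rpm/Packages", O_RDONLY) = 3',
    ])
    for line in diff_traces(failing, working):
        print(line)
```

With the timing calls gone, what remains in the diff is mostly the behavioural difference, such as an extra read/write open near the end of the failing run.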
Are other processes accessing /var/lib/rpm concurrently? There is a late reopen triggered by db->close that might be expecting write access (non-root cannot create shared locks, so the __db* cache can change while running, possibly triggering the late reopen. Just a wild guess ...)

There is (or was) a dangling ptr if the header bit array was realloc'ed inopportunely. That's fixed in rpm-4.2.2-0.6.

Without a reproducer, this is gonna be a bear to track down. Could you tar up and save /var/lib/rpm and attach a ptr here if you encounter a reproducible case? Thanks ;-)

At http://www.carmen.se/pub/var.lib.rpm.tar.bz2 there are the contents from the machine where I most often see this happen. (As you can see from the contents, the machine has quite a mix of packages from different releases. As mentioned previously, I believe all relevant parts are up to date.) The bug can be reproducibly triggered for example with

rpm -q --whatrequires libc.so.6 > /dev/null

An "lsof" doesn't reveal any other processes having anything in /var/lib/rpm open. I haven't had the time to check whether rpm-4.2.2-0.6 is different. I'll do that in a few days. Oh, and: You're welcome! :-)

I cannot reproduce the error as root or non-root using rpm-4.2.2 with your database. Of course that doesn't mean rpm-4.2.2 has fixed anything, but db-4.2.52 is used instead of db-4.1.25. I have your database, can/will try anything, but I can't fix what I can't see :-(

I upgraded to 4.2.2-0.8. In my first tests I can't reproduce this problem any more after that! :-) Let me do a few more tests over the next few days before closing this. But with some luck, it is fixed. (Maybe my db kept triggering that pointer you fixed in 4.2.2-0.6?)

Sorry, didn't mean to close just yet.

I've now exercised rpm in a number of ways, and this problem does indeed seem to be gone! :-)
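The "permission denied" symptom discussed in this thread comes down to a read-only query asking for a read/write open of a file the caller may only read. The sketch below illustrates that mechanism in isolation, together with one possible fallback to read-only access. It is not rpm's or Berkeley DB's code, and it is not the fix that shipped; `open_db_index` and its `opener` parameter are hypothetical names introduced only for this example.

```python
import errno
import os


def open_db_index(path, opener=os.open):
    """Try a read/write open; on a permission failure retry read-only.

    Returns (fd, writable). `opener` is injectable (hypothetical parameter)
    so the behaviour can be exercised without a root-owned file.
    """
    try:
        return opener(path, os.O_RDWR), True
    except OSError as exc:
        # Only permission-style failures justify the read-only retry;
        # anything else (e.g. a missing file) is passed on unchanged.
        if exc.errno not in (errno.EACCES, errno.EPERM, errno.EROFS):
            raise
        return opener(path, os.O_RDONLY), False


if __name__ == "__main__":
    import tempfile

    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o444)  # strip write permission from the file
        fd, writable = open_db_index(tmp.name)
        # A regular user is expected to get a read-only descriptor here,
        # while root bypasses the mode bits, which would be consistent
        # with the message not appearing when the command is run as root.
        print("writable:", writable)
        os.close(fd)
```

Whether a fallback like this or simply suppressing the message is the right answer depends on why the late reopen asks for write access in the first place, which the thread leaves open.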