55920 – rpm hanging when doing -qa, breaks up2date

Summary: rpm hanging when doing -qa, breaks up2date

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	rpm
Sub Component:
Version:	7.2
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Jeff Johnson
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2001-11-08 19:07 UTC by Simon Josefsson
Modified:	2008-05-01 15:38 UTC (History)
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2001-11-09 09:17:55 UTC
Embargoed:

Attachments
strace output from running rpm -qa (63.76 KB, text/plain) 2001-11-08 19:08 UTC, Simon Josefsson

Description Simon Josefsson 2001-11-08 19:07:46 UTC

Description of Problem:

When I run "rpm -qa", it prints a number of RPMs and then hangs.
up2date and other programs that use rpm hang as a consequence.

Version-Release number of selected component (if applicable):

4.0.3

How Reproducible:

Always.

Steps to Reproduce:
1. rpm -qa

Actual Results:

See the attached strace log.

Comment 1 Simon Josefsson 2001-11-08 19:08:32 UTC

Created attachment 36921
strace output from running rpm -qa

Comment 2 Bill Nottingham 2001-11-09 02:17:36 UTC

If you run:

rm -f /var/lib/rpm/__db*

does the problem go away?

Comment 3 Simon Josefsson 2001-11-09 09:17:50 UTC

Yes, deleting __db* works fine. Thanks!

Do you know what caused this?  Could it be a machine crash while rpm was
running?  Maybe rpm/up2date could delete those files automatically (if it is
safe to do so, of course).

Comment 4 Jeff Johnson 2001-11-09 17:31:59 UTC

Interrupting rpm with ^C as root, then querying as an ordinary user, causes the problem.

Problem appears resolved.

Comment 5 Simon Josefsson 2001-11-09 17:48:35 UTC

I can reproduce this by ^C'ing "rpm -qa" as root.

I don't think this is intended behaviour; rpm should not permanently break
because you ^C "rpm -qa" once.  It should use a ^C signal handler to clean up
after itself or, even better, detect stale __db* files on startup.

Comment 6 Jeff Johnson 2001-11-09 18:47:43 UTC

The problem is not signal handling, but rather a
reference count that is left in the __db file.
The fix is more complicated than handling signals
(which are already handled), and will involve
removing the database object from the rpmlib API.
This has already been done.

The __db files are removed on rpm exec, on rpm exit,
and even on system reboot. That doesn't work when root
does ^C and an ordinary user then cannot remove the files.
The traditional remedy, making rpm setgid, is underway,
but a better solution is to get POSIX shared mutexes
implemented in the kernel, as write access to the lock
is needed even if the rpm database is opened read-only.

Oh yeah, the goal of this whole exercise is to permit
concurrent write access to the rpm database.

And, finally, rpm doesn't exactly "permanently break"
when the problem occurs. Nuke the __db files and be happy.
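
For reference, the workaround from this thread can be wrapped in a small
shell helper. This is only a sketch: the function name clean_rpm_db_env and
its optional directory argument are made up for illustration and are not part
of rpm. It assumes no rpm process is running and that the caller has
permission to delete the files (normally root).

```shell
#!/bin/sh
# Hypothetical helper (not part of rpm): remove stale __db* files
# from an rpm database directory. Defaults to /var/lib/rpm.
# Assumption: no rpm process is using the database while this runs.
clean_rpm_db_env() {
    dbdir="${1:-/var/lib/rpm}"

    # Refuse to do anything if the directory does not exist.
    if [ ! -d "$dbdir" ]; then
        echo "no such directory: $dbdir" >&2
        return 1
    fi

    # Remove only the __db* files; other files in the directory
    # (such as Packages) are left untouched.
    rm -f "$dbdir"/__db*
}
```

Calling clean_rpm_db_env with no argument is equivalent to the
"rm -f /var/lib/rpm/__db*" command from Comment 2.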
