Bug 101401

Summary: rpm hang while installing package
Product: [Retired] Red Hat Linux Beta Reporter: Charlie Brady <charlieb-redhat-bugzilla>
Component: rpm Assignee: Jeff Johnson <jbj>
Status: CLOSED WORKSFORME QA Contact: Mike McLean <mikem>
Severity: medium Docs Contact:
Priority: medium    
Version: beta1   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2003-07-31 18:50:29 UTC Type: ---

Description Charlie Brady 2003-07-31 18:19:27 UTC
Description of problem:

A hang occurred in the Add/Remove Software widget (python 
SinglePackageWindow.py) on about the fourth or fifth package installed. 
strace showed:

futex(0xbd0d2310, FUTEX_WAIT, 0, NULL

/var/lib/rpm/__db.00{1,2,3} existed. "rpm -qa" (and any other rpm operation)
then hung as well. Killing the process with -9, and removing the __db* files
allowed rpm to function again. 
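
For illustration, the manual recovery described above could be scripted roughly as follows. This is only a sketch, not anything rpm ships: the function name remove_db_files is hypothetical, and it assumes the hung process has already been killed and that no other rpm process is running against the same database directory.

```python
import glob
import os


def remove_db_files(db_dir="/var/lib/rpm"):
    """Delete the __db.* files in db_dir and return the paths removed.

    Assumption: every process that was using the rpm database has
    already been killed. Nothing here checks for that.
    """
    # glob.escape guards against special characters in the directory name.
    pattern = os.path.join(glob.escape(db_dir), "__db.*")
    removed = []
    for path in sorted(glob.glob(pattern)):
        os.remove(path)
        removed.append(path)
    return removed
```

Only files matching __db.* are touched; the package database files themselves are left alone.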

I don't know whether this was reproducible or not, and I've now wiped the disk.

Comment 1 Charlie Brady 2003-07-31 18:27:35 UTC
I'm sure it's relevant that this was an upgrade from RH8, not a fresh install of
Severn, so the problem could be related to the existing rpm db files.

Comment 2 Jeff Johnson 2003-07-31 18:50:29 UTC
Yes, stale locks need to be manually removed after "kill -9".
There's no practical way to automate stale lock detection because
of races: the existence of locks and processes cannot be checked
quickly enough, and removing __db* files can/will introduce
windows where locking will not happen correctly. So "kill -9" cleanup
is left to the user as a pathological case.
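
The race can be sketched as a check-then-act sequence. The code below is a deliberately naive illustration, not how rpm works: naive_auto_cleanup, rpm_is_running and window are hypothetical names, and window is only a hook that stands in for "another rpm process starts at this moment".

```python
import glob
import os


def naive_auto_cleanup(db_dir, rpm_is_running, window=lambda: None):
    """Racy on purpose: shows why automatic stale lock removal is unsafe."""
    # Step 1 (check): ask whether any rpm process exists right now.
    if rpm_is_running():
        return []
    # Step 2 (gap): nothing stops a new rpm process from starting here.
    # The hypothetical `window` hook simulates that event.
    window()
    # Step 3 (act): remove files based on an answer that may be stale.
    pattern = os.path.join(glob.escape(db_dir), "__db.*")
    removed = sorted(glob.glob(pattern))
    for path in removed:
        os.remove(path)
    return removed
```

If window creates a fresh __db.001 (as a newly started process would), step 3 still deletes it, because the check in step 1 has no way to cover the gap. Narrowing the gap makes the failure rarer but does not close it.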

Upgrade from RHL8 might be relevant, as glibc is changing the pthread
implementation going from 8 -> 9. I know of no reproducible problems,
but odd corner cases, difficult to reproduce, might very well exist.

If you can reproduce a condition where stale locks are left, please
reopen this bug and I will try to analyze.