Bug 77997

Summary:	RPM caught in select() loop.
Product:	[Retired] Red Hat Linux	Reporter:	Matt Bottrell <mbottrell>
Component:	rpm	Assignee:	Paul Nasrat <nobody+pnasrat>
Status:	CLOSED CURRENTRELEASE	QA Contact:
Severity:	medium	Docs Contact:
Priority:	medium
Version:	8.0	CC:	caruso, samuel
Target Milestone:	---
Target Release:	---
Hardware:	athlon
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2005-04-19 19:03:00 UTC	Type:	---

Description Matt Bottrell 2002-11-17 02:54:56 UTC

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)

Description of problem:
rpm commands start out fine, but then hang. strace on the hung process shows the following:

select(0, NULL, NULL, NULL, {0, 199206}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
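
For readers unfamiliar with this trace: a select() call with no file descriptors and only a timeout is a common way to sleep, so the output above looks like a retry loop that wakes once per second and goes back to sleep. The sketch below reproduces that pattern in Python. The names sleep_via_select and retry_until are hypothetical, and the idea that rpm is polling for some condition (such as a lock) is an assumption, not something the trace proves. CPython may pass empty sets rather than NULL, so its own strace output can look slightly different.

```python
import select


def sleep_via_select(seconds):
    """Sleep by calling select() with nothing to watch; only the timeout matters."""
    # With three empty lists, select() just blocks until the timeout expires
    # and returns three empty lists, the analogue of "= 0 (Timeout)".
    return select.select([], [], [], seconds)


def retry_until(condition, interval=1.0, max_tries=5):
    """Poll condition(), sleeping between tries. Hypothetical helper.

    Returns the attempt number on success, or None if every try failed.
    A loop like this with no max_tries would spin forever if the
    condition never becomes true, which matches the symptom reported here.
    """
    for attempt in range(1, max_tries + 1):
        if condition():
            return attempt
        sleep_via_select(interval)
    return None
```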

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. rpm -ivh nmh-1.0.4-15.i386.rpm
2. (new term)  ps auxw |grep rpm
3. strace -p (PID from above).
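
Step 2 can be scripted if the PID is needed repeatedly. This is a minimal sketch that pulls rpm PIDs out of `ps auxw` text while skipping the grep process itself. The function name rpm_pids and the SAMPLE text are hypothetical (the sample is illustrative, not captured from the affected machine), and the 11-column layout is an assumption about the local ps.

```python
import os

# Illustrative ps output, NOT taken from the affected system.
SAMPLE = """\
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root      4242  0.0  0.5  8120 2712 pts/0    S    12:01   0:00 rpm -ivh nmh-1.0.4-15.i386.rpm
matt      4250  0.0  0.1  3572  632 pts/1    S    12:02   0:00 grep rpm
"""


def rpm_pids(ps_output):
    """Return PIDs of processes whose executable is named exactly 'rpm'."""
    pids = []
    for line in ps_output.splitlines():
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something unparseable
        argv = fields[10].split()
        if argv and os.path.basename(argv[0]) == "rpm":
            pids.append(int(fields[1]))
    return pids
```

The resulting PID is what gets passed to strace -p in step 3.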

This happens with ALL rpm commands (including rpm, rpmquery, etc.).  Could be 
librpm playing up.
	

Actual Results:  Command prompt never returns.  Ctrl-C has no effect, and the 
process has to be killed with kill -9.

Looks like the STDOUT/STDIN/STDERR is not closed correctly???

Expected Results:  Should return to the prompt and exit cleanly.

Additional info:

Have installed the default librpm and rpm packages that come with Red Hat 8.0.  
Default vanilla system.

Happens on more than one system.

Have installed the following:

librpm404-4.0.4-8x.27.i386.rpm
rpm-4.1-1.06.i386.rpm

Comment 1 Stephen Samuel 2002-11-18 01:21:58 UTC

Ran into a similar problem.  It appears to ONLY lock up when running as root.
When running as non-root, it does not lock up.
(locking problem?)

It can also be killed by kill -ALRM $rpm_PID
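
As a rough illustration of why kill -ALRM ends the process: a program that installs no SIGALRM handler is terminated by that signal's default action. The sketch below demonstrates this with a sleeping Python child standing in for the hung rpm. The function name send_alarm_to_sleeper is hypothetical, and whether rpm itself leaves SIGALRM at its default is an assumption based only on the behaviour reported above.

```python
import signal
import subprocess
import sys


def send_alarm_to_sleeper():
    """Start a sleeping child, send it SIGALRM, and return its exit status."""
    # Stand-in for the hung process: it just sleeps and handles no signals.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGALRM)  # same effect as: kill -ALRM <pid>
    # A negative return code from wait() means "terminated by that signal".
    return child.wait(timeout=10)
```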

Installed: 
redhat-rpm-config-8.0-1
rpm-devel-4.1-1.06
librpm404-4.0.4-8x.27
rpm-build-4.1-1.06
rpm404-python-4.0.4-8x.27
rpm-4.1-1.06
rpm2html-1.7-8
rpm-python-4.1-1.06

The log of the command that originally locked up:
[root@me gotten]#  rpm -ivh  mod_perl-1.99_05-3.i386.rpm  mod_ssl-2.0.40-8.i386.rpm
Preparing...                ########################################### [100%]
   1:mod_ssl                warning: /etc/httpd/conf/ssl.crl/Makefile.crl saved as /etc/httpd/conf/ssl.crl/Makefile.crl.rpmorig
warning: /etc/httpd/conf/ssl.crt/Makefile.crt saved as /etc/httpd/conf/ssl.crt/Makefile.crt.rpmorig
########################################### [ 50%]

-----

That's the last output I got.

Comment 2 Jeff Johnson 2002-11-18 14:10:42 UTC

My guess is that there are several interlocked problems here.
The starting point may have been a missed SIGCHLD, which is
fixed in the rpm-4.1-9 packages at
	ftp://people.redhat.com/jbj/test-4.1
Otherwise see #73097, and try to supply information
about where the stale locks came from.
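
To illustrate the "missed SIGCHLD" idea in general terms: a parent that waits for a flag set by a SIGCHLD handler can sleep forever if the signal is lost, whereas asking the kernel directly whether the child has exited does not depend on signal delivery. The sketch below shows the second approach. It is not rpm's code; wait_by_polling is a hypothetical name, and the child here is a trivial Python process.

```python
import select
import subprocess
import sys


def wait_by_polling(proc, interval=0.05, max_polls=200):
    """Wait for a child by polling its status instead of trusting SIGCHLD.

    Returns the child's exit code, or None if it is still running after
    max_polls checks.
    """
    for _ in range(max_polls):
        # Popen.poll() checks the child's status without blocking.
        code = proc.poll()
        if code is not None:
            return code
        select.select([], [], [], interval)  # timeout-only select as a sleep
    return None


# Demo: a child that exits immediately with status 3.
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
exit_code = wait_by_polling(child)
```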

Comment 3 Stephen Samuel 2002-11-18 16:58:35 UTC

For the install, to get around the lock problem, I did an 
   rpm -Fvh --force rpm-4.1-9.i386.rpm

After that, rpm does seem to work for root again.

(I never tried it without the --force)

Comment 4 Need Real Name 2002-12-23 21:06:20 UTC

Jeff asked for information regarding where the stale locks come from.  Assuming 
this refers to the __db.00* files, I find it difficult to find an rpm command 
that does NOT leave them behind (despite executing successfully).  rpm -qa does 
it, rpm --rebuilddb does it, the standard /etc/cron.daily/rpm job (which is 
just doing an rpm -qa) does it, etc.  The files are always there on my 8.0 
system.  This is using the stock rpm-4.1-1.06.

The hangs don't occur consistently just because the __db.00* files exist,
though.  I had a hang today (with the usual
"select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)" behavior) on a completely
"idle" machine--basically a freshly installed machine that's now just sitting
there, with no users and only the default system processes running.  It was
the first RPM installation I'd done on there in many months (I was updating to
wget-1.8.2-5 and xinetd-2.3.7-5, using rpm -Fvh).  But after killing that rpm
process, removing the lockfiles, and rebuilding the database (and so
recreating the __db.00* files), I was able to install the two RPMs again
without a hang.
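
For anyone checking their own system, a small read-only sketch that lists the __db.00* files mentioned above. The function name find_db_region_files is hypothetical, and the default directory /var/lib/rpm is an assumption (it is not stated in this report). The code only lists files and deliberately does not delete anything. The demo runs against a temporary directory rather than a real rpm database.

```python
import glob
import os
import tempfile


def find_db_region_files(db_dir="/var/lib/rpm"):
    """List __db.00* files in db_dir (assumed rpm database location)."""
    return sorted(glob.glob(os.path.join(db_dir, "__db.00*")))


# Demo against a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("__db.001", "__db.002", "Packages"):
        open(os.path.join(tmp, name), "w").close()
    found_names = [os.path.basename(p) for p in find_db_region_files(tmp)]
```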

Comment 5 Jeremy Katz 2005-04-19 19:03:00 UTC

There have been a lot of fixes in this area.  If this problem still occurs on
more current releases, please reopen the bug with more information on the
current failures.