Bug 72148 - Rpm hangs (SIGTERM immune) when fed with unsolicited standard input.
Status: CLOSED WONTFIX
Product: Red Hat Raw Hide
Classification: Retired
Component: rpm
Version: 1.0
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Jeff Johnson
Reported: 2002-08-21 10:47 EDT by Kuba Ober
Modified: 2008-05-01 11:38 EDT (History)
Doc Type: Bug Fix
Last Closed: 2002-08-21 15:15:55 EDT
Description Kuba Ober 2002-08-21 10:47:10 EDT
From Bugzilla Helper: 
User-Agent: Mozilla/5.0 (compatible; Konqueror/3; Linux; X11; i686) 
 
Description of problem: 
When you use rpm, and type-ahead the future commands to be executed, rpm can 
hang requiring a SIGKILL. SIGTERM doesn't help. 
 
How reproducible: 
About 90% of the time, almost always with larger rpm workloads (many packages). 
Some packages seem to be immune. 
 
Steps to Reproduce: 
1. Run rpm installer (-U, -i) on almost any package. KDE packages are good to 
try. 
2. When rpm is running, type-ahead something for the command line (at least 
one full line with enter) 
3. Most of the time, rpm will hang after finishing the current package, or 
before exiting. 
4. kill -9 followed by cd /var/lib/rpm && db_recover are required to bring it 
back to a workable state 
	 
 
Actual Results:  rpm hangs, requiring SIGKILL 
 
Expected Results:  rpm finishes and exits cleanly 
 
Additional info: 
This bug seems to be present in all rpm 4.1 versions since the first or second 
limbo beta release. It is not present in rpm 4.0.x 
 
It's pretty inconvenient at times, as some people are used to type-ahead.
Comment 1 Jeff Johnson 2002-08-21 15:07:33 EDT
rpm has a database, and rpm-4.1 now traps signals
in order to avoid stale database locks left
from interrupted installs. Providing a stronger
guarantee of data integrity has to be balanced with
type ahead.

I'm gonna mark this WONTFIX in the sense that rpm
needs to have a signal handler from now on.
Responsiveness will slowly improve (by polling for
signals more often), but the overriding goal is
data integrity, not typeahead responsiveness, in rpm.
Comment 2 Kuba Ober 2002-08-21 15:15:40 EDT
It's not about signal handler -- that would be another bug.  
 
This bug is about rpm going dead when it's presented with unsolicited standard  
input!  
 
This is a regression against 4.0.x and it doesn't make sense for rpm to  
break dead when something comes on the standard input. Does standard input  
handling have anything to do with signal handling? As it is, rpm now hangs  
cold when you type anything on the keyboard while it's running. I don't think  
that's something that users expect or desire.
Comment 3 Jeff Johnson 2002-08-21 15:26:25 EDT
rpm (4.1-0.85 if it matters) does not "hang"
when presented with endless 'a' characters
on stdin during a large upgrade: WORKSFORME
just now, feel free to try to duplicate.

Again, I suspect that the behavior you are seeing
has everything to do with signal handling, why
else would you put SIGTERM in the subject line?
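
The "endless characters on stdin" check described above can be approximated with a small harness. This is only a sketch under stated assumptions: `run_with_stdin_flood` is a hypothetical helper name, it feeds the command through a pipe rather than a terminal (so it does not reproduce real tty type-ahead), and "hang" is reduced to "did not finish within a timeout".

```python
import subprocess
import sys


def run_with_stdin_flood(cmd, line=b"a\n", repeats=10_000, timeout=5.0):
    """Run cmd with unsolicited data on stdin and report whether it finished.

    Returns (finished, returncode). finished is False when the command was
    still running after `timeout` seconds; subprocess.run kills it then.
    """
    data = line * repeats  # the unsolicited "type-ahead" input
    try:
        proc = subprocess.run(
            cmd,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # No exit within the timeout: treat this as a suspected hang.
        return False, None
    return True, proc.returncode
```

Something like `run_with_stdin_flood(["rpm", "-Uhv", "some-package.rpm"], timeout=600)` would be the obvious use, where the package name is a placeholder and the timeout has to be generous enough for the real workload.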
Comment 4 Kuba Ober 2002-08-21 15:39:03 EDT
Okay, I'll try to get the exact steps to duplicate -- it seems that only 
certain kinds of jobs are susceptible to this. 
 
The SIGTERM thing was a side-effect of my main problem -- that after it hung 
due to blind-typing (i.e. presenting it with standard input), it was 
impossible to Ctrl-C it. But that's another story.
Comment 5 Jeff Johnson 2002-08-21 16:23:57 EDT
OK. Watch out for the 2 following effects that
might otherwise be interpreted as "hangs":
	1) On upgrade, erased packages are
	sorted to the end of the transaction,
	leading to an unpleasantly long delay without
	progress bars at the end of a transaction.

	2) SIGHUP/SIGTERM/SIGINT/SIGQUIT are all
	trapped while the database is open, and
	existence of the signal is checked for early exit
	processing when signals are unblocked, i.e. after
	most database operations. IMHO, this is the
	"hang" that you are reporting.
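
The deferred handling in item 2 can be illustrated with a minimal sketch. It is not rpm's code (rpm is written in C); `DeferredSignals` and `run_transaction` are hypothetical names, and a plain callable stands in for one database operation. The point is only that a trapped signal is recorded and acted on at the next check, so the process appears to ignore it until the current operation finishes. POSIX is assumed, since SIGHUP and SIGQUIT do not exist everywhere.

```python
import os
import signal

TRAPPED = (signal.SIGHUP, signal.SIGTERM, signal.SIGINT, signal.SIGQUIT)


class DeferredSignals:
    """Record trapped signals instead of acting on them immediately."""

    def __init__(self, signals=TRAPPED):
        self.signals = signals
        self.caught = []
        self._previous = {}

    def _remember(self, signum, frame):
        # Only note that the signal arrived; the main loop decides when to exit.
        self.caught.append(signum)

    def __enter__(self):
        for sig in self.signals:
            self._previous[sig] = signal.signal(sig, self._remember)
        return self

    def __exit__(self, *exc):
        # Restore whatever handlers were installed before the "database" opened.
        for sig, handler in self._previous.items():
            if handler is not None:
                signal.signal(sig, handler)
        return False


def run_transaction(operations):
    """Run (name, callable) pairs, checking for a recorded signal after each.

    Returns (completed, caught): the names of the operations that finished
    and the signal numbers recorded before the early exit.
    """
    completed = []
    with DeferredSignals() as pending:
        for name, operation in operations:
            operation()           # stands in for one database operation
            completed.append(name)
            if pending.caught:    # early-exit check between operations
                break
        return completed, list(pending.caught)
```

If one operation takes minutes, the signal sits in `caught` for minutes, which from the outside looks like a process that is immune to SIGTERM.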
Comment 6 Kuba Ober 2002-08-21 16:36:28 EDT
1.On upgrade, erased packages are sorted to the end of the transaction.  
  
Then I assume that would happen again and in the same way after I restart it:  
  
killall -9 rpm  
cd /var/lib/rpm && db_recover  
rpm -Uhv ...  
 
Oh, I must have forgotten to mention: the problem I'm seeing is that it's hung 
without any cpu usage!!! So unless you have some kind of O(1) sorting w/ 
dummy sleep() afterwards, it's not due to actual work being done. 
  
2. SIGHUP/SIGTERM/SIGINT/SIGQUIT are all trapped while the database is open. 
 
I don't care much. I'm using SIGKILL ;-) And again, that's a nicety I don't 
care about yet. What bothers me is that feeding rpm with unsolicited 
standard input makes it *sometimes* stop and hang (hang w/o cpu usage). 
 
I should strace it to see what it exactly hangs at. 
 
NB: I initially thought it was all due to using a database created by older 
db3/rpm4.0.x, but I removed the database and reinstalled all modules by 
hand just to make sure it isn't that.
Comment 7 Jeff Johnson 2002-08-21 16:55:13 EDT
No the database format is the same.

If it's truly a "hang", then it's due to
a database lock, possibly stale and persistent.
Do
	rm -f /var/lib/rpm/__db*
to eliminate the possibility of an old, stale
lock hanging out. FYI: handling the reference
count on the persistent /var/lib/rpm/__db*
files is the whole reason for trapping signals.

And, the /var/lib/rpm/__db* files are now persistent,
so you should only have to remove them under rare and
exceptional conditions, like an rpm segfault. This
is new and different behavior in rpm-4.1, which
permits concurrent database access rather than
the traditional exclusive/shared fcntl locking scheme.

If you are hanging on a database lock, attach strace,
and look for a steady heartbeat of selects, about 1
per sec.
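
To make the "heartbeat" concrete, here is a rough sketch of a process that polls for a lock and sleeps in select() between attempts. It is an illustration, not Berkeley DB's locking: `wait_for_lock` is a hypothetical name, it uses an flock() advisory lock on an ordinary file, and POSIX is assumed. A process stuck in a loop like this uses almost no CPU and shows one select-family call per interval under strace.

```python
import fcntl
import os
import select


def wait_for_lock(path, interval=1.0, max_tries=None):
    """Poll for an exclusive lock on path, sleeping in select() between tries.

    Returns (fd, tries) once the lock is held, or (None, tries) if max_tries
    attempts all failed. The caller closes fd to release the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    tries = 0
    while max_tries is None or tries < max_tries:
        tries += 1
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd, tries
        except BlockingIOError:
            # select() with no descriptors is just a timed sleep; this is
            # the once-per-interval call that a tracer would show.
            select.select([], [], [], interval)
    os.close(fd)
    return None, tries
```

With `max_tries=None` and a holder that never releases (for example one that was killed with SIGKILL while the lock state was persistent), the loop never ends, which matches a hang without CPU usage.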
Comment 8 Jeff Johnson 2002-08-21 20:32:53 EDT
BTW, killall -9 is *exactly* the rare, exceptional, and
pathological condition where
	rm -f /var/lib/rpm/__db*
is gonna be needed. Otherwise you *will* have stale
locks.
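
A cautious way to script that cleanup is to list the files first and delete only on request. The sketch below assumes the path and pattern from the `rm -f` command above; `remove_region_files` is a hypothetical helper, and it should only be run when no rpm process is using the database.

```python
import glob
import os


def remove_region_files(dbdir="/var/lib/rpm", dry_run=True):
    """Return the __db* files under dbdir, deleting them unless dry_run."""
    stale = sorted(glob.glob(os.path.join(dbdir, "__db*")))
    if not dry_run:
        for path in stale:
            try:
                os.remove(path)
            except FileNotFoundError:
                # Already gone; rm -f would ignore this as well.
                pass
    return stale
```

Calling it with the default `dry_run=True` only reports what would be removed, which is a reasonable first step before passing `dry_run=False` as root.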
