Bug 75553 - Rpm hangs when using rpm -e
Summary: Rpm hangs when using rpm -e
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: rpm
Version: 8.0
Hardware: i686
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Jeff Johnson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-10-09 21:12 UTC by Need Real Name
Modified: 2008-05-01 15:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2002-10-31 20:25:44 UTC
Embargoed:


Attachments
Results of rpm -e -vv (14.23 KB, text/plain)
2002-10-09 22:11 UTC, Need Real Name
rpm -evv that succeeds (1.21 KB, text/plain)
2002-10-28 19:16 UTC, jbowman
rpm -evv that stalls (1.85 KB, text/plain)
2002-10-28 19:18 UTC, jbowman

Description Need Real Name 2002-10-09 21:12:03 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.1)
Gecko/20020823 Netscape/7.0 (BDP)

Description of problem:

I was told by jbj to open a new
bug for this issue because it was NOT the same
as # 74726.  Originally, I reported the same issue
with the default version of rpm that ships with 8.0.

Upgraded to version 4.1-9 test rpm packages (per
jbj) but rpm is still hanging.  I managed to
successfully remove six packages with rpm -e
but then immediately tried to remove two more
and it hung again.  If I just kill the proc with
kill -9, rpm will not function.  Once I remove
the __db* files, rpm will function again.

(strace follows)

...
open("/var/lib/rpm/Packages", O_RDONLY|O_LARGEFILE) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=10727424, ...}) = 0
brk(0x8260000)                          = 0x8260000
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 64000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 128000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 256000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 512000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
[continues]
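
The select() timeouts in the trace above double from 1 ms up to 512 ms and then stay at 1 s, which is consistent with a process sleeping and retrying with a capped exponential backoff while it waits for something, such as a lock. A minimal sketch that reproduces only that schedule follows; `backoff_schedule` is a hypothetical name, and this is not rpm's or Berkeley DB's actual code:

```shell
# Print a capped exponential backoff schedule as "seconds microseconds" pairs,
# mirroring the timeval arguments of the select() calls in the strace output.
# backoff_schedule is a hypothetical helper, not part of rpm or Berkeley DB.
backoff_schedule() {
    steps=$1            # number of waits to print
    usec=1000           # first wait: 1 ms
    i=0
    while [ "$i" -lt "$steps" ]; do
        if [ "$usec" -ge 1000000 ]; then
            echo "1 0"              # capped at one second from here on
        else
            echo "0 $usec"
            usec=$((usec * 2))      # double the wait for the next attempt
        fi
        i=$((i + 1))
    done
}

backoff_schedule 12
```

A trace that settles into repeated one-second waits, as this one does, suggests the condition being waited on never clears.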

-Pat

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1.  Just try removing packages using rpm -e.  Seems to happen every 4th or 5th
package I try to remove, but note that sometimes it happens sooner (after 1 or 2)
and sometimes later (after 10 or 12).
	

Additional info:

Comment 1 Jeff Johnson 2002-10-09 21:26:15 UTC
OK, you appear to have a new problem.

However, I have the following questions first:

1) Does "immediately" mean simultaneously?

2) If you "kill -9" a running rpm, you *will*
have to do "rm -f /var/lib/rpm/__db*" to fix.
Are you terminating rpm through exceptional
(e.g. kill -9) intervention frequently?



Comment 2 Need Real Name 2002-10-09 21:34:07 UTC
No.  Immediately does not mean simultaneously.  I allowed
the first command to completely finish removing the first
six rpms.  Then, after those six were successfully removed,
I tried to remove the next two.  I used something similar to
the following

# rpm -e pkg1 pkg2 pkg3 pkg4 pkg5 pkg6
[successful]
# rpm -e pkg1 pkg2
[hang - prompt does not return]

Yes I am having to use kill -9.  A lower priority kill
does not work.  After I kill the process with kill -9,
I remove the __db* files and then rpm will function again
until it hangs the next time.  Then I go through the
same process over again.  This is happening quite often.
I've been able to reproduce the issue on three
completely separate installations of RH 8.0, all
i386.

-Pat




Comment 3 Jeff Johnson 2002-10-09 21:53:10 UTC
The aftermath of "kill -9" is less interesting
(to me&rpm) than the initial hang, please adjust
your comments accordingly.

What packages were involved in the initial hang?


Comment 4 Need Real Name 2002-10-09 22:00:52 UTC
rpm -e ypbind ypserv nfs-utils fam portmap yp-tools
[successful]
rpm -e gdk-pixbuf-gnome gdk-pixbuf-devel
[hang]

I was trying to remove completely different packages
from the other two systems so I don't think the
issue is tied to any particular package being
removed.

-Pat


Comment 5 Jeff Johnson 2002-10-09 22:03:29 UTC
If you can reproduce, could you add -vv and
append output here? Apologies for having you
do the heavy lifting.

Comment 6 Need Real Name 2002-10-09 22:11:05 UTC
Created attachment 79711
Results of rpm -e -vv

Comment 7 Need Real Name 2002-10-09 22:13:43 UTC
Please note that rpm hung during the above
attached output - it did not finish.  I used:

# rpm -e -vv cups-libs qt samba-common samba-client unixODBC

and just picked four packages at random to remove.

-Pat

Comment 8 Jeff Johnson 2002-10-10 15:31:52 UTC
Sanity check: Did you "rm -f /var/lib/rpm/__db*" before
attempting the erase?

I've tried several variants of erase from a chroot
install, no hang yet, certainly not an easily reproduced
hang. Caveat: my box is SMP, that may make a difference.

Comment 9 Need Real Name 2002-10-10 16:04:59 UTC
No, I did not do a rm -f /var/lib/rpm/__db* because the
previous rpm -e was successful.  There should not have 
been any __db* files there.  If there _were_ any __db*
files left behind from the previous rpm -e, they were
not dealt with correctly when rpm completed successfully.

Surely, you're not gonna tell me I have to manually check 
and remove these files after every successful use of rpm.
Right?

-Pat


Comment 10 Jeff Johnson 2002-10-10 16:10:07 UTC
The __db files are persistent in rpm-4.1,
should always be present after creation, can
be manually removed at any time that rpm is
not active, cannot be removed by rpm because
that opens up lock race windows.

No, I'm not telling you that you have to remove
those files after every successful execution
of rpm. I'm asking whether you removed those
files before attempting a reproducible test case.
If not, I can't interpret the results of your test.
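
The rule stated above, that the __db files may be removed by hand only while rpm is not active, can be sketched as a small guard. This is an illustration under assumptions: `remove_stale_db_files` is a hypothetical name, `pgrep` is assumed to be installed, and only processes named exactly `rpm` are checked:

```shell
# Remove Berkeley DB environment files (__db*) from an rpm database directory,
# but only when no process named "rpm" appears to be running.
# remove_stale_db_files is a hypothetical helper; pgrep availability is assumed.
remove_stale_db_files() {
    dbdir=${1:-/var/lib/rpm}
    if ! command -v pgrep >/dev/null 2>&1; then
        echo "pgrep not found; cannot check for a running rpm" >&2
        return 2
    fi
    if pgrep -x rpm >/dev/null 2>&1; then
        echo "rpm is running; leaving $dbdir/__db* alone" >&2
        return 1
    fi
    # Only the __db* region files are removed; Packages and the indexes stay.
    rm -f "$dbdir"/__db*
}
```

Run as root, for example `remove_stale_db_files /var/lib/rpm`. There is still a small window between the check and the removal in which rpm could start, so this reduces the risk rather than eliminating it.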

Comment 11 Need Real Name 2002-10-10 16:45:50 UTC
Like I said, I did not delete those files before I
ran my test case because I had already deleted them
10 minutes before that when rpm hung up.  rpm -e was
working fine until I tried to run the test case
you asked for.  That's why I'm asking - do you
want me to manually remove those files after EVERY
rpm -evv I do in order to reproduce the issue for you?

-Pat



Comment 12 Jeff Johnson 2002-10-10 16:59:41 UTC
I need to know that there aren't stale locks
from something else. The following sequence
should isolate:
	0) rm __db files
	1) run "rpm -evv"  that succeeds (__db files will exist after)
	2) run "rpm -evv" that hangs, send me this log if different
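
For anyone who wants to script that sequence, a rough sketch follows. `isolate_hang`, the log file names, and the `RPM` override variable are hypothetical; `--dbpath` is passed so that the directory being cleaned is the one rpm actually uses:

```shell
# isolate_hang is a hypothetical helper that follows steps 0-2 above.
# Usage: isolate_hang DBDIR LOGDIR FIRST_PKG SECOND_PKG
# RPM may be set to override the rpm binary (an assumption, handy for dry runs).
isolate_hang() {
    dbdir=$1
    logdir=$2
    first=$3
    second=$4
    rpm_cmd=${RPM:-rpm}

    # Step 0: start from a clean slate (only do this while rpm is not running).
    rm -f "$dbdir"/__db*

    # Step 1: an erase that is expected to succeed; __db files exist afterwards.
    "$rpm_cmd" --dbpath "$dbdir" -evv "$first" > "$logdir/erase-1.log" 2>&1 || return 1

    # Step 2: the erase that may hang. If it stalls, interrupt it by hand and
    # compare erase-2.log with erase-1.log.
    "$rpm_cmd" --dbpath "$dbdir" -evv "$second" > "$logdir/erase-2.log" 2>&1
}
```

Example: `isolate_hang /var/lib/rpm /tmp pkg1 pkg2`, where pkg1 and pkg2 are placeholders for installed packages.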


Comment 13 Need Real Name 2002-10-10 23:03:35 UTC
Ok.  I spent the better part of the afternoon trying
to reproduce this per your instructions.  I'm using 
RH 8.0 and rpm version 4.1-9.

1.  rpm -e pkg1 pkg2 pkg3 ... pkg[n]
[completes successfully]
2.  rm -rf /var/lib/rpm/__db*
[completes successfully]
(repeat steps 1 and 2)

I continued this process until well... (laughing) the
system doesn't have much left on it anymore.  I doubt
it will even reboot.  However, rpm did NOT freeze or
hang so I have no other rpm -evv report to attach.
But I HAVE to delete those __db* files after every use
of rpm otherwise it will hang.  Looks like those 
lock files are causing it.

-Pat





Comment 14 Pete Zaitcev 2002-10-11 05:48:28 UTC
See also Bug 68056.


Comment 15 Jeff Johnson 2002-10-11 14:04:13 UTC
Well, I suggested that you remove the __db* files
once, not each and every time. Apologies if that
wasn't crystal clear.

I've tried to reproduce this bug in a chroot and cannot,
so I'm going to close.


Comment 16 Nerijus Baliūnas 2002-10-19 01:17:54 UTC
Could you please try with a UP machine? If you cannot reproduce the bug with SMP,
isn't it a logical step? What glibc are you using in your chroot environment?
Besides, the attachment of preich is from the situation which
you wanted:
0) rm __db files - he did that, as otherwise he could not rpm -e.
1) run "rpm -evv"  that succeeds (__db files will exist after) - he did that
2) run "rpm -evv" that hangs, send me this log if different - he did that.


Comment 17 jbowman 2002-10-28 19:14:27 UTC
I can reproduce this 100% reliably on my freshly-kickstarted Dell 2650 PowerEdge.

Steps to reproduce (the specific packages don't seem to matter)

rm -fr /var/lib/rpm/__rpm*
rpm -e somepackage
rpm -qa|less
rpm -e anotherpackage

It begins select() cycling as described at the second rpm -e. I'm attaching -vv
output from the two -e's I've used as a test case. Capturing -evv out of the
-qa|less seems tricky though. Suggestions for that?

Reopening this. I'm more than willing to help in debugging, 'cause this is
driving me up a wall.

Comment 18 jbowman 2002-10-28 19:15:25 UTC
Err, oops, I guess I can't reopen it. Silly me. :)

Comment 19 jbowman 2002-10-28 19:16:57 UTC
Created attachment 82424
rpm -evv that succeeds

Comment 20 jbowman 2002-10-28 19:18:04 UTC
Created attachment 82425
rpm -evv that stalls

Comment 21 Need Real Name 2002-10-29 01:07:38 UTC
Thought I'd chime in again.  This issue remains open for me because it's still
happening.  I've tested on three separate test systems all with the same rpm
hang.  Because a fix doesn't appear to be forthcoming yet, I'm abandoning my
plan to migrate to 8.0.  I'll keep monitoring for additional information and
reports.

Comment 22 Need Real Name 2002-10-31 20:25:37 UTC
Just as an aside, I've also seen this with RedHat 8.  On my home machine, my
work machine, and one other machine at work (which is a perfect 3 for 3, as
those are the only machines that I have seen with RedHat 8).


Comment 23 Jeff Johnson 2002-11-01 14:43:15 UTC
I cannot tell anything meaningful from "me too" reports.

All I can get from the above is that there may be a different
problem with -e than with -U (which I will try to reproduce).

So I'm gonna close this bug. Feel free to reopen Yet More
Reports, but *please* try to
	a) report the exact version of rpm you are using
	b) try to supply a reproducible test case

Comment 24 jbowman 2002-11-01 14:50:57 UTC
Huh? My case was both more than "me too" *and* was reproducible, at least on my
end, *every* *single* *time*. The rpm version I'm using (and was using, for
those rpm -evv reports that you asked for and I provided) is rpm-4.1-1.06, the
stock version that ships with RedHat 8.0. What other information do you need?

Comment 25 Need Real Name 2002-11-01 15:03:41 UTC
I agree.  Closing this bug is utterly ridiculous (except possibly for closing it
as a dupe of the other RPM hang bugs).

The exact steps to reproduce on EVERY system are not yet clear, but it is clear
that this is affecting a LOT of people in a LOT of different environments, and
that it has NOT been fixed.

I think the bug should stay open until someone can show that it does NOT exist.


