Bug 72543

Summary: RPM hangs while installing a package
Product: [Retired] Red Hat Public Beta
Reporter: Scott Lamb <redhat>
Component: rpm
Assignee: Jeff Johnson <jbj>
Status: CLOSED NOTABUG
Severity: high
Priority: medium
Version: null
CC: alex.danielski, blackhat, kumarpindia, public, p.van.egdom
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2002-08-30 17:17:58 UTC

Description Scott Lamb 2002-08-25 03:01:14 UTC
rpm hangs during package installation. First, it nearly completed:

# rpm -Uvh compat-gcc-7.3-2.96.110.i386.rpm 
warning: compat-gcc-7.3-2.96.110.i386.rpm: Header V3 DSA signature: NOKEY, key ID 897da07a
Preparing...                ########################################### [100%]
   1:compat-gcc             ########################################### [100%]

and hung, completely unresponsive to SIGINT or SIGTERM. The files did apparently
get installed (/usr/bin/gcc296 exists). I attached with gdb and got this:

    (gdb) bt
    #0  0x081870fe in select ()
    #1  0x081ff378 in _GLOBAL_OFFSET_TABLE_ ()
    #2  0x080fa951 in __os_yield_rpmdb ()
    #3  0x080c167f in __db_tas_mutex_lock_rpmdb ()
    #4  0x080f2137 in __lock_get_internal ()
    #5  0x080f196f in __lock_get_rpmdb ()
    #6  0x080db962 in __db_c_put_rpmdb ()
    #7  0x08090f6f in db3cput ()
    #8  0x0808e1ec in rpmdbAdd ()
    #9  0x08061f5f in psmStage ()
    #10 0x08061670 in psmStage ()
    #11 0x08061b81 in psmStage ()
    #12 0x0807bf94 in rpmtsRun ()
    #13 0x0806cc95 in rpmInstall ()
    #14 0x08048e4d in main ()

I did a 'kill -9' and tried to use rpm again. It now hangs without doing
anything at all, whether it's another package installation or a query. This
is happening very consistently.

"rpm --version" says "RPM version 4.1", the one that comes with (null).
(Ironically, I can't do a rpm -q rpm).

Comment 1 Scott Lamb 2002-08-25 08:50:42 UTC
I managed to recover from this. "rpm --rebuilddb" had the same problem, but I
realized that this is just a Berkeley DB database. A "db_recover -e" in
/var/lib/rpm worked wonders. That install was forgotten about (the transaction
rolled back, I guess). The next time I ran it I got errors like this:

    rpmdb: illegal flag specified to DB->cursor
    error: db4 error(22) from db->cursor: Invalid argument

but it seemed to work, and after that, everything worked fine.

Comment 2 Jeff Johnson 2002-08-25 13:18:45 UTC
If you insist on doing "kill -9", then you *will*
have stale locks that need to be removed by doing
	rm -f /var/lib/rpm/__db*

Comment 3 Scott Lamb 2002-08-25 17:48:39 UTC
I did the 'kill -9' _after_ it hung the first time. Didn't see a whole lot of
other choices...

Comment 4 Hakan TERZIOGLU 2002-08-27 03:39:24 UTC
I can't understand why this bug was closed as 'not a bug'.
Whenever I try to install an rpm, about 40% of the time the rpm command stalls as
described above. At first I ignored it and rebooted to make it work again (yes, that
works), but now I'm really disappointed, so I went looking for a solution.
This is what I figured out:
/var/lib/rpm/Packages, which is in Berkeley DB format, stays open, so no new
rpm command can run properly (each one hangs like the first).
I found this with `lsof /var/lib/rpm/Packages`.
When you kill (with signal 9) the first hung rpm command and rebuild the rpm db,
everything goes smoothly again.
In conclusion, it must be a bug.

Comment 5 Jeff Johnson 2002-08-27 12:17:03 UTC
It's NOTABUG because:
	1) there's no way to trap "kill -9" in order
	to do the necessary cleanup. Use SIGHUP, SIGINT,
	SIGTERM, or SIGQUIT instead of -9.
	2) there's no way to tell which locks are real
	and which are stale on the next execution of rpm.
	Preventing stale locks by trapping the signals above
	is the better solution.
So don't use "kill -9", use "kill" instead.
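
The general mechanism behind the two points above can be sketched in shell. This is only an illustration of trappable versus untrappable signals, not rpm's own signal handling; the function name lock_cleaned_after and the temporary file standing in for a lock are assumptions made for the example.

```shell
# Sketch: a child process holds a stand-in "lock" file and traps SIGTERM
# to remove it. SIGKILL cannot be trapped, so the file is left behind.
# lock_cleaned_after is a hypothetical helper, not part of rpm.
lock_cleaned_after() {
    sig=$1
    lock=$(mktemp) || return 2          # stand-in for a lock file
    # The child installs a cleanup trap, then sends itself the signal.
    sh -c 'trap "rm -f \"$1\"; exit 0" TERM; kill -s "$2" $$; :' \
        sh "$lock" "$sig" 2>/dev/null || true
    if [ -e "$lock" ]; then
        rm -f "$lock"                   # tidy up the stale stand-in lock
        return 1                        # cleanup handler never ran
    fi
    return 0                            # cleanup handler removed the lock
}

for sig in TERM KILL; do
    if lock_cleaned_after "$sig"; then
        echo "$sig: lock removed by the trap"
    else
        echo "$sig: stale lock left behind"
    fi
done
```

With TERM the trap should get a chance to remove the file; with KILL the child dies before any handler can run. Note that this only helps when the process is still able to act on signals at all.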
	

Comment 6 Peter van Egdom 2002-08-29 19:22:02 UTC
I noticed a similar problem when I deinstalled an old rpm package from my
'null' system to make room for a newer one from Rawhide:

[root@localhost tmp]# rpm -e redhat-config-keyboard-0.9.9-6
error: Failed dependencies:
        redhat-config-keyboard is needed by (installed) firstboot-0.9.9-13
[root@localhost tmp]# rpm -e redhat-config-keyboard-0.9.9-6 --nodeps

     <<<< here RPM just hangs, no progress, nothing >>>>
    (waited for about 5 minutes, should've taken 5 seconds)

(note: there is no redhat-config-keyboard process active).

<CTRL-C> does not help to abort the RPM process. I really have to kill -9 that
process. Note that I tried other kill signals.

[root@localhost TESTING]# ps -ef |grep redh
root      3953  2303  0 20:53 pts/7    00:00:00 rpm -e redhat-config-keyboard-0.
root      4061  4022  0 21:17 pts/3    00:00:00 grep redh
[root@localhost TESTING]# kill -3 3953
[root@localhost TESTING]# kill -2 3953
[root@localhost TESTING]# kill -1 3953
[root@localhost TESTING]# ps -ef |grep redh
root      3953  2303  0 20:53 pts/7    00:00:00 rpm -e redhat-config-keyboard-0.
root      4063  4022  0 21:17 pts/3    00:00:00 grep redh
[root@localhost TESTING]# kill -9 3953
[root@localhost TESTING]# ps -ef |grep redh
root      4065  4022  0 21:17 pts/3    00:00:00 grep redh
[root@localhost TESTING]# 



Comment 7 Scott Lamb 2002-08-30 03:44:43 UTC
It locked for me again. Unresponsive to anything but kill -9.

$ sudo rpm -Uvh up2date-*
warning: up2date-2.9.55-1.i386.rpm: Header V3 DSA signature: NOKEY, key ID 897da07a
Preparing...                ########################################### [100%]
   1:up2date                warning: /etc/sysconfig/rhn/up2date created as
/etc/sysconfig/rhn/up2date.rpmnew
########################################### [ 50%]

0x08181a41 in __libc_nanosleep ()
(gdb) bt
#0  0x08181a41 in __libc_nanosleep ()
#1  0x0814bf4b in nanosleep ()
#2  0x0818196f in sleep ()
#3  0x0805fd80 in psmWait ()
#4  0x08060241 in runScript ()
#5  0x08060838 in runInstScript ()
#6  0x08062a53 in rpmpsmStage ()
#7  0x080623db in rpmpsmStage ()
#8  0x080628a5 in rpmpsmStage ()
#9  0x0807cce0 in rpmtsRun ()
#10 0x0806da15 in rpmInstall ()
#11 0x08048e4d in main ()
#12 0x0815aa12 in __libc_start_main ()

rpm-4.1-0.87.

It left locks lying around; I deleted them.

Would it be helpful if I compiled the srpm with debugging flags and gave a
backtrace with line numbers and such?

Comment 8 Scott Lamb 2002-08-30 17:17:50 UTC
I'm reopening because

> It's NOTABUG because:
>     1) there's no way to trap "kill -9" in order
>     to do the necessary cleanup. Use SIGHUP, SIGINT,
>     SIGTERM, or SIGQUIT instead of -9.

SIGHUP, SIGINT, SIGTERM, and SIGQUIT do not work.


Comment 9 Jeff Johnson 2002-08-30 17:21:12 UTC
Please reopen a different bug, not this
bug.

Comment 10 Peter van Egdom 2002-08-30 17:53:53 UTC
This bug was reopened because RPM sometimes hangs while installing
- or deinstalling - a package.

(I've encountered this bug a couple of times now, when I was deinstalling
packages for replacing these with newer ones from Rawhide).

The phenomenon that RPM can only be killed by a 'kill -9' is the result of
this bug.

The first sentence of this Bugzilla entry is:
"rpm hangs during package installation."

and this Bugzilla entry has the subject:
"RPM hangs while installing a package".

So this bug can be reopened. :-)

I'll open a new separate bug because of the 'kill signal situation' when
this bug happens.



Comment 11 warren 2004-06-05 18:20:07 UTC
This is indeed a bug; there is no way around it. I have found far
too many experienced users that this is happening to. You should not
have to rebuild the rpm db so many times, if ever. I have rebuilt
mine about 20 times to fix this problem without a reboot.

"fix"
lsof /var/lib/rpm/Packages
kill -9 <pid>
rm -f /var/lib/rpm/__db*
rpm -vv --rebuilddb

-warren
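
The recovery steps quoted in this thread can be collected into a dry-run helper. This is a minimal sketch that only prints the commands and runs nothing; the function name rpmdb_recovery_plan and its optional path argument are assumptions, and the default path is the /var/lib/rpm used throughout this report.

```shell
# Dry-run sketch: print the recovery commands from this thread, run nothing.
# rpmdb_recovery_plan is a hypothetical name; the path defaults to /var/lib/rpm.
rpmdb_recovery_plan() {
    dbpath=${1:-/var/lib/rpm}
    echo "lsof $dbpath/Packages"      # find the rpm process holding the db open
    echo "kill -9 <pid>"              # <pid> is a placeholder from lsof output
    echo "rm -f $dbpath/__db*"        # remove stale lock files
    echo "rpm -vv --rebuilddb"        # rebuild the database
}

rpmdb_recovery_plan
```

Printing rather than executing keeps the sketch harmless; review each command and substitute the real pid before running anything as root.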

Comment 12 puneet kumar 2005-05-31 10:55:48 UTC
Thanks, warren. Your solution rocks.

I got myself in trouble as Peter did, by trying to de-install cdrao. rpm hung
for the entire night before I killed it (kill -9) the next morning. Nothing worked
afterwards.

I saw your solution and ran the last 2 commands. Everything is fine now.

Thanks again.

Comment 13 Alex S 2019-05-02 14:43:11 UTC
This happened to me in RHEL 7.6! Thanks to warren for that solution!

How was this closed as NOTABUG in 2002 when it's STILL a bug? I am looking at you, Jeff Johnson.

Peter is right on the money:

> This bug was reopened because RPM sometimes hangs while installing
> - or deinstalling - a package.
>
> (I've encountered this bug a couple of times now, when I was deinstalling
> packages for replacing these with newer ones from Rawhide).
>
> The phenomenon that RPM can only be killed by a 'kill -9' is the result of
> this bug.
>
> The first sentence of this Bugzilla entry is:
> "rpm hangs during package installation."
>
> and this Bugzilla entry has the subject:
> "RPM hangs while installing a package".
>
> So this bug can be reopened. :-)