Bug 221700 - non-FUTEX rpm deadlock
Summary: non-FUTEX rpm deadlock
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rpm
Version: 5.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Paul Nasrat
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-01-06 11:06 UTC by Jan Kratochvil
Modified: 2007-11-30 22:07 UTC (History)
0 users

Fixed In Version: rpm-4.4.8-0.4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-04-09 13:05:20 UTC
Target Upstream Version:
Embargoed:


Attachments
bzip(1) of the core dump of "yum update". (8.53 MB, application/octet-stream)
2007-01-06 11:06 UTC, Jan Kratochvil
.tar.bz2 of /var/lib/rpm (14.58 MB, application/octet-stream)
2007-01-06 11:17 UTC, Jan Kratochvil
rpm -qa (35.22 KB, text/plain)
2007-01-06 11:19 UTC, Jan Kratochvil

Description Jan Kratochvil 2007-01-06 11:06:34 UTC
Description of problem:
After installing the RHN packages on a system upgraded by yum(8) along the path
FCx->...->FC6->RHEL5, "yum update" locked up.

Version-Release number of selected component (if applicable):


How reproducible:
Not tried. Afterwards, "rpm -qv anything" always locked up in a futex wait.

Steps to Reproduce:
1. rpm -i rhn-setup-gnome-0.4.5-1.el5.noarch.rpm
rhn-setup-0.4.5-1.el5.noarch.rpm rhn-client-tools-0.4.5-1.el5.noarch.rpm
rhnsd-4.5.7-1.el5.i386.rpm rhnlib-2.2.5-1.el5.noarch.rpm
rhn-check-0.4.5-1.el5.noarch.rpm usermode-gtk-1.88-3.el5.i386.rpm
gnome-python2-canvas-2.16.0-1.fc6.i386.rpm pyOpenSSL-0.6-1.p24.7.2.2.i386.rpm
yum-rhn-plugin-0.4.1-1.el5.noarch.rpm 
/tmp/rhn-org-trusted-ssl-cert-1.0-1.noarch.rpm
2. rhn_register etc.
3. yum update

Actual results:
Loading "rhnplugin" plugin
Setting up Update Process
Setting up repositories
livna-development         100% |=========================| 1.1 kB    00:00     
rhel-5                    100% |=========================|  951 B    00:00     
rhel-i386-client-5-beta   100% |=========================|  951 B    00:00     
rhel-5-debuginfo          100% |=========================|  951 B    00:00     
extras-development        100% |=========================| 1.1 kB    00:00     
Reading repository metadata in from local files
primary.xml.gz            100% |=========================| 1.0 MB    01:22     
################################################## 3325/3325
primary.xml.gz            100% |=========================| 1.3 MB    00:09     
################################################## 4010/4010
Excluding Packages in global exclude list
Finished
[lockup]

Expected results:
Completed yum(8) run.

Additional info:
After attaching with gdb(1) I saw:
0x001d9a03 in __memp_fput_rpmdb (dbmfp=0x948bab0, pgaddr=0xb78761b4, 
    flags=<value optimized out>) at ../db/mp/mp_fput.c:217
217                 prev = fbhp, fbhp = SH_TAILQ_NEXT(fbhp, hq, __bh))
(gdb) bt
#0  0x001d9a03 in __memp_fput_rpmdb (dbmfp=0x948bab0, pgaddr=0xb78761b4,
flags=<value optimized out>)
    at ../db/mp/mp_fput.c:217
#1  0x001564ff in __ham_release_meta_rpmdb (dbc=0x96fbd80) at
../db/hash/hash_meta.c:70
#2  0x0014f487 in __ham_c_get (dbc=0x96fbd80, key=0x970a08c, data=0x970a0a4,
flags=18, pgnop=0xbf8a3ba0)
    at ../db/hash/hash.c:593
#3  0x00199e8e in __db_c_get_rpmdb (dbc_arg=0x948b4f0, key=0x970a08c,
data=0x970a0a4, flags=18) at ../db/db/db_cam.c:654
#4  0x001a07e6 in __db_c_get_pp_rpmdb (dbc=0x948b4f0, key=0x970a08c,
data=0x970a0a4, flags=18) at ../db/db/db_iface.c:1741
#5  0x00130916 in db3cget (dbi=0xa300770, dbcursor=0x0, key=0x970a08c,
data=0x970a0a4, flags=855264) at db3.c:612
#6  0x0012c543 in rpmdbNextIterator (mi=0x970a070) at rpmdb.h:591
#7  0x0055c729 in rpmmi_iternext (s=0xb7f6d750) at rpmmi-py.c:89
#8  0x00ac7e15 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#9  0x00acb52f in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#10 0x00accc68 in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#11 0x00a7ed40 in PyClassMethod_New () from /usr/lib/libpython2.4.so.1.0
#12 0x00a66d57 in PyObject_Call () from /usr/lib/libpython2.4.so.1.0
#13 0x00a6d358 in PyClass_IsSubclass () from /usr/lib/libpython2.4.so.1.0
#14 0x00a66d57 in PyObject_Call () from /usr/lib/libpython2.4.so.1.0
#15 0x00ac648c in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.4.so.1.0
#16 0x00a71100 in PyInstance_New () from /usr/lib/libpython2.4.so.1.0
#17 0x00a66d57 in PyObject_Call () from /usr/lib/libpython2.4.so.1.0
#18 0x00ac9548 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#19 0x00acb52f in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#20 0x00accc68 in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#21 0x00acb426 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#22 0x00acb52f in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#23 0x00acb52f in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#24 0x00accc68 in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#25 0x00acb426 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#26 0x00accc68 in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#27 0x00acccf3 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0
#28 0x00ae9998 in Py_CompileString () from /usr/lib/libpython2.4.so.1.0
#29 0x00aeb0a8 in PyRun_SimpleFileExFlags () from /usr/lib/libpython2.4.so.1.0
#30 0x00aeb78a in PyRun_AnyFileExFlags () from /usr/lib/libpython2.4.so.1.0
#31 0x00af2185 in Py_Main () from /usr/lib/libpython2.4.so.1.0
#32 0x08048582 in main ()
(gdb) n
216             for (prev = NULL; fbhp != NULL;
(gdb) l
211
212             if (fbhp == bhp)
213                     fbhp = SH_TAILQ_NEXT(fbhp, hq, __bh);
214             SH_TAILQ_REMOVE(&hp->hash_bucket, bhp, hq, __bh);
215
216             for (prev = NULL; fbhp != NULL;
217                 prev = fbhp, fbhp = SH_TAILQ_NEXT(fbhp, hq, __bh))
218                     if (fbhp->priority > bhp->priority)
219                             break;
220             if (prev == NULL)
(gdb) p prev
No symbol "prev" in current context.
(gdb) p fbhp
$1 = (BH *) 0xb7876138
(gdb) p bhp
$2 = (BH *) 0xb7876138
(gdb) n
218                     if (fbhp->priority > bhp->priority)
(gdb)
219                             break;
(gdb)
217                 prev = fbhp, fbhp = SH_TAILQ_NEXT(fbhp, hq, __bh))
(gdb)
216             for (prev = NULL; fbhp != NULL;
(gdb) p fbhp
$3 = (BH *) 0xb7876138
(gdb) p bhp
$4 = (BH *) 0xb7876138
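
One reading of the session above: fbhp equals bhp both before and after an iteration, which is what would happen if the buffer header's next link in the hash bucket pointed back at itself. In that case the loop at lines 216-219 never reaches NULL and never sees a higher priority, so it spins without ever blocking. The following is a minimal C sketch of that failure mode, not Berkeley DB code: `struct bh`, `bucket_has_cycle` and `scan_for_priority` are hypothetical names, and a plain `next` pointer stands in for `SH_TAILQ_NEXT`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Berkeley DB's BH buffer header; only the
 * fields needed to illustrate the loop are kept. */
struct bh {
    int priority;
    struct bh *next;   /* plays the role of SH_TAILQ_NEXT(..., hq, __bh) */
};

/* Floyd's tortoise-and-hare check: returns 1 if following ->next from
 * head ever revisits a node, 0 if the chain ends in NULL. */
int bucket_has_cycle(const struct bh *head)
{
    const struct bh *slow = head;
    const struct bh *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}

/* Bounded variant of the priority scan: walks the chain until a node
 * with a higher priority than bhp is found or the chain ends.
 * Returns the number of nodes stepped over, or -1 once more than
 * max_steps nodes have been visited (the unbounded loop would hang). */
int scan_for_priority(const struct bh *fbhp, const struct bh *bhp,
                      int max_steps)
{
    int steps = 0;

    assert(bhp != NULL);
    for (; fbhp != NULL; fbhp = fbhp->next) {
        if (fbhp->priority > bhp->priority)
            break;
        if (++steps > max_steps)
            return -1;
    }
    return steps;
}
```

With a node whose `next` points at itself, `bucket_has_cycle` reports the cycle and `scan_for_priority` gives up at the step limit instead of spinning. Whether the real bucket was corrupted in exactly this way cannot be confirmed from the session alone.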


Afterwards, trying `rpm -qv rpm' gave lockups, with strace(1) showing the common
futex wait (still not fixed?):
...
open("/var/lib/rpm/Packages", O_RDONLY|O_LARGEFILE) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=35368960, ...}) = 0
futex(0xb7cd11a0, FUTEX_WAIT, 2, NULL <unfinished ...>


This bug could probably be split into several separate ones.

Comment 1 Jan Kratochvil 2007-01-06 11:06:39 UTC
Created attachment 144965
bzip(1) of the core dump of "yum update".

Comment 2 Jan Kratochvil 2007-01-06 11:17:28 UTC
Created attachment 144966
.tar.bz2 of /var/lib/rpm

Comment 3 Jan Kratochvil 2007-01-06 11:18:53 UTC
The rpm version could not be queried earlier, so I forgot to include it:
rpm-4.4.2-36.el5.i386


Comment 4 Jan Kratochvil 2007-01-06 11:19:29 UTC
Created attachment 144967
rpm -qa

Comment 5 Jan Kratochvil 2007-04-09 09:49:55 UTC
These lockups do not occur on FC6 and, IIRC, they were even fixed in the later
RHEL5 snapshots.


Comment 6 Jeff Johnson 2007-04-09 13:00:14 UTC
The workaround is
    rm -f /var/lib/rpm/__db*

The likely cause is inconsistent data in the cache.

Detecting and correcting stale locks was fixed in rpm-4.4.8-0.4, in late November 2006.
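
For scripted use, the workaround above can be approximated in C with glob(3). This is only a sketch under assumptions: `remove_db_env_files` is a hypothetical helper, not part of rpm or Berkeley DB. It removes only regular files matching `__db*` in the given directory and leaves `Packages` and the other database files alone. It is presumably only safe to run as root while no rpm or yum process has the database open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: removes every regular file matching
 * "<dbdir>/__db*", mirroring `rm -f <dbdir>/__db*`.
 * Returns the number of files removed, or -1 on error.
 * Note: glob metacharacters inside dbdir itself are not escaped. */
int remove_db_env_files(const char *dbdir)
{
    char pattern[4096];
    glob_t g;
    struct stat st;
    size_t i;
    int removed = 0;
    int n, rc;

    assert(dbdir != NULL);
    n = snprintf(pattern, sizeof pattern, "%s/__db*", dbdir);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return -1;                     /* path too long */

    memset(&g, 0, sizeof g);
    rc = glob(pattern, 0, NULL, &g);
    if (rc != 0 && rc != GLOB_NOMATCH) {
        globfree(&g);
        return -1;                     /* read error or out of memory */
    }

    /* On GLOB_NOMATCH gl_pathc is 0, so the loop body is skipped. */
    for (i = 0; i < g.gl_pathc; i++) {
        if (lstat(g.gl_pathv[i], &st) == 0 && S_ISREG(st.st_mode)
            && unlink(g.gl_pathv[i]) == 0)
            removed++;
    }
    globfree(&g);
    return removed;
}
```

Calling `remove_db_env_files("/var/lib/rpm")` would correspond to the command in this comment. The cache files are normally recreated the next time rpm opens the database.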

Comment 7 Jan Kratochvil 2007-04-09 13:05:20 UTC
I am aware of `rm -f /var/lib/rpm/__db*'; the problem kept reproducing again
and again despite such cache clears.
Interestingly, FC6 rpm-4.4.2-32.x86_64 has not reproduced this problem for me so
far, but that does not mean much, as I have also moved from i686 to x86_64 since then.

Feel free to close it.


